From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-102.mailbox.org (mout-p-102.mailbox.org [80.241.56.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 720D0258ED6; Fri, 19 Sep 2025 02:00:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247213; cv=none; b=VPtek00abHZHsuXRDIk0puyCo4oBa2fSv5r5Cg+tYc0Cy3bwlZ0UBEZu3Y7ybTvAsr3KOexms5GbLJ60KUawy3of8PxCrh/UxsoXKFel0Laognla36RCddrsf3Hgl0Xe4hwF/ZUI3bNlnXOpS8Jams2aWg+zmNXvkCH7Z9P7lLI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247213; c=relaxed/simple; bh=0foHFKsVISgRDKCD5chs3frnytAecaOPF/iJXka2vNU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=DnzGpduVOux6CJGxI4TIM2Jl1/MwYbV5DHE6od5pVH+zUpLwNk4mkgJ72Eh5rv3/YAG1WxkNDK7pwImhsaX30G6W0uS6RMOjGgRa1sNFFmHRLDOJymBF9BgeCW18J8vId3tdeIvPiQQy2ePAWGSML8KaPiJSjoce5mnZGoDzqwg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=zSTJe/nc; arc=none smtp.client-ip=80.241.56.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="zSTJe/nc" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-102.mailbox.org (Postfix) with ESMTPS id 
4cSbLH41qHz9sjJ; Fri, 19 Sep 2025 04:00:07 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247207; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=l26kFBE6hyGWioTfsXgOOyNO21cgB9nDi07HYHTYlmQ=; b=zSTJe/ncq8LYrMKyPYbol2AecKZyGDmj1P9mMpvlj39CkgSBwlPNTXBNSCI1MLEPKXQYsV WuXSCJ2v9D9uCqYnOyM/Ie0yfeMartCn7l0Rqt43Z27wRBs9jgH3OgDdgNNI7WiTx5YE/7 Gg7HoDCYa4V9evkPQbyGVO7w7AX4F6nZCKDNZi2ysFvRafujUD/DJCxDlhm4i4KF6B9wPD dUiNz1X3SMWxM8jSdbdJLERyFeD/hwCKZtTfP8LiImBtiFvhSal4hr48dAn2RO2uRXNg0G 5vdo/vNtOGh3aePykE5d08f7J7ahSG+czF6kwfonJYd1klXtiXBkmmLbfG9UOA== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:42 +1000 Subject: [PATCH v4 01/10] man/man2/mount_setattr.2: move mount_attr struct to mount_attr(2type) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-1-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=3215; i=cyphar@cyphar.com; h=from:subject:message-id; bh=0foHFKsVISgRDKCD5chs3frnytAecaOPF/iJXka2vNU=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2Smadrrj9RVfG5YHnxtesH7Qkp0712L6kquVgn1NG cfM3Lbs7ZjIwiDGxWAppsiyzc8zdNP8xVeSP61kg5nDygQyRFqkgYGBgYGFgS83Ma/USMdIz1Tb UM/QUMdIx4iBi1MApjpanJHhy/uLd2b/v/Wsz557UovzifVrDf1+7Jhuo+zHyPAltXjKUobfLKs tTz/UmLrjrIYxz/vkgrQrh003nllQp/p+Z9K3+SZWnAA= X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 As with open_how(2type), it makes sense to move this to a separate man page. In addition, future man pages added in this patchset will want to reference mount_attr(2type). Signed-off-by: Aleksa Sarai --- man/man2/mount_setattr.2 | 17 ++++-------- man/man2type/mount_attr.2type | 61 +++++++++++++++++++++++++++++++++++++++= ++++ 2 files changed, 66 insertions(+), 12 deletions(-) diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2 index 586633f48e894bf8f2823aa7755c96adcddea6a6..4b55f6d2e09d00d9bc4b3a085f3= 10b1b459f34e8 100644 --- a/man/man2/mount_setattr.2 +++ b/man/man2/mount_setattr.2 @@ -114,18 +114,11 @@ .SH DESCRIPTION .I attr argument of .BR mount_setattr () -is a structure of the following form: -.P -.in +4n -.EX -struct mount_attr { - __u64 attr_set; /* Mount properties to set */ - __u64 attr_clr; /* Mount properties to clear */ - __u64 propagation; /* Mount propagation type */ - __u64 userns_fd; /* User namespace file descriptor */ -}; -.EE -.in +is a pointer to a +.I mount_attr +structure, +described in +.BR mount_attr (2type). 
.P The .I attr_set diff --git a/man/man2type/mount_attr.2type b/man/man2type/mount_attr.2type new file mode 100644 index 0000000000000000000000000000000000000000..f5c4f48be46ec1e6c0d3a211b67= 24a1e95311a41 --- /dev/null +++ b/man/man2type/mount_attr.2type @@ -0,0 +1,61 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH mount_attr 2type (date) "Linux man-pages (unreleased)" +.SH NAME +mount_attr \- what mount properties to set and clear +.SH LIBRARY +Linux kernel headers +.SH SYNOPSIS +.EX +.B #include +.P +.B struct mount_attr { +.BR " u64 attr_set;" " /* Mount properties to set */" +.BR " u64 attr_clr;" " /* Mount properties to clear */" +.BR " u64 propagation;" " /* Mount propagation type */" +.BR " u64 userns_fd;" " /* User namespace file descriptor */" + /* ... */ +.B }; +.EE +.SH DESCRIPTION +Specifies which mount properties should be changed with +.BR mount_setattr (2). +.P +The fields are as follows: +.TP +.I .attr_set +This field specifies which +.BI MOUNT_ATTR_ * +attribute flags to set. +.TP +.I .attr_clr +This field specifies which +.BI MOUNT_ATTR_ * +attribute flags to clear. +.TP +.I .propagation +This field specifies what mount propagation will be applied. +The valid values of this field are the same propagation types described in +.BR mount_namespaces (7). +.TP +.I .userns_fd +This field specifies a file descriptor that indicates which user namespace= to +use as a reference for ID-mapped mounts with +.BR MOUNT_ATTR_IDMAP . +.SH STANDARDS +Linux. +.SH HISTORY +Linux 5.12. +.\" commit 2a1867219c7b27f928e2545782b86daaf9ad50bd +glibc 2.36. +.P +Extra fields may be appended to the structure, +with a zero value in a new field resulting in +the kernel behaving as though that extension field was not present. +Therefore, a user +.I must +zero-fill this structure on initialization. 
+.SH SEE ALSO +.BR mount_setattr (2) --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-102.mailbox.org (mout-p-102.mailbox.org [80.241.56.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB84C25DCF0; Fri, 19 Sep 2025 02:00:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247219; cv=none; b=fp7WSXX/uuAc5T6OoVONcm7vf+xOSssFbt04g9xjzKaJS8+TRCq0D61YbNo4nwCUYQhE8OpMqj+6ogaWpj4ZRr0dWLgbTr1jIVMVxebQonwSYrL2QG77GJP12ogHZw4LXbf1eHAZ5MAkHtsLbVTAAbGx0aYxu9kqiMDEZ2Ybs7o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247219; c=relaxed/simple; bh=Qt9JJwP7Po2UlxNm12cpyXuleZSZg10i2LGinn40v9o=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=BDSYXEk6UkmDXJPYghPQbPHduTXAhAorlm0yO/IRHfQuidAggxUH1EU/H2QHyevJh4L8CHDBryJQbq+miayT0y/a2IKRV0SgWQ1AfGYCdikhbW6XH5hULi1WLZBuGwVbt77iLxWkMACqOr1PZ7t/v96x2mTOO0fMRpVsN4Xmzvg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=AYpNDpOU; arc=none smtp.client-ip=80.241.56.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="AYpNDpOU" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by 
mout-p-102.mailbox.org (Postfix) with ESMTPS id 4cSbLP5PJxz9sqC; Fri, 19 Sep 2025 04:00:13 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247213; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=Bnl6p0PV1/r3ERamNk0D3Pkl7+jfK7FLKiG+ClsUZ+I=; b=AYpNDpOU1VhAKbdl5Jldh4A7kYmTdetzDtGhE/fQrK6OAmYG4DXmf0DaFGX2cZBoZEuMtG HgX0b0wLr8rpgSnhGyRyVkC73pNsISdX2L5WfQdD/H6pMI3MuHDwumTkQwCgWHVlzLrdHO MIBoUEZvqoh9J+1ZQZDLR75MhXEBRP17W5NaVtsKtt43P4eSvHj3MoEo23rUSzoonWP1AE HNagD4oWSkez6JPlHNSqguH3p4ow6j4GCU0X6Uk0ryJKkWcPIXrZQCaGmMGR7omeSwYcpB LtYC664A8wQdZ269uJETwiaB3T3FM1R3fyKjDCW8dyvMB9KmrImzqCO9t9fFhg== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:43 +1000 Subject: [PATCH v4 02/10] man/man2/fsopen.2: document "new" mount API Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-2-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=12327; i=cyphar@cyphar.com; h=from:subject:message-id; bh=Qt9JJwP7Po2UlxNm12cpyXuleZSZg10i2LGinn40v9o=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2Skq0JC5k6vM4d2xn5753At/umr095n2CL6qkNzcJ COc7SjcMZGFQYyLwVJMkWWbn2fopvmLryR/WskGM4eVCWSItEgDAwMDAwsDX25iXqmRjpGeqbah nqGhjpGOEQMXpwBMdWwjw18Z0Uk7C/+FbfrWL9L85YL7tOfHm+PXy7Rt//hx09Tzp1mXMfz3CEh h6ra8r3zMykfZz2WD+PQPS1urfm1/lP6Jz+Z8UTkbAA== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 This is loosely based on the original documentation written by David Howells and later maintained by Christian Brauner, but has been rewritten to be more from a user perspective (as well as fixing a few critical mistakes). Co-authored-by: David Howells Signed-off-by: David Howells Co-authored-by: Christian Brauner Signed-off-by: Christian Brauner Signed-off-by: Aleksa Sarai --- man/man2/fsopen.2 | 384 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 384 insertions(+) diff --git a/man/man2/fsopen.2 b/man/man2/fsopen.2 new file mode 100644 index 0000000000000000000000000000000000000000..7cdbeac7d64b7e5c969dee619a039ec947d1e981 --- /dev/null +++ b/man/man2/fsopen.2 @@ -0,0 +1,384 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH fsopen 2 (date) "Linux man-pages (unreleased)" +.SH NAME +fsopen \- create a new filesystem context +.SH LIBRARY +Standard C library +.RI ( libc ,\~ \-lc ) +.SH SYNOPSIS +.nf +.B #include <sys/mount.h> +.P +.BI "int fsopen(const char *" fsname ", unsigned int " flags ); +.fi +.SH DESCRIPTION +The +.BR fsopen () +system call is part of +the suite of file descriptor based mount facilities in Linux. 
+.P +.BR fsopen () +creates a blank filesystem configuration context within the kernel +for the filesystem named by +.I fsname +and places it into creation mode. +A new file descriptor +associated with the filesystem configuration context +is then returned. +The calling process must have the +.B \%CAP_SYS_ADMIN +capability in order to create a new filesystem configuration context. +.P +A filesystem configuration context is +an in-kernel representation of a pending transaction, +containing a set of configuration parameters that are to be applied +when creating a new instance of a filesystem +(or modifying the configuration of an existing filesystem instance, +such as when using +.BR fspick (2)). +.P +After obtaining a filesystem configuration context with +.BR fsopen (), +the general workflow for operating on the context looks like the following: +.IP (1) 5 +Pass the filesystem context file descriptor to +.BR fsconfig (2) +to specify any desired filesystem parameters. +This may be done as many times as necessary. +.IP (2) +Pass the same filesystem context file descriptor to +.BR fsconfig (2) +with +.B \%FSCONFIG_CMD_CREATE +to create an instance of the configured filesystem. +.IP (3) +Pass the same filesystem context file descriptor to +.BR fsmount (2) +to create a new detached mount object for +the root of the filesystem instance, +which is then attached to a new file descriptor. +(This also places the filesystem context file descriptor into +reconfiguration mode, +similar to the mode produced by +.BR fspick (2).) +Once a mount object has been created with +.BR fsmount (2), +the filesystem context file descriptor can be safely closed. +.IP (4) +Now that a mount object has been created, +you may +.RS +.IP (4.1) 7 +use the detached mount object file descriptor as a +.I dirfd +argument to "*at()" system calls; and/or +.IP (4.2) 7 +attach the mount object to a mount point +by passing the mount object file descriptor to +.BR move_mount (2). 
+This will also prevent the mount object from +being unmounted and destroyed when +the mount object file descriptor is closed. +.RE +.IP +The mount object file descriptor will +remain associated with the mount object +even after doing the above operations, +so you may repeatedly use the mount object file descriptor with +.BR move_mount (2) +and/or "*at()" system calls +as many times as necessary. +.P +A filesystem context will move between different modes +throughout its lifecycle +(such as the creation phase +when created with +.BR fsopen (), +the reconfiguration phase +when an existing filesystem instance is selected with +.BR fspick (2), +and the intermediate "awaiting-mount" phase +.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this. +between +.BR \%FSCONFIG_CMD_CREATE +and +.BR fsmount (2)), +which has an impact on +what operations are permitted on the filesystem context. +.P +The file descriptor returned by +.BR fsopen () +also acts as a channel for filesystem drivers to +provide more comprehensive diagnostic information +than is normally provided through the standard +.BR errno (3) +interface for system calls. +If an error occurs at any time during the workflow mentioned above, +calling +.BR read (2) +on the filesystem context file descriptor +will retrieve any ancillary information about the encountered errors. +(See the "Message retrieval interface" section +for more details on the message format.) +.P +.I flags +can be used to control aspects of +the creation of the filesystem configuration context file descriptor. +A value for +.I flags +is constructed by bitwise ORing +zero or more of the following constants: +.RS +.TP +.B FSOPEN_CLOEXEC +Set the close-on-exec +.RB ( FD_CLOEXEC ) +flag on the new file descriptor. +See the description of the +.B O_CLOEXEC +flag in +.BR open (2) +for reasons why this may be useful. 
+.RE +.P +A list of filesystems supported by the running kernel +(and thus a list of valid values for +.IR fsname ) +can be obtained from +.IR /proc/filesystems . +(See also +.BR proc_filesystems (5).) +.SS Message retrieval interface +When doing operations on a filesystem configuration context, +the filesystem driver may choose to provide +ancillary information to userspace +in the form of message strings. +.P +The filesystem context file descriptors returned by +.BR fsopen () +and +.BR fspick (2) +may be queried for message strings at any time by calling +.BR read (2) +on the file descriptor. +Each call to +.BR read (2) +will return a single message, +prefixed to indicate its class: +.RS +.TP +\fBe\fP <\fImessage\fP> +An error message was logged. +This is usually associated with an error being returned +from the corresponding system call which triggered this message. +.TP +\fBw\fP <\fImessage\fP> +A warning message was logged. +.TP +\fBi\fP <\fImessage\fP> +An informational message was logged. +.RE +.P +Messages are removed from the queue as they are read. +Note that the message queue has limited depth, +so it is possible for messages to get lost. +If there are no messages in the message queue, +.BR read (2) +will return \-1 and +.I errno +will be set to +.BR \%ENODATA . +If the +.I buf +argument to +.BR read (2) +is not large enough to contain the entire message, +.BR read (2) +will return \-1 and +.I errno +will be set to +.BR \%EMSGSIZE . +(See BUGS.) +.P +If there are multiple filesystem contexts +referencing the same filesystem instance +(such as if you call +.BR fspick (2) +multiple times for the same mount), +each one gets its own independent message queue. +This does not apply to multiple file descriptors that are +tied to the same underlying open file description +(such as those created with +.BR dup (2)). 
+.P +Message strings will usually be prefixed by +the name of the filesystem or kernel subsystem +that logged the message, +though this may not always be the case. +See the Linux kernel source code for details. +.SH RETURN VALUE +On success, a new file descriptor is returned. +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EFAULT +.I fsname +is NULL +or a pointer to a location +outside the calling process's accessible address space. +.TP +.B EINVAL +.I flags +had an invalid flag set. +.TP +.B EMFILE +The calling process has too many open files to create more. +.TP +.B ENFILE +The system has too many open files to create more. +.TP +.B ENODEV +The filesystem named by +.I fsname +is not supported by the kernel. +.TP +.B ENOMEM +The kernel could not allocate sufficient memory to complete the operation. +.TP +.B EPERM +The calling process does not have the required +.B \%CAP_SYS_ADMIN +capability. +.SH STANDARDS +Linux. +.SH HISTORY +Linux 5.2. +.\" commit 24dcb3d90a1f67fe08c68a004af37df059d74005 +.\" commit 400913252d09f9cfb8cce33daee43167921fc343 +glibc 2.36. +.SH BUGS +.SS Message retrieval interface and \fB\%EMSGSIZE\fP +As described in the "Message retrieval interface" subsection above, +calling +.BR read (2) +with too small a buffer to contain +the next pending message in the message queue +for the filesystem configuration context +will cause +.BR read (2) +to return \-1 and set +.BR errno (3) +to +.BR \%EMSGSIZE . +.P +However, +this failed operation still +consumes the message from the message queue. +This effectively discards the message silently, +as no data is copied into the +.BR read (2) +buffer. +.P +Programs should take care to ensure that +their buffers are sufficiently large +to contain any reasonable message string, +in order to avoid silently losing valuable diagnostic information. 
+.\" Aleksa Sarai +.\" This unfortunate behaviour has existed since this feature was merged, but +.\" I have sent a patchset which will finally fix it. +.\" +.SH EXAMPLES +To illustrate the workflow for creating a new mount, +the following is an example of how to mount an +.BR ext4 (5) +filesystem stored on +.I /dev/sdb1 +onto +.IR /mnt . +.P +.in +4n +.EX +int fsfd, mntfd; +\& +fsfd = fsopen("ext4", FSOPEN_CLOEXEC); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0); +fsconfig(fsfd, FSCONFIG_SET_PATH, "source", "/dev/sdb1", AT_FDCWD); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "noatime", NULL, 0); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "iversion", NULL, 0); +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); +mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_RELATIME); +move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +First, +an ext4 configuration context is created and attached to the file descriptor +.IR fsfd . +Then, a series of parameters +(such as the source of the filesystem) +are provided using +.BR fsconfig (2), +followed by the filesystem instance being created with +.BR \%FSCONFIG_CMD_CREATE . +.BR fsmount (2) +is then used to create a new mount object attached to the file descriptor +.IR mntfd , +which is then attached to the intended mount point using +.BR move_mount (2). +.P +The above procedure is functionally equivalent to +the following mount operation using +.BR mount (2): +.P +.in +4n +.EX +mount("/dev/sdb1", "/mnt", "ext4", MS_RELATIME, + "ro,noatime,acl,user_xattr,iversion"); +.EE +.in +.P +And here's an example of creating a mount object +of an NFS server share +and setting a Smack security module label. +However, instead of attaching it to a mount point, +the program uses the mount object directly +to open a file from the NFS share. 
+.P +.in +4n +.EX +int fsfd, mntfd, fd; +\& +fsfd = fsopen("nfs", 0); +fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "example.com/pub/linux", 0); +fsconfig(fsfd, FSCONFIG_SET_STRING, "nfsvers", "3", 0); +fsconfig(fsfd, FSCONFIG_SET_STRING, "rsize", "65536", 0); +fsconfig(fsfd, FSCONFIG_SET_STRING, "wsize", "65536", 0); +fsconfig(fsfd, FSCONFIG_SET_STRING, "smackfsdef", "foolabel", 0); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "rdma", NULL, 0); +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); +mntfd = fsmount(fsfd, 0, MOUNT_ATTR_NODEV); +fd = openat(mntfd, "src/linux-5.2.tar.xz", O_RDONLY); +.EE +.in +.P +Unlike the previous example, +this operation has no trivial equivalent with +.BR mount (2), +as it was not previously possible to create a mount object +that is not attached to any mount point. +.SH SEE ALSO +.BR fsconfig (2), +.BR fsmount (2), +.BR fspick (2), +.BR mount (2), +.BR mount_setattr (2), +.BR move_mount (2), +.BR open_tree (2), +.BR mount_namespaces (7) -- 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-201.mailbox.org (mout-p-201.mailbox.org [80.241.56.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BF2471F790F; Fri, 19 Sep 2025 02:00:22 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247225; cv=none; b=UVzIoMWiNv76S4rY/NkmybUtLcQgXhwjL3Iy6FK35jQgJOS+7Ii1ENI1HVHqCjK8B6yo6JHFd0sZx5orUGzG4DkeRosOqSX3o32ZX4LEGVHjmTMp2Vhzn2LJ+UE8+RGMZWCztnZ7xppY5rLnuvt93AVNMf759VNgKD80ESBUGPE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247225; c=relaxed/simple; bh=3JkEZK4xBrti4iIyGstioYHI62898EiUb7TklcJ8L+4=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=epQgjzSCYaa2l+HV7wjS38fdMdzo3YZTDzuIcbUM+QUPYe8DMZomgMXUtGsXyt0xxrXMVfBIZdxJdsaZrkMSJ1aWxRY2op9Hk46u26AvgjPVEgOwiWF0TnX/ueRxeMisEKpPl6XlYQuH70lBpmX5pq8r2lwtzRSJQLF9rUHtmt0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=RJoiOuWF; arc=none smtp.client-ip=80.241.56.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="RJoiOuWF" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-201.mailbox.org (Postfix) with ESMTPS id 4cSbLW0VwWz9tWx; Fri, 19 Sep 2025 04:00:19 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247219; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=nTOnNWLTLMrr3vBPQIh0rPBkBIi8RXn0VGLT3s/grSU=; b=RJoiOuWFYXBVueVlyBFunudQwfKpaZBg1Q4N/z6/FU7ji/wS/pA9x4wK4cuZVVyXEZcTvF BNPIY0nBcgn5mzJJWaMwvpoXsZhpmMVx0siv7fn+rlDior+QXVVrDOSbzKcHIhKMphTqTG lfHBDisDlHGTWM7nK0ZE03gZWB/OCEArFne3y22VUuEOpL70vZLwbigsn+1m7EmRBTLEyQ ZWPJ5zijgloFHvd7c8WFwMj4409e+E+evv9WT9EC4WtIqzoOZfhISpk/zpjKwa1E4bxZrr I8dXQsFDAsdD5YqrLH3ITkZSLcr12BrhyLgMb12ffaIR7TLuUhEjvKM+90EPKw== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:44 +1000 Subject: [PATCH v4 03/10] man/man2/fspick.2: document "new" mount API Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-3-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=8488; i=cyphar@cyphar.com; h=from:subject:message-id; bh=3JkEZK4xBrti4iIyGstioYHI62898EiUb7TklcJ8L+4=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2SkaH3/Gq8JacF+S5o2jnZIeW55XKT72/6R5oCg19 0bC3/0NHRNZGMS4GCzFFFm2+XmGbpq/+Eryp5VsMHNYmUCGSIs0MDAwMLAw8OUm5pUa6RjpmWob 6hka6hjpGDFwcQrAVG9PZWSYa2r5tfu0pPO21ikLLjsssZunON0v0PqXZLlQxYv49cVdjAw7HDb 2BtwNLkvRvt1TyWtXaib6ZOoOvUmsH2f99Y909mUHAA== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 This is loosely based on the original documentation written by David Howells and later maintained by Christian Brauner, but has been rewritten to be more from a user perspective (as well as fixing a few critical mistakes). 
Co-authored-by: David Howells Signed-off-by: David Howells Co-authored-by: Christian Brauner Signed-off-by: Christian Brauner Signed-off-by: Aleksa Sarai --- man/man2/fspick.2 | 342 ++++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 342 insertions(+) diff --git a/man/man2/fspick.2 b/man/man2/fspick.2 new file mode 100644 index 0000000000000000000000000000000000000000..1f87293f44658adeb7ab7cffebcac3174888f040 --- /dev/null +++ b/man/man2/fspick.2 @@ -0,0 +1,342 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH fspick 2 (date) "Linux man-pages (unreleased)" +.SH NAME +fspick \- select filesystem for reconfiguration +.SH LIBRARY +Standard C library +.RI ( libc ,\~ \-lc ) +.SH SYNOPSIS +.nf +.BR "#include <fcntl.h>" " /* Definition of " AT_* " constants */" +.B #include <sys/mount.h> +.P +.BI "int fspick(int " dirfd ", const char *" path ", unsigned int " flags ); +.fi +.SH DESCRIPTION +The +.BR fspick () +system call is part of +the suite of file descriptor based mount facilities in Linux. +.P +.BR fspick () +creates a new filesystem configuration context +for the extant filesystem instance +associated with the path described by +.I dirfd +and +.IR path , +and places it into reconfiguration mode +(similar to +.BR mount (8) +with the +.I -o remount +option). +A new file descriptor +associated with the filesystem configuration context +is then returned. +The calling process must have the +.B \%CAP_SYS_ADMIN +capability in order to create a new filesystem configuration context. +.P +The resultant file descriptor can be used with +.BR fsconfig (2) +to specify the desired set of changes to +filesystem parameters of the filesystem instance. +Once the desired set of changes has been configured, +the changes can be effectuated by calling +.BR fsconfig (2) +with the +.B \%FSCONFIG_CMD_RECONFIGURE +command. 
+Please note that\[em]in contrast to +the behaviour of +.B MS_REMOUNT +with +.BR mount (2)\[em] fspick () +instantiates the filesystem configuration context +with a copy of +the extant filesystem's filesystem parameters, +meaning that a subsequent +.B \%FSCONFIG_CMD_RECONFIGURE +operation +will only update filesystem parameters +explicitly modified with +.BR fsconfig (2). +.P +As with "*at()" system calls, +.BR fspick () +uses the +.I dirfd +argument in conjunction with the +.I path +argument to determine the path to operate on, as follows: +.IP \[bu] 3 +If the pathname given in +.I path +is absolute, then +.I dirfd +is ignored. +.IP \[bu] +If the pathname given in +.I path +is relative and +.I dirfd +is the special value +.BR \%AT_FDCWD , +then +.I path +is interpreted relative to +the current working directory +of the calling process (like +.BR open (2)). +.IP \[bu] +If the pathname given in +.I path +is relative, +then it is interpreted relative to +the directory referred to by the file descriptor +.I dirfd +(rather than relative to +the current working directory +of the calling process, +as is done by +.BR open (2) +for a relative pathname). +In this case, +.I dirfd +must be a directory +that was opened for reading +.RB ( O_RDONLY ) +or using the +.B O_PATH +flag. +.IP \[bu] +If +.I path +is an empty string, +and +.I flags +contains +.BR \%FSPICK_EMPTY_PATH , +then the file descriptor +.I dirfd +is operated on directly. +In this case, +.I dirfd +may refer to any type of file, +not just a directory. +.P +See +.BR openat (2) +for an explanation of why the +.I dirfd +argument is useful. +.P +.I flags +can be used to control aspects of how +.I path +is resolved and +properties of the returned file descriptor. +A value for +.I flags +is constructed by bitwise ORing +zero or more of the following constants: +.RS +.TP +.B FSPICK_CLOEXEC +Set the close-on-exec +.RB ( FD_CLOEXEC ) +flag on the new file descriptor. 
+See the description of the +.B O_CLOEXEC +flag in +.BR open (2) +for reasons why this may be useful. +.TP +.B FSPICK_EMPTY_PATH +If +.I path +is an empty string, +operate on the file referred to by +.I dirfd +(which may have been obtained from +.BR open (2), +.BR fsmount (2), +or +.BR open_tree (2)). +In this case, +.I dirfd +may refer to any type of file, +not just a directory. +If +.I dirfd +is +.BR \%AT_FDCWD , +.BR fspick () +will operate on the current working directory +of the calling process. +.TP +.B FSPICK_SYMLINK_NOFOLLOW +Do not follow symbolic links +in the terminal component of +.IR path . +If +.I path +references a symbolic link, +the returned filesystem context will reference +the filesystem that the symbolic link itself resides on. +.TP +.B FSPICK_NO_AUTOMOUNT +Do not automount any automount points encountered +while resolving +.IR path . +This allows you to reconfigure an automount point, +rather than the location that would be mounted. +This flag has no effect if +the automount point has already been mounted over. +.RE +.P +As with filesystem contexts created with +.BR fsopen (2), +the file descriptor returned by +.BR fspick () +may be queried for message strings at any time by calling +.BR read (2) +on the file descriptor. +(See the "Message retrieval interface" subsection in +.BR fsopen (2) +for more details on the message format.) +.SH RETURN VALUE +On success, a new file descriptor is returned. +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EACCES +Search permission is denied +for one of the directories +in the path prefix of +.IR path . +(See also +.BR path_resolution (7).) +.TP +.B EBADF +.I path +is relative but +.I dirfd +is neither +.B \%AT_FDCWD +nor a valid file descriptor. +.TP +.B EFAULT +.I path +is NULL +or a pointer to a location +outside the calling process's accessible address space. +.TP +.B EINVAL +Invalid flag specified in +.IR flags . 
+.TP
+.B ELOOP
+Too many symbolic links encountered when resolving
+.IR path .
+.TP
+.B EMFILE
+The calling process has too many open files to create more.
+.TP
+.B ENAMETOOLONG
+.I path
+is longer than
+.BR PATH_MAX .
+.TP
+.B ENFILE
+The system has too many open files to create more.
+.TP
+.B ENOENT
+A component of
+.I path
+does not exist,
+or is a dangling symbolic link.
+.TP
+.B ENOENT
+.I path
+is an empty string, but
+.B \%FSPICK_EMPTY_PATH
+is not specified in
+.IR flags .
+.TP
+.B ENOTDIR
+A component of the path prefix of
+.I path
+is not a directory;
+or
+.I path
+is relative and
+.I dirfd
+is a file descriptor referring to a file other than a directory.
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B EPERM
+The calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit cf3cba4a429be43e5527a3f78859b1bfd9ebc5fb
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH EXAMPLES
+The following example sets the read-only flag
+on the filesystem instance referenced by
+the mount object attached at
+.IR /tmp .
+.P
+.in +4n
+.EX
+int fsfd = fspick(AT_FDCWD, "/tmp", FSPICK_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "ro", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.EE
+.in
+.P
+The above procedure is roughly equivalent to
+the following mount operation using
+.BR mount (2):
+.P
+.in +4n
+.EX
+mount(NULL, "/tmp", NULL, MS_REMOUNT | MS_RDONLY, NULL);
+.EE
+.in
+.P
+The notable caveat is that,
+in this example,
+.BR mount (2)
+will clear all other filesystem parameters
+(such as
+.B MS_NOSUID
+or
+.BR MS_NOEXEC );
+.BR fsconfig (2)
+will only modify the
+.I ro
+parameter.
+.SH SEE ALSO
+.BR fsconfig (2),
+.BR fsmount (2),
+.BR fsopen (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+
-- 
2.51.0

From nobody Thu Oct 2 07:46:32 2025
Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B15072580E2; Fri, 19 Sep 2025 02:00:28 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247231; cv=none; b=N8EMXkCeEf+BB/uR88viJpVTNq2ktSEimp8lIXJzmNkHF5nAtkcl3FDs98uK/AdoGFmZtpTozwpZpGrZoI2W5MeQBmxRDpMmEsKJErfeKFopOIBcBfd4GULVXQMcf6NLAUC41rSvjt9aAJy+x60pMPtbu/mhwr+xPqkSuv7iEJk=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247231; c=relaxed/simple; bh=5VzbBC8gFJiuW8HH6p57MDD6jcCR4XmaedvXXLuPU6o=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=F9dnI5pWNQZkDG0wm8uDKE5NdKYueyzkPsm7MWgOqZ4V9GULVn8J/APq2CS62B49/96JHKRRDKqR+/e/9iyYRZmQvHyW9zhmsQq7pgTHLiyHc1DR3tCiqK8wOiJrDSR9rRDKZrwbe8z77kQuGljQJxI9LX2M/zKVOH9O+vRZJbo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=ByHOUsmq; arc=none smtp.client-ip=80.241.56.151
Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="ByHOUsmq"
Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with
cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4cSbLc6R64z9tqX; Fri, 19 Sep 2025 04:00:24 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247224; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=7TmbrtI4vBNR2dcO7r1ZBQC4ADQOZltRsgZIZ8oSMTY=; b=ByHOUsmq+tHJ4l0yCXOdrwm94IyN+neoCnJRbeL7cFLnFDYXc22O76AuPvnD+jj5nrNtaD ti9tu8qeY8tADsWBFShg3zvFo7YOb1h+sEC8Ht+psUgegAPhUyREaVYkrkeYqshvSm/+uc +Dbng+tc28khqTyXeiK0OFOQWJ/nb7cO0xwVXXjUYXNbQJNXVpUNPOyOvLs2jUt/TNr8E2 G4E+gbRVGflPIoWxYeyk/AK7QRbyIaL8wgS7qtyh31Ug8GOrAcT3KD/E8y2zzfyj7xA/FU o9VI+QcvvL2YqxEHDkQEywgnfaZxFHghvur4dW6hpLP6EJi9bH4EooIHE14MRg== Authentication-Results: outgoing_mbo_mout; dkim=none; spf=pass (outgoing_mbo_mout: domain of cyphar@cyphar.com designates 2001:67c:2050:b231:465::2 as permitted sender) smtp.mailfrom=cyphar@cyphar.com From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:45 +1000 Subject: [PATCH v4 04/10] man/man2/fsconfig.2: document "new" mount API Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-4-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai
X-Developer-Signature: v=1; a=openpgp-sha256; l=20464; i=cyphar@cyphar.com; h=from:subject:message-id; bh=5VzbBC8gFJiuW8HH6p57MDD6jcCR4XmaedvXXLuPU6o=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2SmmzhQfIMX7ZanZyYlTpR5e/qtjvub5Hc0tWzaz6 rpU+4ke7ZjIwiDGxWAppsiyzc8zdNP8xVeSP61kg5nDygQyRFqkgYGBgYGFgS83Ma/USMdIz1Tb UM/QUMdIx4iBi1MApjp9OiPDvu232hfZ74i1v9S5+MjnF2zcnnLp8daLxeuf8+r/mXhUjuG//5f A/UWTQ9U2pJw8d23x9ulbpQuskvbMET7woPLFUZ+7LAA=
X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386
X-Rspamd-Queue-Id: 4cSbLc6R64z9tqX

This is loosely based on the original documentation written by David
Howells and later maintained by Christian Brauner, but has been
rewritten to be more from a user perspective (as well as fixing a few
critical mistakes).

Co-authored-by: David Howells
Signed-off-by: David Howells
Co-authored-by: Christian Brauner
Signed-off-by: Christian Brauner
Signed-off-by: Aleksa Sarai
---
 man/man2/fsconfig.2 | 727 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 727 insertions(+)

diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2
new file mode 100644
index 0000000000000000000000000000000000000000..5a18e08c700ac93aa22c341b4134944ee3c38d0b
--- /dev/null
+++ b/man/man2/fsconfig.2
@@ -0,0 +1,727 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsconfig 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsconfig \- configure new or existing filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include
+.P
+.BI "int fsconfig(int " fd ", unsigned int " cmd ,
+.BI "             const char *_Nullable " key ,
+.BI "             const void *_Nullable " value ", int " aux );
+.fi
+.SH DESCRIPTION
+The
+.BR fsconfig ()
+system call is
part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsconfig ()
+is used to supply parameters to
+and issue commands against
+the filesystem configuration context
+associated with the file descriptor
+.IR fd .
+Filesystem configuration contexts can be created with
+.BR fsopen (2)
+or be instantiated from an extant filesystem instance with
+.BR fspick (2).
+.P
+The
+.I cmd
+argument indicates the command to be issued.
+Some commands supply parameters to the context
+(equivalent to mount options specified with
+.BR mount (8)),
+while others are meta-operations on the filesystem context.
+The valid
+.I cmd
+values are:
+.RS
+.TP
+.B FSCONFIG_SET_FLAG
+Set the flag parameter named by
+.IR key .
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_STRING
+Set the string parameter named by
+.I key
+to the value specified by
+.IR value .
+.I value
+points to a null-terminated string,
+and
+.I aux
+must be 0.
+.TP
+.B FSCONFIG_SET_BINARY
+Set the blob parameter named by
+.I key
+to the contents of the binary blob
+specified by
+.IR value .
+.I value
+points to
+the start of a buffer
+that is
+.I aux
+bytes in length.
+.TP
+.B FSCONFIG_SET_FD
+Set the file parameter named by
+.I key
+to the open file description
+referenced by the file descriptor
+.IR aux .
+.I value
+must be NULL.
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for file parameters,
+with
+.I value
+set to a null-terminated string
+containing a base-10 representation
+of the file descriptor number.
+This mechanism is primarily intended for compatibility
+with older
+.BR mount (2)-based
+programs,
+and only works for parameters
+that
+.I only
+accept file descriptor arguments.
+.TP
+.B FSCONFIG_SET_PATH
+Set the path parameter named by
+.I key
+to the object at a provided path,
+resolved in a similar manner to
+.BR openat (2).
+.I value
+points to a null-terminated pathname string,
+and
+.I aux
+is equivalent to the
+.I dirfd
+argument to
+.BR openat (2).
+See
+.BR openat (2)
+for an explanation of the need for
+.BR \%FSCONFIG_SET_PATH .
+.IP
+You may also use
+.B \%FSCONFIG_SET_STRING
+for path parameters,
+the behaviour of which is equivalent to
+.B \%FSCONFIG_SET_PATH
+with
+.I aux
+set to
+.BR \%AT_FDCWD .
+.TP
+.B FSCONFIG_SET_PATH_EMPTY
+As with
+.BR \%FSCONFIG_SET_PATH ,
+except that if
+.I value
+is an empty string,
+the file descriptor specified by
+.I aux
+is operated on directly
+and may be any type of file
+(not just a directory).
+This is equivalent to the behaviour of
+.B \%AT_EMPTY_PATH
+with most "*at()" system calls.
+If
+.I aux
+is
+.BR \%AT_FDCWD ,
+the parameter will be set to
+the current working directory
+of the calling process.
+.TP
+.B FSCONFIG_CMD_CREATE
+This command instructs the filesystem driver
+to instantiate an instance of the filesystem in the kernel
+with the parameters specified in the filesystem configuration context.
+.I key
+and
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.IP
+This command can only be issued once
+in the lifetime of a filesystem context.
+If the operation succeeds,
+the filesystem context
+associated with file descriptor
+.I fd
+now references the created filesystem instance,
+and is placed into a special "awaiting-mount" mode
+that allows you to use
+.BR fsmount (2)
+to create a mount object from the filesystem instance.
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+If the operation fails,
+in most cases
+the filesystem context is placed in a failed mode
+and cannot be used for any further
+.BR fsconfig ()
+operations
+(though you may still retrieve diagnostic messages
+through the message retrieval interface,
+as described in
+the corresponding subsection of
+.BR fsopen (2)).
+.IP
+This command can only be issued against
+filesystem configuration contexts
+that were created with
+.BR fsopen (2).
+In order to create a filesystem instance,
+the calling process must have the
+.B \%CAP_SYS_ADMIN
+capability.
+.IP
+An important thing to be aware of is that
+the Linux kernel will
+.I silently
+reuse extant filesystem instances
+depending on the filesystem type
+and the configured parameters
+(each filesystem driver has
+its own policy for
+how filesystem instances are reused).
+This means that
+the filesystem instance "created" by
+.B \%FSCONFIG_CMD_CREATE
+may, in fact, be a reference
+to an extant filesystem instance in the kernel.
+(For reference,
+this behaviour also applies to
+.BR mount (2).)
+.IP
+One side-effect of this behaviour is that
+if an extant filesystem instance is reused,
+.I all
+parameters configured
+for this filesystem configuration context
+are
+.I silently ignored
+(with the exception of the
+.I ro
+and
+.I rw
+flag parameters;
+if the state of the read-only flag in the
+extant filesystem instance and the filesystem configuration context
+do not match, this operation will return
+.BR EBUSY ).
+This also means that
+.BR \%FSCONFIG_CMD_RECONFIGURE
+commands issued against
+the "created" filesystem instance
+will also affect any mount objects associated with
+the extant filesystem instance.
+.IP
+Programs that need to ensure
+that they create a new filesystem instance
+with specific parameters
+(notably, security-related parameters
+such as
+.I acl
+to enable POSIX ACLs\[em]as described in
+.BR acl (5))
+should use
+.B \%FSCONFIG_CMD_CREATE_EXCL
+instead.
+.TP
+.BR FSCONFIG_CMD_CREATE_EXCL " (since Linux 6.6)"
+.\" commit 22ed7ecdaefe0cac0c6e6295e83048af60435b13
+.\" commit 84ab1277ce5a90a8d1f377707d662ac43cc0918a
+As with
+.BR \%FSCONFIG_CMD_CREATE ,
+except that the kernel is instructed
+to not reuse extant filesystem instances.
+If the operation
+would be forced to
+reuse an extant filesystem instance,
+this operation will return
+.B EBUSY
+instead.
+.IP
+As a result (unlike
+.BR \%FSCONFIG_CMD_CREATE ),
+if this operation succeeds
+then the calling process can be sure that
+all of the parameters successfully configured with
+.BR fsconfig ()
+will actually be applied
+to the created filesystem instance.
+.TP
+.B FSCONFIG_CMD_RECONFIGURE
+This command instructs the filesystem driver
+to apply the parameters specified in the filesystem configuration context
+to the extant filesystem instance
+referenced by the filesystem configuration context.
+.I key
+and
+.I value
+must be NULL,
+and
+.I aux
+must be 0.
+.IP
+This is primarily intended for use with
+.BR fspick (2),
+but may also be used to modify
+the parameters of a filesystem instance
+after
+.BR \%FSCONFIG_CMD_CREATE
+was used to create it
+and a mount object was created using
+.BR fsmount (2).
+In order to reconfigure an extant filesystem instance,
+the calling process must have the
+.B CAP_SYS_ADMIN
+capability.
+.IP
+If the operation succeeds,
+the filesystem context is reset
+but remains in reconfiguration mode
+and thus can be reused for subsequent
+.B \%FSCONFIG_CMD_RECONFIGURE
+commands.
+If the operation fails,
+in most cases
+the filesystem context is placed in a failed mode
+and cannot be used for any further
+.BR fsconfig ()
+operations
+(though you may still retrieve diagnostic messages
+through the message retrieval interface,
+as described in
+the corresponding subsection of
+.BR fsopen (2)).
+.RE
+.P
+Parameters specified with
+.BI FSCONFIG_SET_ *
+do not take effect
+until a corresponding
+.B \%FSCONFIG_CMD_CREATE
+or
+.B \%FSCONFIG_CMD_RECONFIGURE
+command is issued.
+.SH RETURN VALUE
+On success,
+.BR fsconfig ()
+returns 0.
+On error, \-1 is returned, and
+.I errno
+is set to indicate the error.
+.SH ERRORS
+If an error occurs, the filesystem driver may provide
+additional information about the error
+through the message retrieval interface for filesystem configuration contexts.
+This additional information can be retrieved at any time by calling
+.BR read (2)
+on the filesystem instance or filesystem configuration context
+referenced by the file descriptor
+.IR fd .
+(See the "Message retrieval interface" subsection in
+.BR fsopen (2)
+for more details on the message format.)
+.P
+Even after an error occurs,
+the filesystem configuration context is
+.I not
+invalidated,
+and thus can still be used with other
+.BR fsconfig ()
+commands.
+This means that users can probe support for filesystem parameters
+on a per-parameter basis,
+and adjust which parameters they wish to set.
+.P
+The error values given below result from
+filesystem type independent errors.
+Each filesystem type may have its own special errors
+and its own special behaviour.
+See the Linux kernel source code for details.
+.TP
+.B EACCES
+A component of a path
+provided as a path parameter
+was not searchable.
+(See also
+.BR path_resolution (7).)
+.TP
+.B EACCES
+.B \%FSCONFIG_CMD_CREATE
+was attempted
+for a read-only filesystem
+without specifying the
+.RB ' ro '
+flag parameter.
+.TP
+.B EACCES
+A specified block device parameter
+is located on a filesystem
+mounted with the
+.B \%MS_NODEV
+option.
+.TP
+.B EBADF
+The file descriptor given by
+.I fd
+(or possibly by
+.IR aux ,
+depending on the command)
+is invalid.
+.TP
+.B EBUSY
+The filesystem context associated with
+.I fd
+is in the wrong state
+for the given command.
+.TP
+.B EBUSY
+The filesystem instance cannot be reconfigured as read-only
+with
+.B \%FSCONFIG_CMD_RECONFIGURE
+because some programs
+still hold files open for writing.
+.TP
+.B EBUSY
+A new filesystem instance was requested with
+.B \%FSCONFIG_CMD_CREATE_EXCL
+but a matching superblock already existed.
+.TP
+.B EFAULT
+One of the pointer arguments
+points to a location
+outside the calling process's accessible address space.
+.TP
+.B EINVAL
+.I fd
+does not refer to
+a filesystem configuration context
+or filesystem instance.
+.TP
+.B EINVAL
+One of the values of
+.IR key ,
+.IR value ,
+and/or
+.I aux
+was set to a non-zero value when
+.I cmd
+required that they be zero
+(or NULL).
+.TP
+.B EINVAL
+The parameter named by
+.I key
+cannot be set
+using the type specified with
+.IR cmd .
+.TP
+.B EINVAL
+One of the source parameters
+referred to
+an invalid superblock.
+.TP
+.B ELOOP
+Too many symbolic links encountered
+during pathname resolution
+of a path argument.
+.TP
+.B ENAMETOOLONG
+A path argument was longer than
+.BR PATH_MAX .
+.TP
+.B ENOENT
+A path argument had a non-existent component.
+.TP
+.B ENOENT
+A path argument is an empty string,
+but
+.I cmd
+is not
+.BR \%FSCONFIG_SET_PATH_EMPTY .
+.TP
+.B ENOMEM
+The kernel could not allocate sufficient memory to complete the operation.
+.TP
+.B ENOTBLK
+The parameter named by
+.I key
+must be a block device,
+but the provided parameter value was not a block device.
+.TP
+.B ENOTDIR
+A component of the path prefix
+of a path argument
+was not a directory.
+.TP
+.B EOPNOTSUPP
+The command given by
+.I cmd
+is not valid.
+.TP
+.B ENXIO
+The major number
+of a block device parameter
+is out of range.
+.TP
+.B EPERM
+The command given by
+.I cmd
+was
+.BR \%FSCONFIG_CMD_CREATE ,
+.BR \%FSCONFIG_CMD_CREATE_EXCL ,
+or
+.BR \%FSCONFIG_CMD_RECONFIGURE ,
+but the calling process does not have the required
+.B \%CAP_SYS_ADMIN
+capability.
+.SH STANDARDS
+Linux.
+.SH HISTORY
+Linux 5.2.
+.\" commit ecdab150fddb42fe6a739335257949220033b782
+.\" commit 400913252d09f9cfb8cce33daee43167921fc343
+glibc 2.36.
+.SH NOTES
+.SS Generic filesystem parameters
+Each filesystem driver is responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+meaning that individual filesystems
+may have very different behaviour
+when encountering parameters with the same name.
+In general,
+you should not assume that the behaviour of
+.BR fsconfig ()
+when specifying a parameter to one filesystem type
+will match the behaviour of the same parameter
+with a different filesystem type.
+.P
+However,
+the following generic parameters
+apply to all filesystems and have unified behaviour.
+They are set using the listed
+.BI \%FSCONFIG_SET_ *
+command.
+.TP
+\fIro\fP and \fIrw\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance is read-only.
+.TP
+\fIdirsync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Make directory changes on this filesystem instance synchronous.
+.TP
+\fIsync\fP and \fIasync\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether writes on this filesystem instance
+will be made synchronous
+(as though the
+.B O_SYNC
+flag to
+.BR open (2)
+was specified for
+all file opens in this filesystem instance).
+.TP
+\fIlazytime\fP and \fInolazytime\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether to reduce on-disk updates
+of inode timestamps on this filesystem instance
+(as described in the
+.B \%MS_LAZYTIME
+section of
+.BR mount (2)).
+.TP
+\fImand\fP and \fInomand\fP (\fB\%FSCONFIG_SET_FLAG\fP)
+Configure whether the filesystem instance should permit mandatory locking.
+Since Linux 5.15,
+.\" commit f7e33bdbd6d1bdf9c3df8bba5abcf3399f957ac3
+mandatory locking has been deprecated
+and setting this flag is a no-op.
+.TP
+\fIsource\fP (\fB\%FSCONFIG_SET_STRING\fP)
+This parameter is equivalent to the
+.I source
+parameter passed to
+.BR mount (2)
+for the same filesystem type,
+and is usually the pathname of a block device
+containing the filesystem.
+This parameter may only be set once
+per filesystem configuration context transaction.
+.P
+In addition,
+any filesystem parameters associated with
+Linux Security Modules (LSMs)
+are also generic with respect to the underlying filesystem.
+See the documentation for the LSM you wish to configure for more details.
+.SH CAVEATS
+.SS Filesystem parameter types
+As a result of
+each filesystem driver being responsible for
+parsing most parameters specified with
+.BR fsconfig (),
+some filesystem drivers
+may have unintuitive behaviour
+with regard to which
+.BI \%FSCONFIG_SET_ *
+commands are permitted
+to configure a given parameter.
+.P
+In order for
+filesystem parameters to be backwards compatible with
+.BR mount (2),
+they must be parseable as strings;
+this almost universally means that
+.B \%FSCONFIG_SET_STRING
+can also be used to configure them.
+.\" Aleksa Sarai
+.\" Theoretically, a filesystem could check fc->oldapi and refuse
+.\" FSCONFIG_SET_STRING if the operation is coming from the new API, but no
+.\" filesystems do this (and probably never will).
+However, other
+.BI \%FSCONFIG_SET_ *
+commands need to be opted into
+by each filesystem driver's parameter parser.
+.P
+One of the most user-visible instances of
+this inconsistency is that
+many filesystems do not support
+configuring path parameters with
+.B \%FSCONFIG_SET_PATH
+(despite the name),
+which can lead to somewhat confusing
+.B EINVAL
+errors.
+(For example, the generic
+.I source
+parameter\[em]which is usually a path\[em]can only be configured
+with
+.BR \%FSCONFIG_SET_STRING .)
+.P
+When writing programs that use
+.BR fsconfig ()
+to configure parameters
+with commands other than
+.BR \%FSCONFIG_SET_STRING ,
+users should verify
+that the
+.BI \%FSCONFIG_SET_ *
+commands used to configure each parameter
+are supported by the corresponding filesystem driver.
+.\" Aleksa Sarai
+.\" While this (quite confusing) inconsistency in behaviour is true today
+.\" (and has been true since this was merged), this appears to mostly be an
+.\" unintended consequence of filesystem drivers hand-coding fsparam parsing.
+.\" Path parameters are the most egregious causes of confusion. Hopefully we
+.\" can make this no longer the case in a future kernel.
+.SH EXAMPLES
+To illustrate the different kinds of flags that can be configured with
+.BR fsconfig (),
+here are a few examples of some different filesystems being created:
+.P
+.in +4n
+.EX
+int fsfd, mntfd;
+\&
+fsfd = fsopen("tmpfs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "inode64", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "uid", "1234", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "huge", "never", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "casefold", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOEXEC);
+move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH);
+\&
+fsfd = fsopen("erofs", FSOPEN_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/loop0", 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "acl", NULL, 0);
+fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE_EXCL, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NOSUID);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+Usually,
+specifying the same parameter named by
+.I key
+multiple times with
+.BR fsconfig ()
+causes the parameter value to be replaced.
+However, some filesystems may have unique behaviour:
+.P
+.in +4n
+.EX
+\&
+int fsfd, mntfd;
+int lowerdirfd = open("/o/ctr/lower1", O_DIRECTORY | O_CLOEXEC);
+\&
+fsfd = fsopen("overlay", FSOPEN_CLOEXEC);
+/* "lowerdir+" appends to the lower dir stack each time.
*/
+fsconfig(fsfd, FSCONFIG_SET_FD, "lowerdir+", NULL, lowerdirfd);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower2", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower3", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "lowerdir+", "/o/ctr/lower4", 0);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH, "lowerdir+", "/o/ctr/lower5", AT_FDCWD);
+.\" fsconfig(fsfd, FSCONFIG_SET_PATH_EMPTY, "lowerdir+", "", lowerdirfd);
+.\" Aleksa Sarai: Hopefully these will also be supported in the future.
+fsconfig(fsfd, FSCONFIG_SET_STRING, "xino", "auto", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "nfs_export", "off", 0);
+fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0);
+mntfd = fsmount(fsfd, FSMOUNT_CLOEXEC, 0);
+move_mount(mntfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH);
+.EE
+.in
+.P
+And here is an example of how
+.BR fspick (2)
+can be used with
+.BR fsconfig ()
+to reconfigure the parameters
+of an extant filesystem instance
+attached to
+.IR /proc :
+.P
+.in +4n
+.EX
+int fsfd = fspick(AT_FDCWD, "/proc", FSPICK_CLOEXEC);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "hidepid", "ptraceable", 0);
+fsconfig(fsfd, FSCONFIG_SET_STRING, "subset", "pid", 0);
+fsconfig(fsfd, FSCONFIG_CMD_RECONFIGURE, NULL, NULL, 0);
+.EE
+.in
+.SH SEE ALSO
+.BR fsmount (2),
+.BR fsopen (2),
+.BR fspick (2),
+.BR mount (2),
+.BR mount_setattr (2),
+.BR move_mount (2),
+.BR open_tree (2),
+.BR mount_namespaces (7)
+
-- 
2.51.0

From nobody Thu Oct 2 07:46:32 2025
Received: from mout-p-202.mailbox.org (mout-p-202.mailbox.org [80.241.56.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 62CAD2580E2; Fri, 19 Sep 2025 02:00:39 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.172
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247242; cv=none;
b=t9RGy4uyu3UXDsux8UlyNtXrciPXVzTVWRh3y8IHkmhGEQQjFlJIqw/AbUQwOxAi4hLihuIqNMuje9S8p8jZD8VvNAqoNhHGIBql1yc5cqoL16y/R+flp/rh03MuMdVYjp7tdY+O6GlkN+F3xt1bMQWkD48e/ehLW6/Y92WmPlQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247242; c=relaxed/simple; bh=qXaSaDudSthlQFTGMniG/9MOQZxPmVJfi5d4/BACW5Y=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=UyxEoHVIPEiB4wumXt9p/jXO/tQ4Bs2iN6uVbgjmckP7rRD/zyUkH8AyIBRxxgeQ8/2s2fsJcrm31QYq/kl/haxfl9fMNATLhbFODgAdt76Tsx7JjCeIZF6Tjnz2sHA53yTkd5YzfdIfMr5M5FsOSEj6i4At1xmL5vH3HDS5RZw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=IXqFy9VN; arc=none smtp.client-ip=80.241.56.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="IXqFy9VN" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-202.mailbox.org (Postfix) with ESMTPS id 4cSbLk3PCZz9t7f; Fri, 19 Sep 2025 04:00:30 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247230; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=8hJFrhCfM9NDkzHdXtwu3Sc4WrKTnmNW2zAaU4xBs3k=; b=IXqFy9VNV/boUP8WG4/Xgo1FufkQuhvhaBh5oD3tazdcpJVHOHovHYVcCVP4Nr6LHkf9kn 
kWgEMosE6ZAeo8KgDwn+TG7M4wfdBxE0/bCdZlqpTiwgtxtamlaOjaEpeFfbUQljGFU+BM 1YmvgMIcGY65y3g62taWXuAcGiwrSx2jCNKQkaUQd24T7mTv81LBvAxK9wmTN5y41z6bIp Ryo/hdABG097pyCuGZ7HTWxwsAxDuKuzkSli72g59MLMh0KHEPVIckwf1Kj7g0nz8UZlDv HspMElh1yElJn5HDWgmk5Cwiu9hyK5a1fhVG1iCJ4RPjyhu6fub1ii9S89pDkw== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:46 +1000 Subject: [PATCH v4 05/10] man/man2/fsmount.2: document "new" mount API Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-5-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=6666; i=cyphar@cyphar.com; h=from:subject:message-id; bh=qXaSaDudSthlQFTGMniG/9MOQZxPmVJfi5d4/BACW5Y=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2SmmGnns68/AMw9/MP4MSji0SP52yo/MO8bztktMV oi/w+ZT3DGRhUGMi8FSTJFlm59n6Kb5i68kf1rJBjOHlQlkiLRIAwMDAwMLA19uYl6pkY6Rnqm2 oZ6hoY6RjhEDF6cATHVWFSPDVr9bllln2l1qjyt7F7jNrdNa8aziuvKvwxwmr1c7f3Jcwcjw+h7 zNO8N3XvStYJkDhlHz7mmwCPOf6171zExpzUWZW3cAA== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 This is loosely based on the original documentation written by David Howells and later maintained by Christian Brauner, but has been rewritten to be more from a user perspective (as well as fixing a few critical mistakes). 
Co-authored-by: David Howells
Signed-off-by: David Howells
Co-authored-by: Christian Brauner
Signed-off-by: Christian Brauner
Signed-off-by: Aleksa Sarai
---
 man/man2/fsmount.2 | 231 +++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 231 insertions(+)

diff --git a/man/man2/fsmount.2 b/man/man2/fsmount.2
new file mode 100644
index 0000000000000000000000000000000000000000..c054c04376975c620aec08b76ad5151d8b6ae2ed
--- /dev/null
+++ b/man/man2/fsmount.2
@@ -0,0 +1,231 @@
+.\" Copyright, the authors of the Linux man-pages project
+.\"
+.\" SPDX-License-Identifier: Linux-man-pages-copyleft
+.\"
+.TH fsmount 2 (date) "Linux man-pages (unreleased)"
+.SH NAME
+fsmount \- instantiate mount object from filesystem context
+.SH LIBRARY
+Standard C library
+.RI ( libc ,\~ \-lc )
+.SH SYNOPSIS
+.nf
+.B #include
+.P
+.BI "int fsmount(int " fsfd ", unsigned int " flags ", \
+unsigned int " attr_flags );
+.fi
+.SH DESCRIPTION
+The
+.BR fsmount ()
+system call is part of
+the suite of file descriptor based mount facilities in Linux.
+.P
+.BR fsmount ()
+creates a new detached mount object
+for the root of the new filesystem instance
+referenced by the filesystem context file descriptor
+.IR fsfd .
+A new file descriptor
+associated with the detached mount object
+is then returned.
+In order to create a mount object with
+.BR fsmount (),
+the calling process must have the
+.BR \%CAP_SYS_ADMIN
+capability.
+.P
+The filesystem context must have been created with a call to
+.BR fsopen (2)
+and then had a filesystem instance instantiated with a call to
+.BR fsconfig (2)
+with
+.B \%FSCONFIG_CMD_CREATE
+or
+.B \%FSCONFIG_CMD_CREATE_EXCL
+in order to be in the correct state
+for this operation
+(the "awaiting-mount" mode in kernel-developer parlance).
+.\" FS_CONTEXT_AWAITING_MOUNT is the term the kernel uses for this.
+Unlike +.BR open_tree (2) +with +.BR \%OPEN_TREE_CLONE , +.BR fsmount () +can only be called once +in the lifetime of a filesystem context +to produce a mount object. +.P +As with file descriptors returned from +.BR open_tree (2) +called with +.BR OPEN_TREE_CLONE , +the returned file descriptor +can then be used with +.BR move_mount (2), +.BR mount_setattr (2), +or other such system calls to do further mount operations. +This mount object will be unmounted and destroyed +when the file descriptor is closed +if it was not otherwise attached to a mount point +by calling +.BR move_mount (2). +(Note that the unmount operation on +.BR close (2) +is lazy\[em]akin to calling +.BR umount2 (2) +with +.BR MNT_DETACH ; +any existing open references to files +from the mount object +will continue to work, +and the mount object will only be completely destroyed +once it ceases to be busy.) +The returned file descriptor +also acts the same as one produced by +.BR open (2) +with +.BR O_PATH , +meaning it can also be used as a +.I dirfd +argument +to "*at()" system calls. +.P +.I flags +controls the creation of the returned file descriptor. +A value for +.I flags +is constructed by bitwise ORing +zero or more of the following constants: +.RS +.TP +.B FSMOUNT_CLOEXEC +Set the close-on-exec +.RB ( FD_CLOEXEC ) +flag on the new file descriptor. +See the description of the +.B O_CLOEXEC +flag in +.BR open (2) +for reasons why this may be useful. +.RE +.P +.I attr_flags +specifies mount attributes +which will be applied to the created mount object, +in the form of +.BI \%MOUNT_ATTR_ * +flags. +The flags are interpreted as though +.BR mount_setattr (2) +was called with +.I attr.attr_set +set to the same value as +.IR attr_flags . +.BI \%MOUNT_ATTR_ * +flags which would require +specifying additional fields in +.BR mount_attr (2type) +(such as +.BR \%MOUNT_ATTR_IDMAP ) +are not valid flag values for +.IR attr_flags . 
+.P +If the +.BR fsmount () +operation is successful, +the filesystem context +associated with the file descriptor +.I fsfd +is reset +and placed into reconfiguration mode, +as if it were just returned by +.BR fspick (2). +You may continue to use +.BR fsconfig (2) +with the now-reset filesystem context, +including issuing the +.B \%FSCONFIG_CMD_RECONFIGURE +command +to reconfigure the filesystem instance. +.SH RETURN VALUE +On success, a new file descriptor is returned. +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EBUSY +The filesystem context associated with +.I fsfd +is not in the right state +to be used by +.BR fsmount (). +.TP +.B EINVAL +.I flags +had an invalid flag set. +.TP +.B EINVAL +.I attr_flags +had an invalid +.BI MOUNT_ATTR_ * +flag set. +.TP +.B EMFILE +The calling process has too many open files to create more. +.TP +.B ENFILE +The system has too many open files to create more. +.TP +.B ENOSPC +The "anonymous" mount namespace +necessary to contain the new mount object +could not be allocated, +as doing so would exceed +the configured per-user limit on +the number of mount namespaces in the current user namespace. +(See also +.BR namespaces (7).) +.TP +.B ENOMEM +The kernel could not allocate sufficient memory to complete the operation. +.TP +.B EPERM +The calling process does not have the required +.B CAP_SYS_ADMIN +capability. +.SH STANDARDS +Linux. +.SH HISTORY +Linux 5.2. +.\" commit 93766fbd2696c2c4453dd8e1070977e9cd4e6b6d +.\" commit 400913252d09f9cfb8cce33daee43167921fc343 +glibc 2.36. +.SH EXAMPLES +.in +4n +.EX +int fsfd, mntfd; +\& +fsfd =3D fsopen("tmpfs", FSOPEN_CLOEXEC); +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); +mntfd =3D fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV | MOUNT_ATTR_NOE= XEC); +\& +/* Create a new file without attaching the mount object. 
*/ +int tmpfd =3D openat(mntfd, "tmpfile", O_CREAT | O_EXCL | O_RDWR, 0600); +unlinkat(mntfd, "tmpfile", 0); +\& +/* Attach the mount object to "/tmp". */ +move_mount(mntfd, "", AT_FDCWD, "/tmp", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.SH SEE ALSO +.BR fsconfig (2), +.BR fsopen (2), +.BR fspick (2), +.BR mount (2), +.BR mount_setattr (2), +.BR move_mount (2), +.BR open_tree (2), +.BR mount_namespaces (7) + --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 66FC925D1F7; Fri, 19 Sep 2025 02:00:40 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247242; cv=none; b=STwahhdHMcbSU59WnkfOPB2Gw0bqzyjrLePBNrIrKgmp+ZeQ2Gk6PVi0UCXMUzcbpm6IIWrn04EllCak+VeJAAWewOya1sn10ZRoJtC7xmBEJxCJYU7751gukoO8qSX78ljK5QAtJAXHlC6q+dDkGoqCBppT59dykaiEbWBcLD4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247242; c=relaxed/simple; bh=Sb05VklF3fgJS6vByj9gvJbam3e2Vzsd40zH7kv+YyU=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=bYUP+xKIMxwotobylc9qtWpr8dNBBzngl0WMN8yUczbKzE7EeEzUXVXqZDYQl46iIlsXGkqcFmeXkKVFYzkfb1HSDmdBLJMz/9A3COQwNmb+UiuMtqmf8gdSQzSsC1OMl8GwkoGon5ZeCd2Rl5RZrrjw5GCrTcB20xAdDskQWAk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=yhNPm2wd; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com 
Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="yhNPm2wd" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4cSbLr5v60z9trP; Fri, 19 Sep 2025 04:00:36 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247236; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NKNF+NWEjGNddE6Mgt8eyKHz2K2Duy9bcteK6UeGcaY=; b=yhNPm2wd7Ou7lIQPBUEDMNz09Ml/PIUneAVAB3iZHPXAwPcsMaSigcRWib1pCGeHpccZ7+ k338ti9UtTp35sSyviC0vRqdSuKZ33pNz4Z8WNqTKCPMNcNBv3b9jD8vmqgVVwRmtCF/ZJ H51IHOAWN41EntHIZ/GjRt8TQtLd9s7FjBjmllw5ON9916S4Ra/E5IE3VesCJeI6/hxpFE p3W7clL80X+p638F4zaTg39d7pYm+PbJlmotMvqMXPN3kC+Z7Agd5k0PKrg7pO/ZyO7jUl UryKRwqkYNdHHwzj85rRgjpLS8d+ygMW2V0REP2znsLLqCvpvuPlVkHPzLZjGg== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:47 +1000 Subject: [PATCH v4 06/10] man/man2/move_mount.2: document "new" mount API Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-6-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=15586; i=cyphar@cyphar.com; h=from:subject:message-id; bh=Sb05VklF3fgJS6vByj9gvJbam3e2Vzsd40zH7kv+YyU=; b=kA0DAAoWKJf60rfpRG8ByyZiAGjMuRajvU/DckuWfism3vIuvUHTeW+76yJSthw3KfKxf0ITp YiRBAAWCgA5FiEEtk5JVbKfo9Rj8qkGKJf60rfpRG8FAmjMuRYbFIAAAAAABAAObWFudTIsMi41 KzEuMTEsMiwyAAoJECiX+tK36URvclcA/3k+GmN1Y6Je1Yd36Fad6wgjOhGuyJbcG39eWi5ShXs XAP4orvitXVbdks2yeofcSED72kfADRCzyKP/HnyXCKLpDw== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 This is loosely based on the original documentation written by David Howells and later maintained by Christian Brauner, but has been rewritten to be more from a user perspective (as well as fixing a few critical mistakes). Co-authored-by: David Howells Signed-off-by: David Howells Co-authored-by: Christian Brauner Signed-off-by: Christian Brauner Signed-off-by: Aleksa Sarai --- man/man2/move_mount.2 | 646 ++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 646 insertions(+) diff --git a/man/man2/move_mount.2 b/man/man2/move_mount.2 new file mode 100644 index 0000000000000000000000000000000000000000..13801d61ba0e99e45c693bb83b2= 2cd24b4c04f28 --- /dev/null +++ b/man/man2/move_mount.2 @@ -0,0 +1,646 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH move_mount 2 (date) "Linux man-pages (unreleased)" +.SH NAME +move_mount \- move or attach mount object to filesystem +.SH LIBRARY +Standard C library +.RI ( libc ,\~ \-lc ) +.SH SYNOPSIS +.nf +.BR "#include " " /* Definition of " AT_* " constants */" +.B #include +.P +.BI "int move_mount(int " from_dirfd ", const char *" from_path , +.BI " int " to_dirfd ", const char *" to_path , +.BI " unsigned int " flags ); +.fi +.SH 
DESCRIPTION +The +.BR move_mount () +system call is part of +the suite of file descriptor based mount facilities in Linux. +.P +.BR move_mount () +moves the mount object indicated by +.I from_dirfd +and +.I from_path +to the path indicated by +.I to_dirfd +and +.IR to_path . +The mount object being moved +can be an existing mount point in the current mount namespace, +or a detached mount object created by +.BR fsmount (2) +or +.BR open_tree (2) +with +.BR \%OPEN_TREE_CLONE . +.P +To access the source mount object +or the destination mount point, +no permissions are required on the object itself, +but if either pathname is supplied, +execute (search) permission is required +on all of the directories specified in +.I from_path +or +.IR to_path . +.P +The calling process must have the +.BR \%CAP_SYS_ADMIN +capability in order to move or attach a mount object. +.P +As with "*at()" system calls, +.BR move_mount () +uses the +.I from_dirfd +and +.I to_dirfd +arguments +in conjunction with the +.I from_path +and +.I to_path +arguments to determine the source and destination objects to operate on +(respectively), as follows: +.IP \[bu] 3 +If the pathname given in +.I *_path +is absolute, then +the corresponding +.I *_dirfd +is ignored. +.IP \[bu] +If the pathname given in +.I *_path +is relative and +the corresponding +.I *_dirfd +is the special value +.BR \%AT_FDCWD , +then +.I *_path +is interpreted relative to +the current working directory +of the calling process (like +.BR open (2)). +.IP \[bu] +If the pathname given in +.I *_path +is relative, +then it is interpreted relative to +the directory referred to by +the corresponding file descriptor +.I *_dirfd +(rather than relative to +the current working directory +of the calling process, +as is done by +.BR open (2) +for a relative pathname). +In this case, +the corresponding +.I *_dirfd +must be a directory +that was opened for reading +.RB ( O_RDONLY ) +or using the +.B O_PATH +flag. 
+.IP \[bu] +If +.I *_path +is an empty string, +and +.I flags +contains the appropriate +.BI \%MOVE_MOUNT_ * _EMPTY_PATH +flag, +then the corresponding file descriptor +.I *_dirfd +is operated on directly. +In this case, +the corresponding +.I *_dirfd +may refer to any type of file, +not just a directory. +.P +See +.BR openat (2) +for an explanation of why the +.I *_dirfd +arguments are useful. +.P +.I flags +can be used to control aspects of the path lookup +for both the source and destination objects, +as well as other properties of the mount operation. +A value for +.I flags +is constructed by bitwise ORing +zero or more of the following constants: +.RS +.TP +.B MOVE_MOUNT_F_EMPTY_PATH +If +.I from_path +is an empty string, operate on the file referred to by +.I from_dirfd +(which may have been obtained from +.BR open (2), +.BR fsmount (2), +or +.BR open_tree (2)). +In this case, +.I from_dirfd +may refer to any type of file, +not just a directory. +If +.I from_dirfd +is +.BR \%AT_FDCWD , +.BR move_mount () +will operate on the current working directory +of the calling process. +.IP +This is the most common mechanism +used to attach detached mount objects +produced by +.BR fsmount (2) +and +.BR open_tree (2) +to a mount point. +.TP +.B MOVE_MOUNT_T_EMPTY_PATH +As with +.BR \%MOVE_MOUNT_F_EMPTY_PATH , +except operating on +.I to_dirfd +and +.IR to_path . +.TP +.B MOVE_MOUNT_F_SYMLINKS +If +.IR from_path +references a symbolic link, +then dereference it. +The default behaviour for +.BR move_mount () +is to +.I not follow +symbolic links. +.TP +.B MOVE_MOUNT_T_SYMLINKS +As with +.BR \%MOVE_MOUNT_F_SYMLINKS , +except operating on +.I to_dirfd +and +.IR to_path . +.TP +.B MOVE_MOUNT_F_NO_AUTOMOUNT +Do not automount any automount points encountered +while resolving +.IR from_path . +This allows a mount object +that has an automount point at its root +to be moved +and prevents unintended triggering of an automount point. 
+This flag has no effect +if the automount point has already been mounted over. +.TP +.B MOVE_MOUNT_T_NO_AUTOMOUNT +As with +.BR \%MOVE_MOUNT_F_NO_AUTOMOUNT , +except operating on +.I to_dirfd +and +.IR to_path . +This allows an automount point to be manually mounted over. +.TP +.BR MOVE_MOUNT_SET_GROUP " (since Linux 5.15)" +Add the attached private-propagation mount object indicated by +.I to_dirfd +and +.I to_path +into the mount propagation "peer group" +of the attached non-private-propagation mount object indicated by +.I from_dirfd +and +.IR from_path . +.IP +Unlike other +.BR move_mount () +operations, +this operation does not move or attach any mount objects. +Instead, it only updates the metadata +of attached mount objects. +(Also, take careful note of +the argument order\[em]the mount object being modified +by this operation is the one specified by +.I to_dirfd +and +.IR to_path .) +.IP +This makes it possible to first create a mount tree +consisting only of private mounts +and then configure the desired propagation layout afterwards. +(See the "SHARED SUBTREES" section of +.BR mount_namespaces (7) +for more information about mount propagation and peer groups.) +.TP +.BR MOVE_MOUNT_BENEATH " (since Linux 6.5)" +If the path indicated by +.I to_dirfd +and +.I to_path +is an existing mount object, +rather than attaching or moving the mount object +indicated by +.I from_dirfd +and +.I from_path +on top of the mount stack, +attach or move it beneath the current top mount +on the mount stack. +.IP +After using +.BR \%MOVE_MOUNT_BENEATH , +it is possible to +.BR umount (2) +the top mount +in order to reveal the mount object +which was attached beneath it earlier. +This allows for the seamless (and atomic) replacement +of intricate mount trees, +which can further be used +to "upgrade" a mount tree with a newer version. 
+.IP +This operation has several restrictions: +.RS +.IP \[bu] 3 +Mount objects cannot be attached beneath the filesystem root, +including cases where +the filesystem root was configured by +.BR chroot (2) +or +.BR pivot_root (2). +To mount beneath the filesystem root, +.BR pivot_root (2) +must be used. +.IP \[bu] +The target path indicated by +.I to_dirfd +and +.I to_path +must not be a detached mount object, +such as those produced by +.BR open_tree (2) +with +.B \%OPEN_TREE_CLONE +or +.BR fsmount (2). +.IP \[bu] +The current top mount +of the target path's mount stack +and its parent mount +must be in the calling process's mount namespace. +.IP \[bu] +The caller must have sufficient privileges +to unmount the top mount +of the target path's mount stack, +to prove they have privileges +to reveal the underlying mount. +.IP \[bu] +Mount propagation events triggered by this +.BR move_mount () +operation +(as described in +.BR mount_namespaces (7)) +are calculated based on the parent mount +of the current top mount +of the target path's mount stack. +.IP \[bu] +The target path's mount +cannot be an ancestor in the mount tree of +the source mount object. +.IP \[bu] +The source mount object +must not have any overmounts, +otherwise it would be possible to create "shadow mounts" +(i.e., two mounts mounted on the same parent mount at the same mount point= ). +.IP \[bu] +It is not possible to move a mount +beneath a top mount +if the parent mount +of the current top mount +propagates to the top mount itself. +Otherwise, +.B \%MOVE_MOUNT_BENEATH +would cause the mount object +to be propagated +to the top mount +from the parent mount, +defeating the purpose of using +.BR \%MOVE_MOUNT_BENEATH . +.IP \[bu] +It is not possible to move a mount +beneath a top mount +if the parent mount +of the current top mount +propagates to the mount object +being mounted beneath. 
+Otherwise, this would cause a similar propagation issue +to the previous point, +also defeating the purpose of using +.BR \%MOVE_MOUNT_BENEATH . +.RE +.RE +.P +If +.I from_dirfd +is a mount object file descriptor and +.BR move_mount () +is operating on it directly, +.I from_dirfd +will remain associated with the mount object after +.BR move_mount () +succeeds, +so you may repeatedly use +.I from_dirfd +with +.BR move_mount () +and/or "*at()" system calls +as many times as necessary. +.SH RETURN VALUE +On success, +.BR move_mount () +returns 0. +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EACCES +Search permission is denied +for one of the directories +in the path prefix of one of +.I from_path +or +.IR to_path . +(See also +.BR path_resolution (7).) +.TP +.B EBADF +One of +.I from_dirfd +or +.I to_dirfd +is not a valid file descriptor. +.TP +.B EFAULT +One of +.I from_path +or +.I to_path +is NULL +or a pointer to a location +outside the calling process's accessible address space. +.TP +.B EINVAL +Invalid flag specified in +.IR flags . +.TP +.B EINVAL +The path indicated by +.I from_dirfd +and +.I from_path +is not a mount object. +.TP +.B EINVAL +The mount object type +of the source mount object and target inode +are not compatible +(i.e., the source is a file but the target is a directory, or vice-versa). +.TP +.B EINVAL +The source mount object or target path +is not in the calling process's mount namespace +(or an anonymous mount namespace of the calling process). +.TP +.B EINVAL +The source mount object's parent mount +has shared mount propagation, +and thus cannot be moved +(as described in +.BR mount_namespaces (7)). +.TP +.B EINVAL +The source mount has +.B MS_UNBINDABLE +child mounts +but the target path +resides on a mount tree with shared mount propagation, +which would otherwise cause the unbindable mounts to be propagated +(as described in +.BR mount_namespaces (7)). 
+.TP +.B EINVAL +.B \%MOVE_MOUNT_BENEATH +was attempted, +but one of the listed restrictions was violated. +.TP +.B ELOOP +Too many symbolic links encountered +when resolving one of +.I from_path +or +.IR to_path . +.TP +.B ENAMETOOLONG +One of +.I from_path +or +.I to_path +is longer than +.BR PATH_MAX . +.TP +.B ENOENT +A component of one of +.I from_path +or +.I to_path +does not exist. +.TP +.B ENOENT +One of +.I from_path +or +.I to_path +is an empty string, +but the corresponding +.BI MOVE_MOUNT_ * _EMPTY_PATH +flag is not specified in +.IR flags . +.TP +.B ENOTDIR +A component of the path prefix of one of +.I from_path +or +.I to_path +is not a directory, +or one of +.I from_path +or +.I to_path +is relative +and the corresponding +.I from_dirfd +or +.I to_dirfd +is a file descriptor referring to a file other than a directory. +.TP +.B ENOMEM +The kernel could not allocate sufficient memory to complete the operation. +.TP +.B EPERM +The calling process does not have the required +.B \%CAP_SYS_ADMIN +capability. +.SH STANDARDS +Linux. +.SH HISTORY +Linux 5.2. +.\" commit 2db154b3ea8e14b04fee23e3fdfd5e9d17fbc6ae +.\" commit 400913252d09f9cfb8cce33daee43167921fc343 +glibc 2.36. +.SH EXAMPLES +.BR move_mount () +can be used to move attached mounts like the following: +.P +.in +4n +.EX +move_mount(AT_FDCWD, "/a", AT_FDCWD, "/b", 0); +.EE +.in +.P +This would move the mount object mounted on +.I /a +to +.IR /b . 
+The above procedure is functionally equivalent to +the following mount operation +using +.BR mount (2): +.P +.in +4n +.EX +mount("/a", "/b", NULL, MS_MOVE, NULL); +.EE +.in +.P +.BR move_mount () +can also be used in conjunction with file descriptors returned from +.BR open_tree (2) +or +.BR open (2): +.P +.in +4n +.EX +int fd =3D open_tree(AT_FDCWD, "/mnt", 0); /* or open("/mnt", O_PATH); */ +move_mount(fd, "", AT_FDCWD, "/mnt2", MOVE_MOUNT_F_EMPTY_PATH); +move_mount(fd, "", AT_FDCWD, "/mnt3", MOVE_MOUNT_F_EMPTY_PATH); +move_mount(fd, "", AT_FDCWD, "/mnt4", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +This would move the mount object mounted at +.I /mnt +to +.IR /mnt2 , +then +.IR /mnt3 , +and then +.IR /mnt4 . +.P +If the source mount object +indicated by +.I from_dirfd +and +.I from_path +is a detached mount object, +.BR move_mount () +can be used to attach it to a mount point: +.P +.in +4n +.EX +int fsfd, mntfd; +\& +fsfd =3D fsopen("ext4", FSOPEN_CLOEXEC); +fsconfig(fsfd, FSCONFIG_SET_STRING, "source", "/dev/sda1", 0); +fsconfig(fsfd, FSCONFIG_SET_FLAG, "user_xattr", NULL, 0); +fsconfig(fsfd, FSCONFIG_CMD_CREATE, NULL, NULL, 0); +mntfd =3D fsmount(fsfd, FSMOUNT_CLOEXEC, MOUNT_ATTR_NODEV); +move_mount(mntfd, "", AT_FDCWD, "/home", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +This would create a new filesystem configuration context for ext4, +configure it, +create a detached mount object, +and then attach it to +.IR /home . 
+The above procedure is functionally equivalent to +the following mount operation +using +.BR mount (2): +.P +.in +4n +.EX +mount("/dev/sda1", "/home", "ext4", MS_NODEV, "user_xattr"); +.EE +.in +.P +The same operation also works with detached bind-mounts created with +.BR open_tree (2) +with +.BR OPEN_TREE_CLONE : +.P +.in +4n +.EX +int mntfd =3D open_tree(AT_FDCWD, "/home/cyphar", OPEN_TREE_CLONE); +move_mount(mntfd, "", AT_FDCWD, "/root", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +This would create a new bind-mount of +.I /home/cyphar +as a detached mount object, +and then attach it to +.IR /root . +The above procedure is functionally equivalent to +the following mount operation +using +.BR mount (2): +.P +.in +4n +.EX +mount("/home/cyphar", "/root", NULL, MS_BIND, NULL); +.EE +.in +.SH SEE ALSO +.BR fsconfig (2), +.BR fsmount (2), +.BR fsopen (2), +.BR fspick (2), +.BR mount (2), +.BR mount_setattr (2), +.BR open_tree (2), +.BR mount_namespaces (7) + --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-102.mailbox.org (mout-p-102.mailbox.org [80.241.56.152]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B6DB926C3B0; Fri, 19 Sep 2025 02:00:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.152 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247248; cv=none; b=hKRPbGFArt2JirVy42uqEtjv55IV3PVomwFhuAvmCyklXZuYIFxaLecgz4JeLdkZ8U0/nafOcpzzmQJooiM4ICXoAyvVmM9KpTtQVgT73NwzKy8fV9+6He4AufAOrc6jvKyOZadec4ird/L2upd2nP/zcC2fTWDH8KmiieRihhE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247248; c=relaxed/simple; bh=UWWn1MGgxGXNaKRGs/KQdtY4SShb0Y+xiQ6Y2RykYEo=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=LbBRbs6ohLUQCYyhL3xeEtQpigd1WstRyGOj2Up97ha2MHUA+hLZZryZZzdEcVOcZUjOWoUJipZGVtIbbgemln1DWv7cjyfhnIF6pHBZh+xT4962PBBwu0KwK7I9xL1xcdx2yWMfhqWDpjJCeXlHc8xvoG0FdQvOtFwhPfhTACs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=PPMdIvoD; arc=none smtp.client-ip=80.241.56.152 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="PPMdIvoD" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-102.mailbox.org (Postfix) with ESMTPS id 4cSbLy1Z0Pz9stW; Fri, 19 Sep 2025 04:00:42 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247242; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ncvjcQxwXCwpeHonR8DcNp6AC2Ym7so6AS6tZLngQMA=; b=PPMdIvoDRYAl3wXWiHVFbHXyW0QEeEWwev1FmM2jMLAzOLMOOh3eGoypx0WbvUQS1FIleY lJXrB6ugGOlgq6HURQixdEE8sNK1ak4mAevy6ygG4yQdYRqkU0dF4ZGzgYjrxaq4qdymRY guwW9D3ZUbSXqicmMOgHDKzqkIPhQLGxBj1ByKVvbhOXG4XDbD5YM0e73HhyGQ90+helwd W5T3svK6m7PXHCRbFEe36jZxVf4Hte12lORmhCVpzc6XQ0nJ12b0fUYknhHgCAvImaFxTv 34fXKf/yZLJvjW5qf8BXbrj+em4IcKQjHhGi0wBtjoDiKWd3eWSUKIwhP8l9Pw== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:48 +1000 Subject: [PATCH v4 07/10] man/man2/open_tree.2: document "new" mount API Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-7-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=12467; i=cyphar@cyphar.com; h=from:subject:message-id; bh=UWWn1MGgxGXNaKRGs/KQdtY4SShb0Y+xiQ6Y2RykYEo=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2Sluw5Ktf1+5pW36wco1PceuB75ZaZij4XnTKya64 seph6rHOiayMIhxMViKKbJs8/MM3TR/8ZXkTyvZYOawMoEMkRZpYGBgYGBh4MtNzCs10jHSM9U2 1DM01DHSMWLg4hSAqRYzY2TYeSgxX/h/2mmGvuK4xQZaa/7r2FYc8OtLelwldmjZNsG/jAwrRco lUh/Malh4r7ppqfSvX2fcl25eZKTveTV05VetN3NZAQ== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 This is loosely based on the original documentation written by David Howells and later maintained by Christian Brauner, but has been rewritten to be more from a user perspective (as well as fixing a few critical mistakes). 
Co-authored-by: David Howells Signed-off-by: David Howells Co-authored-by: Christian Brauner Signed-off-by: Christian Brauner Signed-off-by: Aleksa Sarai --- man/man2/open_tree.2 | 498 +++++++++++++++++++++++++++++++++++++++++++++++= ++++ 1 file changed, 498 insertions(+) diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2 new file mode 100644 index 0000000000000000000000000000000000000000..7f85df08b43c7b48a9d021dbbeb= 2c60092a2b2d4 --- /dev/null +++ b/man/man2/open_tree.2 @@ -0,0 +1,498 @@ +.\" Copyright, the authors of the Linux man-pages project +.\" +.\" SPDX-License-Identifier: Linux-man-pages-copyleft +.\" +.TH open_tree 2 (date) "Linux man-pages (unreleased)" +.SH NAME +open_tree \- open path or create detached mount object and attach to fd +.SH LIBRARY +Standard C library +.RI ( libc ,\~ \-lc ) +.SH SYNOPSIS +.nf +.BR "#define _GNU_SOURCE " "/* See feature_test_macros(7) */" +.BR "#include " " /* Definition of " AT_* " constants */" +.B #include +.P +.BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " fla= gs ); +.fi +.SH DESCRIPTION +The +.BR open_tree () +system call is part of +the suite of file descriptor based mount facilities in Linux. +.IP \[bu] 3 +If +.I flags +contains +.BR \%OPEN_TREE_CLONE , +.BR open_tree () +creates a detached mount object +which consists of a bind-mount of +the path specified by the +.IR path . +A new file descriptor +associated with the detached mount object +is then returned. +The mount object is equivalent to a bind-mount +that would be created by +.BR mount (2) +called with +.BR MS_BIND , +except that it is tied to a file descriptor +and is not mounted onto the filesystem. +.IP +As with file descriptors returned from +.BR fsmount (2), +the resultant file descriptor can then be used with +.BR move_mount (2), +.BR mount_setattr (2), +or other such system calls to do further mount operations. 
+This mount object will be unmounted and destroyed +when the file descriptor is closed +if it was not otherwise attached to a mount point +by calling +.BR move_mount (2). +(Note that the unmount operation on +.BR close (2) +is lazy\[em]akin to calling +.BR umount2 (2) +with +.BR MNT_DETACH ; +any existing open references to files +from the mount object +will continue to work, +and the mount object will only be completely destroyed +once it ceases to be busy.) +.IP \[bu] +If +.I flags +does not contain +.BR \%OPEN_TREE_CLONE , +.BR open_tree () +returns a file descriptor +that is exactly equivalent to +one produced by +.BR openat (2) +when called with the same +.I dirfd +and +.IR path . +.P +In either case, the resultant file descriptor +acts the same as one produced by +.BR open (2) +with +.BR O_PATH , +meaning it can also be used as a +.I dirfd +argument to +"*at()" system calls. +.P +As with "*at()" system calls, +.BR open_tree () +uses the +.I dirfd +argument in conjunction with the +.I path +argument to determine the path to operate on, as follows: +.IP \[bu] 3 +If the pathname given in +.I path +is absolute, then +.I dirfd +is ignored. +.IP \[bu] +If the pathname given in +.I path +is relative and +.I dirfd +is the special value +.BR \%AT_FDCWD , +then +.I path +is interpreted relative to +the current working directory +of the calling process (like +.BR open (2)). +.IP \[bu] +If the pathname given in +.I path +is relative, +then it is interpreted relative to +the directory referred to by the file descriptor +.I dirfd +(rather than relative to +the current working directory +of the calling process, +as is done by +.BR open (2) +for a relative pathname). +In this case, +.I dirfd +must be a directory +that was opened for reading +.RB ( O_RDONLY ) +or using the +.B O_PATH +flag. +.IP \[bu] +If +.I path +is an empty string, +and +.I flags +contains +.BR \%AT_EMPTY_PATH , +then the file descriptor +.I dirfd +is operated on directly. 
+In this case, +.I dirfd +may refer to any type of file, +not just a directory. +.P +See +.BR openat (2) +for an explanation of why the +.I dirfd +argument is useful. +.P +.I flags +can be used to control aspects of the path lookup +and properties of the returned file descriptor. +A value for +.I flags +is constructed by bitwise ORing +zero or more of the following constants: +.RS +.TP +.B \%AT_EMPTY_PATH +If +.I path +is an empty string, operate on the file referred to by +.I dirfd +(which may have been obtained from +.BR open (2), +.BR fsmount (2), +or from another +.BR open_tree () +call). +In this case, +.I dirfd +may refer to any type of file, not just a directory. +If +.I dirfd +is +.BR \%AT_FDCWD , +.BR open_tree () +will operate on the current working directory +of the calling process. +This flag is Linux-specific; define +.B \%_GNU_SOURCE +to obtain its definition. +.TP +.B \%AT_NO_AUTOMOUNT +Do not automount the terminal ("basename") component of +.I path +if it is a directory that is an automount point. +This allows you to create a handle to the automount point itself, +rather than the location it would mount. +This flag has no effect if the mount point has already been mounted over. +This flag is Linux-specific; define +.B \%_GNU_SOURCE +to obtain its definition. +.TP +.B \%AT_SYMLINK_NOFOLLOW +If +.I path +is a symbolic link, do not dereference it; instead, +create either a handle to the link itself +or a bind-mount of it. +The resultant file descriptor is indistinguishable from one produced by +.BR openat (2) +with +.BR \%O_PATH | O_NOFOLLOW . +.TP +.B \%OPEN_TREE_CLOEXEC +Set the close-on-exec +.RB ( FD_CLOEXEC ) +flag on the new file descriptor. +See the description of the +.B O_CLOEXEC +flag in +.BR open (2) +for reasons why this may be useful. +.TP +.B \%OPEN_TREE_CLONE +Rather than creating an +.BR openat (2)-style +.B O_PATH +file descriptor, +create a bind-mount of +.I path +(akin to +.IR "mount --bind" ) +as a detached mount object. 
+In order to do this operation, +the calling process must have the +.BR \%CAP_SYS_ADMIN +capability. +.TP +.B \%AT_RECURSIVE +Create a recursive bind-mount of the path +(akin to +.IR "mount --rbind" ) +as a detached mount object. +This flag is only permitted in conjunction with +.BR \%OPEN_TREE_CLONE . +.SH RETURN VALUE +On success, a new file descriptor is returned. +On error, \-1 is returned, and +.I errno +is set to indicate the error. +.SH ERRORS +.TP +.B EACCES +Search permission is denied for one of the directories +in the path prefix of +.IR path . +(See also +.BR path_resolution (7).) +.TP +.B EBADF +.I path +is relative but +.I dirfd +is neither +.B \%AT_FDCWD +nor a valid file descriptor. +.TP +.B EFAULT +.I path +is NULL +or a pointer to a location +outside the calling process's accessible address space. +.TP +.B EINVAL +Invalid flag specified in +.IR flags . +.TP +.B ELOOP +Too many symbolic links encountered when resolving +.IR path . +.TP +.B EMFILE +The calling process has too many open files to create more. +.TP +.B ENAMETOOLONG +.I path +is longer than +.BR PATH_MAX . +.TP +.B ENFILE +The system has too many open files to create more. +.TP +.B ENOENT +A component of +.I path +does not exist, or is a dangling symbolic link. +.TP +.B ENOENT +.I path +is an empty string, but +.B AT_EMPTY_PATH +is not specified in +.IR flags . +.TP +.B ENOTDIR +A component of the path prefix of +.I path +is not a directory, or +.I path +is relative and +.I dirfd +is a file descriptor referring to a file other than a directory. +.TP +.B ENOSPC +The "anonymous" mount namespace +necessary to contain the +.B \%OPEN_TREE_CLONE +detached bind-mount mount object +could not be allocated, +as doing so would exceed +the configured per-user limit on +the number of mount namespaces in the current user namespace. +(See also +.BR namespaces (7).) +.TP +.B ENOMEM +The kernel could not allocate sufficient memory to complete the operation. 
+.TP +.B EPERM +.I flags +contains +.B \%OPEN_TREE_CLONE +but the calling process does not have the required +.B CAP_SYS_ADMIN +capability. +.SH STANDARDS +Linux. +.SH HISTORY +Linux 5.2. +.\" commit a07b20004793d8926f78d63eb5980559f7813404 +.\" commit 400913252d09f9cfb8cce33daee43167921fc343 +glibc 2.36. +.SH NOTES +.SS Mount propagation +The bind-mount mount objects created by +.BR open_tree () +with +.B \%OPEN_TREE_CLONE +are not associated with +the mount namespace of the calling process. +Instead, each mount object is placed +in a newly allocated "anonymous" mount namespace +associated with the calling process. +.P +One of the side-effects of this is that +(unlike bind-mounts created with +.BR mount (2)), +mount propagation +(as described in +.BR mount_namespaces (7)) +will not be applied to bind-mounts created by +.BR open_tree () +until the bind-mount is attached with +.BR move_mount (2), +at which point the mount object +will be associated with the mount namespace +where it was attached +and mount propagation will resume. +Note that any mount propagation events that occurred +before the mount object was attached +will +.I not +be propagated to the mount object, +even after it is attached. +.SH EXAMPLES +The following examples show how +.BR open_tree () +can be used in place of more traditional +.BR mount (2) +calls with +.BR MS_BIND . +.P +.in +4n +.EX +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE); +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +First, +a detached bind-mount mount object of +.I /var +is created +and associated with the file descriptor +.IR srcfd . +Then, the mount object is attached to +.I /mnt +using +.BR move_mount (2) +with +.B \%MOVE_MOUNT_F_EMPTY_PATH +to request that the detached mount object +associated with the file descriptor +.I srcfd +be moved (and thus attached) to +.IR /mnt .
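The two calls just described can also be sketched with error handling added. This is only an illustration under stated assumptions, not text from the patch: the helper name bind_mount_sketch is hypothetical, the calls go through syscall(2) so that the sketch does not depend on the glibc 2.36 wrappers, and the #ifndef fallback values are assumed to match the Linux UAPI headers on common architectures.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <sys/syscall.h>  /* SYS_* constants, where the headers provide them */
#include <unistd.h>       /* syscall(), close() */

/* Assumed fallbacks for older headers; values believed to mirror the UAPI. */
#ifndef SYS_open_tree
#define SYS_open_tree 428
#endif
#ifndef SYS_move_mount
#define SYS_move_mount 429
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef MOVE_MOUNT_F_EMPTY_PATH
#define MOVE_MOUNT_F_EMPTY_PATH 0x00000004
#endif

/*
 * Hypothetical helper: bind-mount src onto dst.
 * Returns 0 on success, or -1 with errno set by the failing system call.
 */
int bind_mount_sketch(const char *src, const char *dst)
{
    int srcfd, saved_errno;
    long ret;

    /* Step 1: create a detached bind-mount of src. */
    srcfd = (int) syscall(SYS_open_tree, AT_FDCWD, src, OPEN_TREE_CLONE);
    if (srcfd < 0)
        return -1;

    /* Step 2: attach the detached mount object at dst. */
    ret = syscall(SYS_move_mount, srcfd, "", AT_FDCWD, dst,
                  MOVE_MOUNT_F_EMPTY_PATH);

    /* Closing srcfd after a failed attach destroys the detached mount. */
    saved_errno = errno;
    close(srcfd);
    errno = saved_errno;

    return ret < 0 ? -1 : 0;
}
```

A caller with CAP_SYS_ADMIN would invoke this as bind_mount_sketch("/var", "/mnt"); without that capability the first step is expected to fail with EPERM, as listed under ERRORS.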
+.P +The above procedure is functionally equivalent to +the following mount operation using +.BR mount (2): +.P +.in +4n +.EX +mount("/var", "/mnt", NULL, MS_BIND, NULL); +.EE +.in +.P +.B \%OPEN_TREE_CLONE +can be combined with +.B \%AT_RECURSIVE +to create recursive detached bind-mount mount objects, +which in turn can be attached to mount points +to create recursive bind-mounts. +.P +.in +4n +.EX +int srcfd = open_tree(AT_FDCWD, "/var", OPEN_TREE_CLONE | AT_RECURSIVE); +move_mount(srcfd, "", AT_FDCWD, "/mnt", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +The above procedure is functionally equivalent to +the following mount operation using +.BR mount (2): +.P +.in +4n +.EX +mount("/var", "/mnt", NULL, MS_BIND | MS_REC, NULL); +.EE +.in +.P +One of the primary benefits of using +.BR open_tree () +and +.BR move_mount (2) +over the traditional +.BR mount (2) +is that operating with +.IR dirfd -style +file descriptors is far easier and more intuitive. +.P +.in +4n +.EX +int srcfd = open_tree(100, "", AT_EMPTY_PATH | OPEN_TREE_CLONE); +move_mount(srcfd, "", 200, "foo", MOVE_MOUNT_F_EMPTY_PATH); +.EE +.in +.P +The above procedure is roughly equivalent to +the following mount operation using +.BR mount (2): +.P +.in +4n +.EX +mount("/proc/self/fd/100", "/proc/self/fd/200/foo", NULL, MS_BIND, NULL); +.EE +.in +.P +In addition, you can use the file descriptor returned by +.BR open_tree () +as the +.I dirfd +argument to any "*at()" system calls: +.P +.in +4n +.EX +int dirfd, fd; +\& +dirfd = open_tree(AT_FDCWD, "/etc", OPEN_TREE_CLONE); +fd = openat(dirfd, "passwd", O_RDONLY); +fchmodat(dirfd, "shadow", 0000, 0); +close(dirfd); +close(fd); +/* The bind-mount is now destroyed.
*/ +.EE +.in +.SH SEE ALSO +.BR fsconfig (2), +.BR fsmount (2), +.BR fsopen (2), +.BR fspick (2), +.BR mount (2), +.BR mount_setattr (2), +.BR move_mount (2), +.BR mount_namespaces (7) --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-103.mailbox.org (mout-p-103.mailbox.org [80.241.56.161]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 863C41E5B7B; Fri, 19 Sep 2025 02:00:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.161 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247254; cv=none; b=kcoJtbbrWBrf+pJwSGuIm1BY2mzzYDmh8tX2/4nAsTWDiucvkdaarswybFWLf9AjOF1XHWkYAG4q9ydqdaT+3Fz0Wq9hI5lHouZv6Ibq+sTIHwIlOz0za6iuOG+NIa7rv23BFC/usmapUDwUJUxygI5ch9+WW9TcuMcz35Lx/yE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247254; c=relaxed/simple; bh=xtWZXUuEoGedg7jF5ZMY70agUr5S23SYJpUH91uXjPQ=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=A0eWAqtL2/cfAxwV+ePj6fz+xdSJ4oTfn0NuspgHawIW06SSQM0CSlIb5hiFFZxW2DSOdxWbjRdWjyKjWj6MCdBxlBTRwyRRApZ4PITE8d7gTbsSXL2Y6n+nbFixyTjKI7hLLruu3kuF8NUgoFAogDcf5tz+QRS20oCDaVnltA8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=tFT97XWC; arc=none smtp.client-ip=80.241.56.161 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="tFT97XWC" Received: from smtp2.mailbox.org (smtp2.mailbox.org [10.196.197.2]) (using TLSv1.3 with cipher 
TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-103.mailbox.org (Postfix) with ESMTPS id 4cSbM35YMRz9sy4; Fri, 19 Sep 2025 04:00:47 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247247; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=NOFD29nYgylZR4y6lrNSHBTMAjtJSZxxyA8MWra0xHY=; b=tFT97XWCqbRzLQgFcFkPN+vv0ykrRB9WPUIRTya4/sBtZ/SkYxKIw6U+cpOYCibMhRee81 j2Fr2vfRQhHEtFP5Jyz0Pl0n9AtzxwrUCXfX3QhIHMHMmu/HUIOPh4aLXzEEzGeSxFvz9x ZAZ1dgjI1aqfv/McLgm1jGgO3H4PKifkoK1pofhhq1/OfqPLlKLTk3nNRjkxaCNiCjQh6Y IJew3E+e1wXOzXM6tpg69LR1AoVKMaXDVQFeg+VmNOyN8TeRGkUFZCcs8+ra577/UO3jXj RaSPvpj5ZETEoxDLk/4njB7f52c2dWvyAkyKMtNOsF9PPBn1fvAZ7JbbZUnR1g== From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:49 +1000 Subject: [PATCH v4 08/10] man/man2/mount_setattr.2: mirror opening sentence from fsopen(2) Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-8-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=1026; i=cyphar@cyphar.com; h=from:subject:message-id; bh=xtWZXUuEoGedg7jF5ZMY70agUr5S23SYJpUH91uXjPQ=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2Sk+8WesWcfbRcWd99eYMzryB4qxqm/RmsoZ1y3W+ PzZV62ojoksDGJcDJZiiizb/DxDN81ffCX500o2mDmsTCBDpEUaGBgYGFgY+HIT80qNdIz0TLUN 9QwNdYx0jBi4OAVgqvWVGP4nP450WWvwIcTQpbOGt5Rzx/WNTa8LZqYxzAqWm9AYPk2E4Q/f0Q9 +Zy/nfrnfW7ml/bDSXN6e65bXl87b/FHh8Nr0W47MAA== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 All of the other new mount API docs have this lead-in sentence in order to make this set of APIs feel a little bit more cohesive. Despite being a bit of a latecomer, mount_setattr(2) is definitely part of this family of APIs and so deserves the same treatment. Signed-off-by: Aleksa Sarai --- man/man2/mount_setattr.2 | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2 index 4b55f6d2e09d00d9bc4b3a085f310b1b459f34e8..b27db5b96665cfb0c387bf5b607= 76d45e0139956 100644 --- a/man/man2/mount_setattr.2 +++ b/man/man2/mount_setattr.2 @@ -19,7 +19,11 @@ .SH SYNOPSIS .SH DESCRIPTION The .BR mount_setattr () -system call changes the mount properties of a mount or an entire mount tre= e. +system call is part of +the suite of file descriptor based mount facilities in Linux. +.P +.BR mount_setattr () +changes the mount properties of a mount or an entire mount tree. 
If .I path is relative, --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-101.mailbox.org (mout-p-101.mailbox.org [80.241.56.151]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E16C127145F; Fri, 19 Sep 2025 02:00:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.151 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247258; cv=none; b=EkJzTPWHNYG0+EI4uJtDZqIuza3nP9aVF+Opb1WF/auxX6bzkp8HGhMOjb2bdS2kH1IfIpnsrXVuVWGB4TqG5pzFOcAXV+7Gp+AT94VKAckOsAWx4s51FmZQ+vszMXvf3L7gFrs4diL/kYdWgZpgvg394Kd7/bl5wlJQfpXu68s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247258; c=relaxed/simple; bh=Q4hSoiAb/K2mL5SLnNoky1JXH917wZ2SexMlx1jzyuE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=PGfNE5IDI+KCm/eLvEKFLgOyN8B3yNMnVpsuocu+3AoRnL7XfuuXQ5+yC6vg5GwudpouBjdJbYj8kdZYC+iXbu4+72K024giyBSPN4+LByHsgR8QU1X1wAYrFNuKixSFVgm0evuJ5MxV04k+x6Ov0zQw7R3W/k2WG8ZBjiUZPnA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=aZWr2Oje; arc=none smtp.client-ip=80.241.56.151 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="aZWr2Oje" Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) 
by mout-p-101.mailbox.org (Postfix) with ESMTPS id 4cSbM91vDKz9trP; Fri, 19 Sep 2025 04:00:53 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247253; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=26uKr0RIVJgFAWLRhEGDW585pCyl1M382u0FfPwNJNQ=; b=aZWr2OjeT7/VPK2cirmIy/xf8k/TGAkoaOY2SAQH7LrQnxTO0lTU5iyDU2luhOpka6tbCC taBLmvbFtGbnmusOLHeUzFqS1I8k8XpYmxouX+xFTpYwQfCHdEfshwwMtJlwrCF0YWwQVd UeydeOh85k1tkKteMB0yy2en3ToRPl3nXHsGnmwZ+IbCiNn371Q7sLQ3EgxkQHaF9wLxia hf2o5i6CGcadUBWYkwXZ4nBVZtBJPOZquh5yNcwA5qBZ4AAbeg6M0SrLwBNsLlKDkI/VqJ it/B3UxKi2lM2o8WOvToOriarpNGY02yIWBlQ4QpnHIIt1T/gDTx6yukUFoNLQ== Authentication-Results: outgoing_mbo_mout; dkim=none; spf=pass (outgoing_mbo_mout: domain of cyphar@cyphar.com designates 2001:67c:2050:b231:465::2 as permitted sender) smtp.mailfrom=cyphar@cyphar.com From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:50 +1000 Subject: [PATCH v4 09/10] man/man2/open_tree{,_attr}.2: document new open_tree_attr() API Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-9-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. 
Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=5720; i=cyphar@cyphar.com; h=from:subject:message-id; bh=Q4hSoiAb/K2mL5SLnNoky1JXH917wZ2SexMlx1jzyuE=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2SlukTFzp7qNtEtPmaHJ7BNuO1IFfv54wfdX5o2HB NuiVw33OyayMIhxMViKKbJs8/MM3TR/8ZXkTyvZYOawMoEMkRZpYGBgYGBh4MtNzCs10jHSM9U2 1DM01DHSMWLg4hSAqU7fxPA/b3Fn3sN6zdk6P+Onrsnw3d4azZ52rfzTq+pzv/IZru65zvBXYIW /0joZ0ZKTYqK/FXKqs/6sd22MMBZ85/x0jcWfK+4MAA== X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 X-Rspamd-Queue-Id: 4cSbM91vDKz9trP This is a new API added in Linux 6.15, and is effectively just a minor expansion of open_tree(2) in order to allow for MOUNT_ATTR_IDMAP to be changed for an existing ID-mapped mount. glibc does not yet have a wrapper for this. While working on this man-page, I discovered a bug in open_tree_attr(2) that accidentally permitted changing MOUNT_ATTR_IDMAP for extant detached ID-mapped mount objects. This is definitely a bug, but there is no need to add this to BUGS because the patch to fix this has already been accepted (slated for 6.18, and will be backported to 6.15+). 
Cc: Christian Brauner Signed-off-by: Aleksa Sarai --- man/man2/open_tree.2 | 140 ++++++++++++++++++++++++++++++++++++++++++++++ man/man2/open_tree_attr.2 | 1 + 2 files changed, 141 insertions(+) diff --git a/man/man2/open_tree.2 b/man/man2/open_tree.2 index 7f85df08b43c7b48a9d021dbbeb2c60092a2b2d4..60de4313a9d5be4ef3ff1217051f252506a2ade9 100644 --- a/man/man2/open_tree.2 +++ b/man/man2/open_tree.2 @@ -15,7 +15,19 @@ .SH SYNOPSIS .B #include .P .BI "int open_tree(int " dirfd ", const char *" path ", unsigned int " flags ); +.P +.BR "#include " " /* Definition of " SYS_* " constants */" +.P +.BI "int syscall(SYS_open_tree_attr, int " dirfd ", const char *" path , +.BI " unsigned int " flags ", struct mount_attr *_Nullable " attr ", \ +size_t " size ); .fi +.P +.IR Note : +glibc provides no wrapper for +.BR open_tree_attr (), +necessitating the use of +.BR syscall (2). .SH DESCRIPTION The .BR open_tree () @@ -246,6 +258,129 @@ .SH DESCRIPTION as a detached mount object. This flag is only permitted in conjunction with .BR \%OPEN_TREE_CLONE . +.SS open_tree_attr() +The +.BR open_tree_attr () +system call operates in exactly the same way as +.BR open_tree (), +except for the differences described here. +.P +After performing the same operation as with +.BR open_tree (), +.BR open_tree_attr () +will apply the mount attribute changes described in +.I attr +to the file descriptor before it is returned. +(See +.BR mount_attr (2type) +for a description of the +.I mount_attr +structure. +As described in +.BR mount_setattr (2), +.I size +must be set to +.I sizeof(struct mount_attr) +in order to support future extensions.) +If +.I attr +is NULL, +or has +.IR attr.attr_clr , +.IR attr.attr_set , +and +.I attr.propagation +all set to zero, +then +.BR open_tree_attr () +has identical behaviour to +.BR open_tree ().
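Since glibc provides no wrapper, a call to open_tree_attr() through syscall(2) might look like the following minimal sketch. Several things here are assumptions made for illustration and are not taken from the patch: the helper name open_tree_rdonly_sketch, the local struct mount_attr_sketch (meant to mirror the initial four-field layout of struct mount_attr), the fallback system call number 467, and the choice of MOUNT_ATTR_RDONLY as the attribute to set.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>        /* AT_FDCWD */
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>  /* SYS_* constants, where the headers provide them */
#include <unistd.h>       /* syscall() */

/* Assumed fallbacks for headers that predate Linux 6.15. */
#ifndef SYS_open_tree_attr
#define SYS_open_tree_attr 467
#endif
#ifndef OPEN_TREE_CLONE
#define OPEN_TREE_CLONE 1
#endif
#ifndef MOUNT_ATTR_RDONLY
#define MOUNT_ATTR_RDONLY 0x00000001
#endif

/* Local mirror (assumption) of the first-version struct mount_attr. */
struct mount_attr_sketch {
    uint64_t attr_set;
    uint64_t attr_clr;
    uint64_t propagation;
    uint64_t userns_fd;
};

/*
 * Hypothetical helper: clone the mount at (dirfd, path) as a detached
 * mount object with the read-only mount attribute set.
 * Returns a file descriptor, or -1 with errno set.
 */
int open_tree_rdonly_sketch(int dirfd, const char *path)
{
    struct mount_attr_sketch attr;

    /* Zero everything first so that unused fields request no change. */
    memset(&attr, 0, sizeof(attr));
    attr.attr_set = MOUNT_ATTR_RDONLY;

    /* size is the size of the structure being passed. */
    return (int) syscall(SYS_open_tree_attr, dirfd, path,
                         OPEN_TREE_CLONE, &attr, sizeof(attr));
}
```

On kernels older than 6.15 the call is expected to fail with ENOSYS, so a caller may want to fall back to open_tree() followed by mount_setattr(2).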
+.P +The application of +.I attr +to the resultant file descriptor +has identical semantics to +.BR mount_setattr (2), +except for the following extensions and general caveats: +.IP \[bu] 3 +Unlike +.BR mount_setattr (2) +called with a regular +.B OPEN_TREE_CLONE +detached mount object from +.BR open_tree (), +.BR open_tree_attr () +can specify a different setting for +.B \%MOUNT_ATTR_IDMAP +to the original mount object cloned with +.BR OPEN_TREE_CLONE . +.IP +Adding +.B \%MOUNT_ATTR_IDMAP +to +.I attr.attr_clr +will disable ID-mapping for the new mount object; +adding +.B \%MOUNT_ATTR_IDMAP +to +.I attr.attr_set +will configure the mount object to have the ID-mapping defined by +the user namespace referenced by the file descriptor +.IR attr.userns_fd . +(The semantics of which are identical to when +.BR mount_setattr (2) +is used to configure +.BR \%MOUNT_ATTR_IDMAP .) +.IP +Changing or removing the mapping +of an ID-mapped mount is only permitted +if a new detached mount object is being created with +.I flags +including +.BR \%OPEN_TREE_CLONE . +.\" Aleksa Sarai +.\" At time of writing, this is not actually true because of a bug where +.\" open_tree_attr() would accidentally permit changing MOUNT_ATTR_IDMAP for +.\" existing detached mount objects without setting OPEN_TREE_CLONE, but a +.\" patch to fix it has been slated for 6.18 and will be backported to 6.15+. +.\" +.IP \[bu] +If +.I flags +contains +.BR \%AT_RECURSIVE , +then the attributes described in +.I attr +are applied recursively +(just as when +.BR mount_setattr (2) +is called with +.BR \%AT_RECURSIVE ). +However, this applies in addition to the +.BR open_tree ()-specific +behaviour regarding +.BR \%AT_RECURSIVE , +and thus +.I flags +must also contain +.BR \%OPEN_TREE_CLONE . +.P +Note that if +.I flags +does not contain +.BR \%OPEN_TREE_CLONE , +.BR open_tree_attr () +will attempt to modify the mount attributes of +the mount object attached at +the path described by +.I dirfd +and +.IR path .
+As with +.BR mount_setattr (2), +if said path is not a mount point, +.BR open_tree_attr () +will return an error. .SH RETURN VALUE On success, a new file descriptor is returned. On error, \-1 is returned, and @@ -339,10 +474,15 @@ .SH ERRORS .SH STANDARDS Linux. .SH HISTORY +.SS open_tree() Linux 5.2. .\" commit a07b20004793d8926f78d63eb5980559f7813404 .\" commit 400913252d09f9cfb8cce33daee43167921fc343 glibc 2.36. +.SS open_tree_attr() +Linux 6.15. +.\" commit c4a16820d90199409c9bf01c4f794e1e9e8d8fd8 +.\" commit 7a54947e727b6df840780a66c970395ed9734ebe .SH NOTES .SS Mount propagation The bind-mount mount objects created by diff --git a/man/man2/open_tree_attr.2 b/man/man2/open_tree_attr.2 new file mode 100644 index 0000000000000000000000000000000000000000..e57269bbd269bcce0b0a9744256= 44ba75e379f2f --- /dev/null +++ b/man/man2/open_tree_attr.2 @@ -0,0 +1 @@ +.so man2/open_tree.2 --=20 2.51.0 From nobody Thu Oct 2 07:46:32 2025 Received: from mout-p-103.mailbox.org (mout-p-103.mailbox.org [80.241.56.161]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8BCC025A2B5; Fri, 19 Sep 2025 02:01:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=80.241.56.161 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247264; cv=none; b=KBr/yAeGCJu8YgozXr0yl+C0sgiQXQpjGUb9lIKxs1tg/JiEdJHSwNx2nYUxwgL+/8r8IZ/sAmwqK0kq97BgtEo1I/m6rjFS6Vdb5SpT3SpPPVGmThICKwv20pakkkD0l4yz4hdbD3Q5TUpVwFt8ZlkPvWw//jpnKi8MIOzafgY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758247264; c=relaxed/simple; bh=k0P5TLKIQcNvPpE/yRue2NqzcDOYWE1LQ0jR8gyJSr0=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; 
b=nz8MCpvGJiKjB+iJJZ5SfSUcZ229gLu+9C5KJiSgT56TKnkBFrnWVw6S0TTyzLEeFLfx7Plv+j9TQYAFdlXRUDYcDwQwwxud5YcpJj+8BkcMTYhrOIV3squsSJXlWIGsFEGARiqmC0Ul88m+4zlaaRfXbCxreMg2SsvZ/Oe6HJg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com; spf=pass smtp.mailfrom=cyphar.com; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b=gXi+2/Ap; arc=none smtp.client-ip=80.241.56.161 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=cyphar.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=cyphar.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=cyphar.com header.i=@cyphar.com header.b="gXi+2/Ap" Received: from smtp2.mailbox.org (smtp2.mailbox.org [IPv6:2001:67c:2050:b231:465::2]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) by mout-p-103.mailbox.org (Postfix) with ESMTPS id 4cSbMG4lQ2z9sy4; Fri, 19 Sep 2025 04:00:58 +0200 (CEST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cyphar.com; s=MBO0001; t=1758247258; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=u+pMwDP1HnUqPZvsXpTH0VTEoRpEk896w42KFmX45ms=; b=gXi+2/ApWAL6NrqyDSeQUG3uNM9IqM+0fJKERWEiseZ8fmSJcYfojWilXGv+0ZQ3ffWpnW Mltr0xS2+4PNEFuyp/RShMB+37nvIhUJiQaR5Yea8DbBkupn5FZEzRJjBgrF1dSC2OH2wU RClxRbnclRCl6KSpNADP5sQDXJ6cgwzetHh3XXCJapD3KK3XT9yZeHbZw/OGMSYQVgguzW WEe6blS6wunR+04qT2fAbglbLfSIXY9uoA8OR0YkTi8fymBvAci2OTWhvRS6s327foHPEi qRnOGqASClaJy6vtYI2UTtqMPBGKJWNcKAMx5e6Y092LabMLuHJpFYVDZ965OA== Authentication-Results: outgoing_mbo_mout; dkim=none; spf=pass (outgoing_mbo_mout: domain of cyphar@cyphar.com designates 
2001:67c:2050:b231:465::2 as permitted sender) smtp.mailfrom=cyphar@cyphar.com From: Aleksa Sarai Date: Fri, 19 Sep 2025 11:59:51 +1000 Subject: [PATCH v4 10/10] man/man2/{fsconfig,mount_setattr}.2: add note about attribute-parameter distinction Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250919-new-mount-api-v4-10-1261201ab562@cyphar.com> References: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> In-Reply-To: <20250919-new-mount-api-v4-0-1261201ab562@cyphar.com> To: Alejandro Colomar Cc: "Michael T. Kerrisk" , Alexander Viro , Jan Kara , Askar Safin , "G. Branden Robinson" , linux-man@vger.kernel.org, linux-api@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, David Howells , Christian Brauner , Aleksa Sarai X-Developer-Signature: v=1; a=openpgp-sha256; l=2987; i=cyphar@cyphar.com; h=from:subject:message-id; bh=k0P5TLKIQcNvPpE/yRue2NqzcDOYWE1LQ0jR8gyJSr0=; b=owGbwMvMwCWmMf3Xpe0vXfIZT6slMWSc2SluknfoWMLrf3KtF/cqrlZzjWS+cyZFieVu9Yzcd S4Lftoad0xkYRDjYrAUU2TZ5ucZumn+4ivJn1aywcxhZQIZIi3SwMDAwMDCwJebmFdqpGOkZ6pt qGdoqGOkY8TAxSkAU50kwMiwT/b7pUCRTyKLj/KvaeT9Yfj4156O5gex/7IOLj89pc7vMyPD1NR djlzbpH67bdq6xIDHoVn0xhaRnd/+xH75ef3jLH9FRgA= X-Developer-Key: i=cyphar@cyphar.com; a=openpgp; fpr=C9C370B246B09F6DBCFC744C34401015D1D2D386 X-Rspamd-Queue-Id: 4cSbMG4lQ2z9sy4 This was not particularly well documented in mount(8) nor mount(2), and since this is a fairly notable aspect of the new mount API, we should probably add some words about it. 
Signed-off-by: Aleksa Sarai --- man/man2/fsconfig.2 | 12 ++++++++++++ man/man2/mount_setattr.2 | 40 ++++++++++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+) diff --git a/man/man2/fsconfig.2 b/man/man2/fsconfig.2 index 5a18e08c700ac93aa22c341b4134944ee3c38d0b..d827a7b96e08284fb025f94c3348a4acc4571b7d 100644 --- a/man/man2/fsconfig.2 +++ b/man/man2/fsconfig.2 @@ -579,6 +579,18 @@ .SS Generic filesystem parameters Linux Security Modules (LSMs) are also generic with respect to the underlying filesystem. See the documentation for the LSM you wish to configure for more details. +.SS Mount attributes and filesystem parameters +Some filesystem parameters +(traditionally associated with +.BR mount (8)-style +options) +have a sibling mount attribute +with superficially similar user-facing behaviour. +.P +For a description of the distinction between +mount attributes and filesystem parameters, +see the "Mount attributes and filesystem parameters" subsection of +.BR mount_setattr (2). .SH CAVEATS .SS Filesystem parameter types As a result of diff --git a/man/man2/mount_setattr.2 b/man/man2/mount_setattr.2 index b27db5b96665cfb0c387bf5b60776d45e0139956..f7d0b96fddf97698e36cab020f1d695783143025 100644 --- a/man/man2/mount_setattr.2 +++ b/man/man2/mount_setattr.2 @@ -790,6 +790,46 @@ .SS ID-mapped mounts .BR chown (2) system call changes the ownership globally and permanently. .\" +.SS Mount attributes and filesystem parameters +Some mount attributes +(traditionally associated with +.BR mount (8)-style +options) +have a sibling filesystem parameter +with superficially similar user-facing behaviour. +For example, the +.I -o ro +option to +.BR mount (8) +can refer to the +"read-only" filesystem parameter, +or the "read-only" mount attribute. +Both of these result in mount objects becoming read-only, +but they do have different behaviour.
+.P +The distinction between these two kinds of option is that +mount object attributes are applied per-mount-object +(allowing different mount objects +derived from a given filesystem instance +to have different attributes), +while filesystem instance parameters +("superblock flags" in kernel-developer parlance) +apply to all mount objects +derived from the same filesystem instance. +.P +When using +.BR mount (2), +the line between these two types of mount options is blurred. +However, with +.BR mount_setattr () +and +.BR fsconfig (2), +the distinction is made much clearer. +Mount attributes are configured with +.BR mount_setattr (), +while filesystem parameters can be configured using +.BR fsconfig (2). +.\" .SS Extensibility In order to allow for future extensibility, .BR mount_setattr () -- 2.51.0