From nobody Mon Dec 15 21:43:17 2025
From: Chen Linxuan via B4 Relay
Date: Wed, 07 May 2025 16:42:16 +0800
Subject: [PATCH v2 1/2] MAINTAINERS: update filter of FUSE documentation
Message-Id: <20250507-fuse-passthrough-doc-v2-1-ae7c0dd8bba6@uniontech.com>
References: <20250507-fuse-passthrough-doc-v2-0-ae7c0dd8bba6@uniontech.com>
In-Reply-To: <20250507-fuse-passthrough-doc-v2-0-ae7c0dd8bba6@uniontech.com>
To: Miklos Szeredi, Jonathan Corbet
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-doc@vger.kernel.org, Chen Linxuan, Amir Goldstein
Reply-To: chenlinxuan@uniontech.com

From: Chen Linxuan

There are some fuse-*.rst files in the Documentation directory. Let's make
get_maintainer.pl work on those files instead of only fuse.rst.

Reviewed-by: Amir Goldstein
Signed-off-by: Chen Linxuan
---
 MAINTAINERS | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/MAINTAINERS b/MAINTAINERS
index 5f8688630c014d2639ed9023ba5a256bc1c28e25..08cedaa87eb3f7047ca49d0e6f946dbd8e7ad793 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -9736,7 +9736,7 @@ L:	linux-fsdevel@vger.kernel.org
 S:	Maintained
 W:	https://github.com/libfuse/
 T:	git git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse.git
-F:	Documentation/filesystems/fuse.rst
+F:	Documentation/filesystems/fuse*
 F:	fs/fuse/
 F:	include/uapi/linux/fuse.h
 
-- 
2.43.0

From nobody Mon Dec 15 21:43:17 2025
From: Chen Linxuan via B4 Relay
Date: Wed, 07 May 2025 16:42:17 +0800
Subject: [PATCH v2 2/2] docs: filesystems: add fuse-passthrough.rst
Message-Id: <20250507-fuse-passthrough-doc-v2-2-ae7c0dd8bba6@uniontech.com>
References: <20250507-fuse-passthrough-doc-v2-0-ae7c0dd8bba6@uniontech.com>
In-Reply-To: <20250507-fuse-passthrough-doc-v2-0-ae7c0dd8bba6@uniontech.com>
To: Miklos Szeredi, Jonathan Corbet
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
 linux-doc@vger.kernel.org, Chen Linxuan, Amir Goldstein, Bernd Schubert
Reply-To: chenlinxuan@uniontech.com

From: Chen Linxuan

Add documentation about FUSE passthrough. It's mainly about why FUSE
passthrough needs CAP_SYS_ADMIN.

Some related discussions:

Link: https://lore.kernel.org/all/4b64a41c-6167-4c02-8bae-3021270ca519@fastmail.fm/T/#mc73e04df56b8830b1d7b06b5d9f22e594fba423e
Link: https://lore.kernel.org/linux-fsdevel/CAOQ4uxhAY1m7ubJ3p-A3rSufw_53WuDRMT1Zqe_OC0bP_Fb3Zw@mail.gmail.com/
Cc: Amir Goldstein
Cc: Bernd Schubert
Reviewed-by: Amir Goldstein
Signed-off-by: Chen Linxuan
---
 Documentation/filesystems/fuse-passthrough.rst | 133 +++++++++++++++++++++++++
 Documentation/filesystems/index.rst            |   1 +
 2 files changed, 134 insertions(+)

diff --git a/Documentation/filesystems/fuse-passthrough.rst b/Documentation/filesystems/fuse-passthrough.rst
new file mode 100644
index 0000000000000000000000000000000000000000..2b0e7c2da54acde4d48fd91ecece27256c4e04fd
--- /dev/null
+++ b/Documentation/filesystems/fuse-passthrough.rst
@@ -0,0 +1,133 @@
+.. SPDX-License-Identifier: GPL-2.0
+
+================
+FUSE Passthrough
+================
+
+Introduction
+============
+
+FUSE (Filesystem in Userspace) passthrough is a feature designed to improve the
+performance of FUSE filesystems for I/O operations. Typically, FUSE operations
+involve communication between the kernel and a userspace FUSE daemon, which can
+incur overhead. Passthrough allows certain operations on a FUSE file to bypass
+the userspace daemon and be executed directly by the kernel on an underlying
+"backing file".
+
+This is achieved by the FUSE daemon registering a file descriptor (pointing to
+the backing file on a lower filesystem) with the FUSE kernel module. The daemon
+then receives an identifier (``backing_id``) for this registered backing file.
+When a FUSE file is subsequently opened, the FUSE daemon can, in its response to
+the ``OPEN`` request, include this ``backing_id`` and set the
+``FOPEN_PASSTHROUGH`` flag. This establishes a direct link for specific
+operations.
+
+Currently, passthrough is supported for operations like ``read(2)``/``write(2)``
+(via ``read_iter``/``write_iter``), ``splice(2)``, and ``mmap(2)``.
+
+Enabling Passthrough
+====================
+
+To use FUSE passthrough:
+
+  1. The kernel must be built with ``CONFIG_FUSE_PASSTHROUGH``
+     enabled.
+  2. The FUSE daemon, during the ``FUSE_INIT`` handshake, must negotiate the
+     ``FUSE_PASSTHROUGH`` capability and specify its desired
+     ``max_stack_depth``.
+  3. The (privileged) FUSE daemon uses the ``FUSE_DEV_IOC_BACKING_OPEN`` ioctl
+     on its connection file descriptor (e.g., ``/dev/fuse``) to register a
+     backing file descriptor and obtain a ``backing_id``.
+  4. When handling an ``OPEN`` or ``CREATE`` request for a FUSE file, the daemon
+     replies with the ``FOPEN_PASSTHROUGH`` flag set in
+     ``fuse_open_out::open_flags`` and provides the corresponding ``backing_id``
+     in ``fuse_open_out::backing_id``.
+  5. The FUSE daemon should eventually call ``FUSE_DEV_IOC_BACKING_CLOSE`` with
+     the ``backing_id`` to release the kernel's reference to the backing file
+     when it's no longer needed for passthrough setups.
+
+Privilege Requirements
+======================
+
+Setting up passthrough functionality currently requires the FUSE daemon to
+possess the ``CAP_SYS_ADMIN`` capability. This requirement stems from several
+security and resource management considerations that are actively being
+discussed and worked on. The primary reasons for this restriction are detailed
+below.
+
+Resource Accounting and Visibility
+----------------------------------
+
+The core mechanism for passthrough involves the FUSE daemon opening a file
+descriptor to a backing file and registering it with the FUSE kernel module via
+the ``FUSE_DEV_IOC_BACKING_OPEN`` ioctl. This ioctl returns a ``backing_id``
+associated with a kernel-internal ``struct fuse_backing`` object, which holds a
+reference to the backing ``struct file``.
+
+A significant concern arises because the FUSE daemon can close its own file
+descriptor to the backing file after registration. The kernel, however, will
+still hold a reference to the ``struct file`` via the ``struct fuse_backing``
+object as long as it's associated with a ``backing_id`` (or subsequently, with
+an open FUSE file in passthrough mode).
+
+This behavior leads to two main issues for unprivileged FUSE daemons:
+
+  1. **Invisibility to lsof and other inspection tools**: Once the FUSE
+     daemon closes its file descriptor, the open backing file held by the kernel
+     becomes "hidden." Standard tools like ``lsof``, which typically inspect
+     process file descriptor tables, would not be able to identify that this
+     file is still open by the system on behalf of the FUSE filesystem. This
+     makes it difficult for system administrators to track resource usage or
+     debug issues related to open files (e.g., preventing unmounts).
+
+  2. **Bypassing RLIMIT_NOFILE**: The FUSE daemon process is subject to
+     resource limits, including the maximum number of open file descriptors
+     (``RLIMIT_NOFILE``). If an unprivileged daemon could register backing files
+     and then close its own FDs, it could potentially cause the kernel to hold
+     an unlimited number of open ``struct file`` references without these being
+     accounted against the daemon's ``RLIMIT_NOFILE``. This could lead to a
+     denial-of-service (DoS) by exhausting system-wide file resources.
+
+The ``CAP_SYS_ADMIN`` requirement acts as a safeguard against these issues,
+restricting this powerful capability to trusted processes.
+
+**NOTE**: ``io_uring`` solves a similar issue by exposing its "fixed files",
+which are visible via ``fdinfo`` and accounted under the registering user's
+``RLIMIT_NOFILE``.
+
+Filesystem Stacking and Shutdown Loops
+--------------------------------------
+
+Another concern relates to the potential for creating complex and problematic
+filesystem stacking scenarios if unprivileged users could set up passthrough.
+A FUSE passthrough filesystem might use a backing file that resides:
+
+  * On the *same* FUSE filesystem.
+  * On another filesystem (like OverlayFS) which itself might have an upper or
+    lower layer that is a FUSE filesystem.
+
+These configurations could create dependency loops, particularly during
+filesystem shutdown or unmount sequences, leading to deadlocks or system
+instability. This is conceptually similar to the risks associated with the
+``LOOP_SET_FD`` ioctl, which also requires ``CAP_SYS_ADMIN``.
+
+To mitigate this, FUSE passthrough already incorporates checks based on
+filesystem stacking depth (``sb->s_stack_depth`` and ``fc->max_stack_depth``).
+For example, during the ``FUSE_INIT`` handshake, the FUSE daemon can negotiate
+the ``max_stack_depth`` it supports. When a backing file is registered via
+``FUSE_DEV_IOC_BACKING_OPEN``, the kernel checks if the backing file's
+filesystem stack depth is within the allowed limit.
+
+The ``CAP_SYS_ADMIN`` requirement provides an additional layer of security,
+ensuring that only privileged users can create these potentially complex
+stacking arrangements.
+
+General Security Posture
+------------------------
+
+As a general principle for new kernel features that allow userspace to instruct
+the kernel to perform direct operations on its behalf based on user-provided
+file descriptors, starting with a higher privilege requirement (like
+``CAP_SYS_ADMIN``) is a conservative and common security practice. This allows
+the feature to be used and tested while further security implications are
+evaluated and addressed.
diff --git a/Documentation/filesystems/index.rst b/Documentation/filesystems/index.rst
index a9cf8e950b15ad68a021d5f214b07f58d752f4e3..2913f4f2e00ccc466563aba5692e2f95699cb674 100644
--- a/Documentation/filesystems/index.rst
+++ b/Documentation/filesystems/index.rst
@@ -99,6 +99,7 @@ Documentation for filesystem implementations.
    fuse
    fuse-io
    fuse-io-uring
+   fuse-passthrough
    inotify
    isofs
    nilfs2
-- 
2.43.0
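
For readers of the series, the registration and release steps listed under "Enabling Passthrough" in the new document can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not code taken from the patch or from any particular daemon: the helper names `sketch_backing_open()` and `sketch_backing_close()` are hypothetical, and the fallback definitions are assumed to mirror include/uapi/linux/fuse.h for toolchains whose headers predate passthrough support. It also assumes the ioctl hands back the ``backing_id`` as its return value and that the caller holds ``CAP_SYS_ADMIN``.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fuse.h>

/*
 * Fallback for UAPI headers without passthrough support. These
 * definitions are an assumption meant to mirror linux/fuse.h.
 */
#ifndef FUSE_DEV_IOC_BACKING_OPEN
#ifndef FUSE_DEV_IOC_MAGIC
#define FUSE_DEV_IOC_MAGIC 229
#endif
struct fuse_backing_map {
	int32_t  fd;      /* backing file descriptor to register */
	uint32_t flags;   /* left at zero in this sketch */
	uint64_t padding; /* left at zero in this sketch */
};
#define FUSE_DEV_IOC_BACKING_OPEN \
	_IOW(FUSE_DEV_IOC_MAGIC, 1, struct fuse_backing_map)
#define FUSE_DEV_IOC_BACKING_CLOSE \
	_IOW(FUSE_DEV_IOC_MAGIC, 2, uint32_t)
#endif

/*
 * Hypothetical helper: register backing_fd on the FUSE connection.
 * Returns the backing_id reported by the kernel, or -errno on failure.
 */
static int sketch_backing_open(int fuse_dev_fd, int backing_fd)
{
	struct fuse_backing_map map = { .fd = backing_fd };
	int id = ioctl(fuse_dev_fd, FUSE_DEV_IOC_BACKING_OPEN, &map);

	return id < 0 ? -errno : id;
}

/*
 * Hypothetical helper: drop the kernel's reference for backing_id.
 * Returns 0 on success, or -errno on failure.
 */
static int sketch_backing_close(int fuse_dev_fd, uint32_t backing_id)
{
	if (ioctl(fuse_dev_fd, FUSE_DEV_IOC_BACKING_CLOSE, &backing_id) < 0)
		return -errno;
	return 0;
}
```

A daemon would call `sketch_backing_open()` with its ``/dev/fuse`` descriptor and an open backing file, place the returned id in ``fuse_open_out::backing_id`` together with ``FOPEN_PASSTHROUGH`` when replying to ``OPEN``, and later pass the same id to `sketch_backing_close()`. After a successful registration the daemon may close its own descriptor, which is exactly the situation the "Resource Accounting and Visibility" section describes. With an invalid device descriptor both helpers return ``-EBADF``.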