From nobody Sun Oct 5 20:02:14 2025
Date: Tue, 29 Jul 2025 17:58:06 -0700
In-Reply-To: <20250730005818.2793577-1-isaacmanjarres@google.com>
References: <20250730005818.2793577-1-isaacmanjarres@google.com>
Message-ID: <20250730005818.2793577-2-isaacmanjarres@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: [PATCH 5.4.y 1/3] mm: drop the assumption that VM_SHARED always implies writable
From: "Isaac J. Manjarres"
To: lorenzo.stoakes@oracle.com, gregkh@linuxfoundation.org, Muchun Song,
 Oscar Salvador, David Hildenbrand, Alexander Viro, Christian Brauner,
 Jan Kara, Andrew Morton, "Liam R. Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Kees Cook, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 "Matthew Wilcox (Oracle)", Jann Horn, Pedro Falcato, Hugh Dickins,
 Baolin Wang
Cc: aliceryhl@google.com, stable@vger.kernel.org, "Isaac J. Manjarres",
 kernel-team@android.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Lorenzo Stoakes, Andy Lutomirski,
 Mike Kravetz

From: Lorenzo Stoakes

[ Upstream commit e8e17ee90eaf650c855adb0a3e5e965fd6692ff1 ]

Patch series "permit write-sealed memfd read-only shared mappings", v4.

The man page for fcntl() describing memfd file seals states the following
about F_SEAL_WRITE:-

    Furthermore, trying to create new shared, writable memory-mappings via
    mmap(2) will also fail with EPERM.

With emphasis on 'writable'.  It turns out in fact that currently the
kernel simply disallows all new shared memory mappings for a memfd with
F_SEAL_WRITE applied, rendering this documentation inaccurate.

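For readers who want to observe the behaviour described above, here is a
minimal userspace sketch. It is an illustration only and not part of the
series: the struct seal_probe type and the function names are hypothetical,
and the fallback constants are copied from the Linux UAPI headers for older
libcs. It seals a memfd with F_SEAL_WRITE and records what mmap(2) returns
for a read-only and for a writable MAP_SHARED mapping.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Fallbacks for older libcs; values taken from the Linux UAPI headers. */
#ifndef MFD_ALLOW_SEALING
#define MFD_ALLOW_SEALING 0x0002U
#endif
#ifndef F_ADD_SEALS
#define F_ADD_SEALS 1033
#endif
#ifndef F_SEAL_WRITE
#define F_SEAL_WRITE 0x0008
#endif

/* Hypothetical result record: 0 means mmap() succeeded, else the errno. */
struct seal_probe {
	int ro_shared_errno;	/* MAP_SHARED, PROT_READ */
	int rw_shared_errno;	/* MAP_SHARED, PROT_READ | PROT_WRITE */
};

/* Try one shared mapping of fd and report 0 or the errno from mmap(). */
static int try_shared_map(int fd, size_t len, int prot)
{
	void *p = mmap(NULL, len, prot, MAP_SHARED, fd, 0);

	if (p == MAP_FAILED)
		return errno;
	munmap(p, len);
	return 0;
}

/* Returns 0 on success, -1 if the memfd could not be created or sealed. */
int probe_write_sealed_memfd(struct seal_probe *out)
{
	const size_t len = 4096;
	int fd;

	assert(out != NULL);
	fd = (int)syscall(SYS_memfd_create, "seal-probe", MFD_ALLOW_SEALING);
	if (fd < 0)
		return -1;
	if (ftruncate(fd, (off_t)len) < 0 ||
	    fcntl(fd, F_ADD_SEALS, F_SEAL_WRITE) < 0) {
		close(fd);
		return -1;
	}
	out->ro_shared_errno = try_shared_map(fd, len, PROT_READ);
	out->rw_shared_errno = try_shared_map(fd, len, PROT_READ | PROT_WRITE);
	close(fd);
	return 0;
}
```

The writable mapping should be refused with EPERM in either case. On a kernel
without this series the read-only shared mapping is expected to fail with
EPERM as well, while with the series applied it should succeed, so
ro_shared_errno is the field to look at when checking a backport.
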
This matters because users are therefore unable to obtain a shared mapping
to a memfd after write sealing altogether, which limits their usefulness.
This was reported in the discussion thread [1] originating from a bug
report [2].

This is a product of both using the struct address_space->i_mmap_writable
atomic counter to determine whether writing may be permitted, and the
kernel adjusting this counter when any VM_SHARED mapping is performed and
more generally implicitly assuming VM_SHARED implies writable.

It seems sensible that we should only update this mapping if VM_MAYWRITE is
specified, i.e. whether it is possible that this mapping could at any point
be written to.

If we do so then all we need to do to permit write seals to function as
documented is to clear VM_MAYWRITE when mapping read-only.  It turns out
this functionality already exists for F_SEAL_FUTURE_WRITE - we can
therefore simply adapt this logic to do the same for F_SEAL_WRITE.

We then hit a chicken and egg situation in mmap_region() where the check
for VM_MAYWRITE occurs before we are able to clear this flag.  To work
around this, perform this check after we invoke call_mmap(), with careful
consideration of error paths.

Thanks to Andy Lutomirski for the suggestion!

[1]:https://lore.kernel.org/all/20230324133646.16101dfa666f253c4715d965@linux-foundation.org/
[2]:https://bugzilla.kernel.org/show_bug.cgi?id=217238

This patch (of 3):

There is a general assumption that VMAs with the VM_SHARED flag set are
writable.  If the VM_MAYWRITE flag is not set, then this is simply not the
case.

Update those checks which affect the struct address_space->i_mmap_writable
field to explicitly test for this by introducing
[vma_]is_shared_maywrite() helper functions.

This remains entirely conservative, as the lack of VM_MAYWRITE guarantees
that the VMA cannot be written to.

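To make the difference between the old and new tests concrete, the following
is a small userspace sketch. It is an illustration under stated assumptions:
the flag values are redefined locally to mirror include/linux/mm.h,
is_shared_maywrite() copies the shape of the helper added by this patch, and
old_counts_as_writable() and check_examples() are hypothetical names used
only here.

```c
#include <assert.h>
#include <stdbool.h>

/* Redefined locally so the sketch builds outside the kernel tree. */
typedef unsigned long vm_flags_t;
#define VM_WRITE	0x00000002UL
#define VM_SHARED	0x00000008UL
#define VM_MAYWRITE	0x00000020UL

/* Same shape as the helper the patch adds: both bits must be set. */
bool is_shared_maywrite(vm_flags_t vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) ==
		(VM_SHARED | VM_MAYWRITE);
}

/* The old test: any shared mapping was counted as potentially writable. */
bool old_counts_as_writable(vm_flags_t vm_flags)
{
	return (vm_flags & VM_SHARED) != 0;
}

/* The two tests disagree only for shared mappings that lack VM_MAYWRITE. */
void check_examples(void)
{
	vm_flags_t shared_no_maywrite = VM_SHARED;
	vm_flags_t shared_ro = VM_SHARED | VM_MAYWRITE;
	vm_flags_t shared_rw = VM_SHARED | VM_MAYWRITE | VM_WRITE;
	vm_flags_t private_rw = VM_MAYWRITE | VM_WRITE;

	/* Can never become writable, yet the old test counted it. */
	assert(old_counts_as_writable(shared_no_maywrite));
	assert(!is_shared_maywrite(shared_no_maywrite));

	/* Currently read-only but mprotect() could make it writable. */
	assert(old_counts_as_writable(shared_ro));
	assert(is_shared_maywrite(shared_ro));

	assert(old_counts_as_writable(shared_rw));
	assert(is_shared_maywrite(shared_rw));

	/* Private mappings are never counted by either test. */
	assert(!old_counts_as_writable(private_rw));
	assert(!is_shared_maywrite(private_rw));
}
```

Only the first case changes, which is why the patch description calls the
change conservative: a mapping without VM_MAYWRITE cannot be made writable
later, so leaving it out of i_mmap_writable loses nothing.
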
Link: https://lkml.kernel.org/r/cover.1697116581.git.lstoakes@gmail.com
Link: https://lkml.kernel.org/r/d978aefefa83ec42d18dfa964ad180dbcde34795.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes
Suggested-by: Andy Lutomirski
Reviewed-by: Jan Kara
Cc: Alexander Viro
Cc: Christian Brauner
Cc: Hugh Dickins
Cc: Matthew Wilcox (Oracle)
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
Cc: stable@vger.kernel.org
Signed-off-by: Isaac J. Manjarres
---
 include/linux/fs.h |  4 ++--
 include/linux/mm.h | 11 +++++++++++
 kernel/fork.c      |  2 +-
 mm/filemap.c       |  2 +-
 mm/madvise.c       |  2 +-
 mm/mmap.c          | 10 +++++-----
 6 files changed, 21 insertions(+), 10 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index d3648a55590c..c5985d72d60e 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -430,7 +430,7 @@ int pagecache_write_end(struct file *, struct address_space *mapping,
  * @host: Owner, either the inode or the block_device.
  * @i_pages: Cached pages.
  * @gfp_mask: Memory allocation flags to use for allocating pages.
- * @i_mmap_writable: Number of VM_SHARED mappings.
+ * @i_mmap_writable: Number of VM_SHARED, VM_MAYWRITE mappings.
  * @nr_thps: Number of THPs in the pagecache (non-shmem only).
  * @i_mmap: Tree of private and shared mappings.
  * @i_mmap_rwsem: Protects @i_mmap and @i_mmap_writable.
@@ -553,7 +553,7 @@ static inline int mapping_mapped(struct address_space *mapping)
 
 /*
  * Might pages of this file have been modified in userspace?
- * Note that i_mmap_writable counts all VM_SHARED vmas: do_mmap_pgoff
+ * Note that i_mmap_writable counts all VM_SHARED, VM_MAYWRITE vmas: do_mmap_pgoff
  * marks vma as VM_SHARED if it is shared, and the file was opened for
  * writing i.e. vma may be mprotected writable even if now readonly.
  *
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 4d3657b630db..47d56c96447a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -549,6 +549,17 @@ static inline bool vma_is_anonymous(struct vm_area_struct *vma)
 	return !vma->vm_ops;
 }
 
+static inline bool is_shared_maywrite(vm_flags_t vm_flags)
+{
+	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) ==
+		(VM_SHARED | VM_MAYWRITE);
+}
+
+static inline bool vma_is_shared_maywrite(struct vm_area_struct *vma)
+{
+	return is_shared_maywrite(vma->vm_flags);
+}
+
 #ifdef CONFIG_SHMEM
 /*
  * The vma_is_shmem is not inline because it is used only by slow
diff --git a/kernel/fork.c b/kernel/fork.c
index e71f96bff1dc..ad3e6e91d828 100644
--- a/kernel/fork.c
+++ b/kernel/fork.c
@@ -566,7 +566,7 @@ static __latent_entropy int dup_mmap(struct mm_struct *mm,
 			if (tmp->vm_flags & VM_DENYWRITE)
 				atomic_dec(&inode->i_writecount);
 			i_mmap_lock_write(mapping);
-			if (tmp->vm_flags & VM_SHARED)
+			if (vma_is_shared_maywrite(tmp))
 				atomic_inc(&mapping->i_mmap_writable);
 			flush_dcache_mmap_lock(mapping);
 			/* insert tmp into the share list, just after mpnt */
diff --git a/mm/filemap.c b/mm/filemap.c
index f1ed0400c37c..af3efb23262b 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2876,7 +2876,7 @@ int generic_file_mmap(struct file * file, struct vm_area_struct * vma)
  */
 int generic_file_readonly_mmap(struct file *file, struct vm_area_struct *vma)
 {
-	if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_MAYWRITE))
+	if (vma_is_shared_maywrite(vma))
 		return -EINVAL;
 	return generic_file_mmap(file, vma);
 }
diff --git a/mm/madvise.c b/mm/madvise.c
index ac8d68c488b5..3f5331c96ad5 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -839,7 +839,7 @@ static long madvise_remove(struct vm_area_struct *vma,
 			return -EINVAL;
 	}
 
-	if ((vma->vm_flags & (VM_SHARED|VM_WRITE)) != (VM_SHARED|VM_WRITE))
+	if (!vma_is_shared_maywrite(vma))
 		return -EACCES;
 
 	offset = (loff_t)(start - vma->vm_start)
diff --git a/mm/mmap.c b/mm/mmap.c
index eeebbb20accf..cb712ae731cd 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -141,7 +141,7 @@ static void __remove_shared_vm_struct(struct vm_area_struct *vma,
 {
 	if (vma->vm_flags & VM_DENYWRITE)
 		atomic_inc(&file_inode(file)->i_writecount);
-	if (vma->vm_flags & VM_SHARED)
+	if (vma_is_shared_maywrite(vma))
 		mapping_unmap_writable(mapping);
 
 	flush_dcache_mmap_lock(mapping);
@@ -619,7 +619,7 @@ static void __vma_link_file(struct vm_area_struct *vma)
 
 		if (vma->vm_flags & VM_DENYWRITE)
 			atomic_dec(&file_inode(file)->i_writecount);
-		if (vma->vm_flags & VM_SHARED)
+		if (vma_is_shared_maywrite(vma))
 			atomic_inc(&mapping->i_mmap_writable);
 
 		flush_dcache_mmap_lock(mapping);
@@ -1785,7 +1785,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 			if (error)
 				goto free_vma;
 		}
-		if (vm_flags & VM_SHARED) {
+		if (is_shared_maywrite(vm_flags)) {
 			error = mapping_map_writable(file->f_mapping);
 			if (error)
 				goto allow_write_and_free_vma;
@@ -1823,7 +1823,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma_link(mm, vma, prev, rb_link, rb_parent);
 	/* Once vma denies write, undo our temporary denial count */
 	if (file) {
-		if (vm_flags & VM_SHARED)
+		if (is_shared_maywrite(vm_flags))
 			mapping_unmap_writable(file->f_mapping);
 		if (vm_flags & VM_DENYWRITE)
 			allow_write_access(file);
@@ -1864,7 +1864,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 
 	/* Undo any partial mapping done by a device driver. */
 	unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
-	if (vm_flags & VM_SHARED)
+	if (is_shared_maywrite(vm_flags))
 		mapping_unmap_writable(file->f_mapping);
 allow_write_and_free_vma:
 	if (vm_flags & VM_DENYWRITE)
-- 
2.50.1.552.g942d659e1b-goog

From nobody Sun Oct 5 20:02:14 2025
Date: Tue, 29 Jul 2025 17:58:07 -0700
In-Reply-To: <20250730005818.2793577-1-isaacmanjarres@google.com>
References: <20250730005818.2793577-1-isaacmanjarres@google.com>
Message-ID: <20250730005818.2793577-3-isaacmanjarres@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: [PATCH 5.4.y 2/3] mm: update memfd seal write check to include F_SEAL_WRITE
From: "Isaac J. Manjarres"
To: lorenzo.stoakes@oracle.com, gregkh@linuxfoundation.org, Muchun Song,
 Oscar Salvador, David Hildenbrand, Alexander Viro, Christian Brauner,
 Jan Kara, Andrew Morton, "Liam R. Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Kees Cook, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 "Matthew Wilcox (Oracle)", Jann Horn, Pedro Falcato, Hugh Dickins,
 Baolin Wang
Cc: aliceryhl@google.com, stable@vger.kernel.org, "Isaac J. Manjarres",
 kernel-team@android.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Lorenzo Stoakes, Andy Lutomirski,
 Mike Kravetz

From: Lorenzo Stoakes

[ Upstream commit 28464bbb2ddc199433383994bcb9600c8034afa1 ]

The seal_check_future_write() function is called by shmem_mmap() or
hugetlbfs_file_mmap() to disallow any future writable mappings of a memfd
sealed this way.

The F_SEAL_WRITE flag is not checked here, as that is handled via the
mapping->i_mmap_writable mechanism and so any attempt at a mapping would
fail before this could be run.

However we intend to change this, meaning this check can be performed for
F_SEAL_WRITE mappings also.

The logic here is equally applicable to both flags, so update this function
to accommodate both and rename it accordingly.

Link: https://lkml.kernel.org/r/913628168ce6cce77df7d13a63970bae06a526e0.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes
Reviewed-by: Jan Kara
Cc: Alexander Viro
Cc: Andy Lutomirski
Cc: Christian Brauner
Cc: Hugh Dickins
Cc: Matthew Wilcox (Oracle)
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
Cc: stable@vger.kernel.org
Signed-off-by: Isaac J. Manjarres
---
 fs/hugetlbfs/inode.c |  2 +-
 include/linux/mm.h   | 15 ++++++++-------
 mm/shmem.c           |  2 +-
 3 files changed, 10 insertions(+), 9 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 47b292f9b4f8..c18a47a86e8b 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -152,7 +152,7 @@ static int hugetlbfs_file_mmap(struct file *file, struct vm_area_struct *vma)
 	vma->vm_flags |= VM_HUGETLB | VM_DONTEXPAND;
 	vma->vm_ops = &hugetlb_vm_ops;
 
-	ret = seal_check_future_write(info->seals, vma);
+	ret = seal_check_write(info->seals, vma);
 	if (ret)
 		return ret;
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 47d56c96447a..57cba6e4fdcd 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2946,25 +2946,26 @@ static inline int pages_identical(struct page *page1, struct page *page2)
 }
 
 /**
- * seal_check_future_write - Check for F_SEAL_FUTURE_WRITE flag and handle it
+ * seal_check_write - Check for F_SEAL_WRITE or F_SEAL_FUTURE_WRITE flags and
+ *		      handle them.
  * @seals: the seals to check
  * @vma: the vma to operate on
  *
- * Check whether F_SEAL_FUTURE_WRITE is set; if so, do proper check/handling on
- * the vma flags.  Return 0 if check pass, or <0 for errors.
+ * Check whether F_SEAL_WRITE or F_SEAL_FUTURE_WRITE are set; if so, do proper
+ * check/handling on the vma flags.  Return 0 if check pass, or <0 for errors.
  */
-static inline int seal_check_future_write(int seals, struct vm_area_struct *vma)
+static inline int seal_check_write(int seals, struct vm_area_struct *vma)
 {
-	if (seals & F_SEAL_FUTURE_WRITE) {
+	if (seals & (F_SEAL_WRITE | F_SEAL_FUTURE_WRITE)) {
 		/*
 		 * New PROT_WRITE and MAP_SHARED mmaps are not allowed when
-		 * "future write" seal active.
+		 * write seals are active.
 		 */
 		if ((vma->vm_flags & VM_SHARED) && (vma->vm_flags & VM_WRITE))
 			return -EPERM;
 
 		/*
-		 * Since an F_SEAL_FUTURE_WRITE sealed memfd can be mapped as
+		 * Since an F_SEAL_[FUTURE_]WRITE sealed memfd can be mapped as
 		 * MAP_SHARED and read-only, take care to not allow mprotect to
 		 * revert protections on such mappings. Do this only for shared
 		 * mappings. For private mappings, don't need to mask
diff --git a/mm/shmem.c b/mm/shmem.c
index 264229680ad7..8475d56f5977 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2215,7 +2215,7 @@ static int shmem_mmap(struct file *file, struct vm_area_struct *vma)
 	struct shmem_inode_info *info = SHMEM_I(file_inode(file));
 	int ret;
 
-	ret = seal_check_future_write(info->seals, vma);
+	ret = seal_check_write(info->seals, vma);
 	if (ret)
 		return ret;
 
-- 
2.50.1.552.g942d659e1b-goog

From nobody Sun Oct 5 20:02:14 2025
Date: Tue, 29 Jul 2025 17:58:08 -0700
In-Reply-To: <20250730005818.2793577-1-isaacmanjarres@google.com>
References: <20250730005818.2793577-1-isaacmanjarres@google.com>
Message-ID: <20250730005818.2793577-4-isaacmanjarres@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org
Subject: [PATCH 5.4.y 3/3] mm: perform the mapping_map_writable() check after call_mmap()
From: "Isaac J. Manjarres"
To: lorenzo.stoakes@oracle.com, gregkh@linuxfoundation.org, Muchun Song,
 Oscar Salvador, David Hildenbrand, Alexander Viro, Christian Brauner,
 Jan Kara, Andrew Morton, "Liam R. Howlett", Vlastimil Babka,
 Mike Rapoport, Suren Baghdasaryan, Michal Hocko, Kees Cook, Ingo Molnar,
 Peter Zijlstra, Juri Lelli, Vincent Guittot, Dietmar Eggemann,
 Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
 "Matthew Wilcox (Oracle)", Jann Horn, Pedro Falcato, Hugh Dickins,
 Baolin Wang
Cc: aliceryhl@google.com, stable@vger.kernel.org, "Isaac J. Manjarres",
 kernel-team@android.com, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 linux-fsdevel@vger.kernel.org, Lorenzo Stoakes, Andy Lutomirski,
 Mike Kravetz

From: Lorenzo Stoakes

[ Upstream commit 158978945f3173b8c1a88f8c5684a629736a57ac ]

In order for an F_SEAL_WRITE sealed memfd mapping to have an opportunity to
clear VM_MAYWRITE, we must be able to invoke the appropriate
vm_ops->mmap() handler to do so.  We would otherwise fail the
mapping_map_writable() check before we had the opportunity to avoid it.

This patch moves this check after the call_mmap() invocation.  Only memfd
actively denies write access causing a potential failure here (in
memfd_add_seals()), so there should be no impact on non-memfd cases.

This patch makes the userland-visible change that MAP_SHARED, PROT_READ
mappings of an F_SEAL_WRITE sealed memfd mapping will now succeed.

There is a delicate situation with cleanup paths assuming that a writable
mapping must have occurred in circumstances where it may now not have.  In
order to ensure we do not accidentally mark a writable file unwritable, we
explicitly track whether we have a writable mapping and unmap only if we
do.

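The effect of the reordering can be sketched with a toy userspace model. This
is an illustration only, not kernel code: every toy_* name is hypothetical,
the flag values are redefined locally, toy_map_writable() only imitates
mapping_map_writable(), toy_mmap_handler() stands in for the seal check done
from the ->mmap() handler, and vma_link() is not modelled at all.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define VM_WRITE	0x02UL
#define VM_SHARED	0x08UL
#define VM_MAYWRITE	0x20UL

/* Toy stand-in for struct address_space: <0 means writable maps are denied. */
struct toy_mapping {
	int i_mmap_writable;
	bool write_sealed;
};

static bool is_shared_maywrite(unsigned long vm_flags)
{
	return (vm_flags & (VM_SHARED | VM_MAYWRITE)) ==
		(VM_SHARED | VM_MAYWRITE);
}

/* Imitates mapping_map_writable(): refuse while the counter is negative. */
static int toy_map_writable(struct toy_mapping *m)
{
	if (m->i_mmap_writable < 0)
		return -EPERM;
	m->i_mmap_writable++;
	return 0;
}

/* Imitates the seal check: reject shared writable, else clear VM_MAYWRITE. */
static int toy_mmap_handler(const struct toy_mapping *m, unsigned long *vm_flags)
{
	if (!m->write_sealed)
		return 0;
	if ((*vm_flags & VM_SHARED) && (*vm_flags & VM_WRITE))
		return -EPERM;
	if (*vm_flags & VM_SHARED)
		*vm_flags &= ~VM_MAYWRITE;
	return 0;
}

/*
 * check_after_handler == false models the old ordering, true the new one.
 * Returns 0 or a negative errno and leaves the counter as it found it.
 */
int toy_mmap_region(struct toy_mapping *m, unsigned long vm_flags,
		    bool check_after_handler)
{
	bool writable_file_mapping = false;
	int error;

	assert(m != NULL);
	if (!check_after_handler && is_shared_maywrite(vm_flags)) {
		error = toy_map_writable(m);
		if (error)
			return error;
		writable_file_mapping = true;
	}
	error = toy_mmap_handler(m, &vm_flags);
	if (!error && check_after_handler && is_shared_maywrite(vm_flags)) {
		error = toy_map_writable(m);
		if (!error)
			writable_file_mapping = true;
	}
	/* Drop the temporary count only if it was actually taken. */
	if (writable_file_mapping)
		m->i_mmap_writable--;
	return error;
}
```

With a sealed mapping (counter at -1), a read-only shared request fails in
the old ordering because the counter is consulted before the handler has
cleared VM_MAYWRITE, and succeeds in the new one. The writable_file_mapping
flag is what keeps the cleanup from decrementing a counter that was never
incremented.
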
[lstoakes@gmail.com: do not set writable_file_mapping in inappropriate case]
  Link: https://lkml.kernel.org/r/c9eb4cc6-7db4-4c2b-838d-43a0b319a4f0@lucifer.local
Link: https://bugzilla.kernel.org/show_bug.cgi?id=217238
Link: https://lkml.kernel.org/r/55e413d20678a1bb4c7cce889062bbb07b0df892.1697116581.git.lstoakes@gmail.com
Signed-off-by: Lorenzo Stoakes
Reviewed-by: Jan Kara
Cc: Alexander Viro
Cc: Andy Lutomirski
Cc: Christian Brauner
Cc: Hugh Dickins
Cc: Matthew Wilcox (Oracle)
Cc: Mike Kravetz
Cc: Muchun Song
Signed-off-by: Andrew Morton
Cc: stable@vger.kernel.org
[isaacmanjarres: added error handling to cleanup the work done by the
mmap() callback and removed unused label.]
Signed-off-by: Isaac J. Manjarres
---
 mm/mmap.c | 22 ++++++++++++++--------
 1 file changed, 14 insertions(+), 8 deletions(-)

diff --git a/mm/mmap.c b/mm/mmap.c
index cb712ae731cd..e591a82a26a8 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1718,6 +1718,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 {
 	struct mm_struct *mm = current->mm;
 	struct vm_area_struct *vma, *prev;
+	bool writable_file_mapping = false;
 	int error;
 	struct rb_node **rb_link, *rb_parent;
 	unsigned long charged = 0;
@@ -1785,11 +1786,6 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 			if (error)
 				goto free_vma;
 		}
-		if (is_shared_maywrite(vm_flags)) {
-			error = mapping_map_writable(file->f_mapping);
-			if (error)
-				goto allow_write_and_free_vma;
-		}
 
 		/* ->mmap() can change vma->vm_file, but must guarantee that
 		 * vma_link() below can deny write-access if VM_DENYWRITE is set
@@ -1801,6 +1797,14 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 		if (error)
 			goto unmap_and_free_vma;
 
+		if (vma_is_shared_maywrite(vma)) {
+			error = mapping_map_writable(file->f_mapping);
+			if (error)
+				goto close_and_free_vma;
+
+			writable_file_mapping = true;
+		}
+
 		/* Can addr have changed??
 		 *
 		 * Answer: Yes, several device drivers can do it in their
@@ -1823,7 +1827,7 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 	vma_link(mm, vma, prev, rb_link, rb_parent);
 	/* Once vma denies write, undo our temporary denial count */
 	if (file) {
-		if (is_shared_maywrite(vm_flags))
+		if (writable_file_mapping)
 			mapping_unmap_writable(file->f_mapping);
 		if (vm_flags & VM_DENYWRITE)
 			allow_write_access(file);
@@ -1858,15 +1862,17 @@ unsigned long mmap_region(struct file *file, unsigned long addr,
 
 	return addr;
 
+close_and_free_vma:
+	if (vma->vm_ops && vma->vm_ops->close)
+		vma->vm_ops->close(vma);
 unmap_and_free_vma:
 	vma->vm_file = NULL;
 	fput(file);
 
 	/* Undo any partial mapping done by a device driver. */
 	unmap_region(mm, vma, prev, vma->vm_start, vma->vm_end);
-	if (is_shared_maywrite(vm_flags))
+	if (writable_file_mapping)
 		mapping_unmap_writable(file->f_mapping);
-allow_write_and_free_vma:
 	if (vm_flags & VM_DENYWRITE)
 		allow_write_access(file);
 free_vma:
-- 
2.50.1.552.g942d659e1b-goog