From: "Lorenzo Stoakes (Oracle)"
To: Andrew Morton
Cc: Jonathan Corbet, Clemens Ladisch, Arnd Bergmann, Greg Kroah-Hartman,
    "K. Y. Srinivasan", Haiyang Zhang, Wei Liu, Dexuan Cui, Long Li,
    Alexander Shishkin, Maxime Coquelin, Alexandre Torgue, Miquel Raynal,
    Richard Weinberger, Vignesh Raghavendra, Bodo Stroesser,
    "Martin K. Petersen", David Howells, Marc Dionne, Alexander Viro,
    Christian Brauner, Jan Kara, David Hildenbrand, "Liam R. Howlett",
    Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
    Jann Horn, Pedro Falcato, linux-kernel@vger.kernel.org,
    linux-doc@vger.kernel.org, linux-hyperv@vger.kernel.org,
    linux-stm32@st-md-mailman.stormreply.com,
    linux-arm-kernel@lists.infradead.org, linux-mtd@lists.infradead.org,
    linux-staging@lists.linux.dev, linux-scsi@vger.kernel.org,
    target-devel@vger.kernel.org, linux-afs@lists.infradead.org,
    linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, Ryan Roberts
Subject: [PATCH v3 12/16] mm: allow handling of stacked mmap_prepare hooks in more drivers
Date: Thu, 19 Mar 2026 18:23:36 +0000
Message-ID: <05b4d97d6248777a827f17445760b460cd293cb4.1773944114.git.ljs@kernel.org>
X-Mailer: git-send-email 2.53.0

While the conversion of mmap hooks to mmap_prepare is underway, we will
encounter situations where mmap hooks need to invoke nested mmap_prepare
hooks.

The nesting of mmap hooks is termed 'stacking'.
In order to flexibly facilitate the conversion of custom mmap hooks in
drivers which stack, we must split up the existing __compat_vma_mmap()
function into two separate functions:

* compat_set_desc_from_vma() - This allows the setting of a vm_area_desc
  object's fields to the relevant fields of a VMA.

* __compat_vma_mmap() - Once an mmap_prepare hook has been executed upon a
  vm_area_desc object, this function performs any mmap actions specified by
  the mmap_prepare hook and then invokes its vm_ops->mapped() hook if any
  were specified.

In ordinary cases, where a file's f_op->mmap_prepare() hook simply needs to
be invoked in a stacked mmap() hook, compat_vma_mmap() can be used.

However some drivers define their own nested hooks, which are invoked in
turn by another hook.

A concrete example is vmbus_channel->mmap_ring_buffer(), which is invoked
in turn by bin_attribute->mmap():

vmbus_channel->mmap_ring_buffer() has a signature of:

	int (*mmap_ring_buffer)(struct vmbus_channel *channel,
				struct vm_area_struct *vma);

And bin_attribute->mmap() has a signature of:

	int (*mmap)(struct file *, struct kobject *,
		    const struct bin_attribute *attr,
		    struct vm_area_struct *vma);

And so compat_vma_mmap() cannot be used here for incremental conversion of
hooks from mmap() to mmap_prepare().

There are many such instances like this, where conversion to mmap_prepare
would otherwise cascade to a huge change set due to nesting of this kind.

The changes in this patch mean we could now instead convert
vmbus_channel->mmap_ring_buffer() to
vmbus_channel->mmap_prepare_ring_buffer(), and implement something like:

	struct vm_area_desc desc;
	int err;

	compat_set_desc_from_vma(&desc, file, vma);
	err = channel->mmap_prepare_ring_buffer(channel, &desc);
	if (err)
		return err;

	return __compat_vma_mmap(&desc, vma);

Allowing us to incrementally update this logic, and other logic like it.
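
For readers who want to poke at the pattern outside the kernel, here is a
rough userspace sketch of the three steps described above: build a
descriptor from the VMA, run the nested prepare hook on it, then apply the
result back to the VMA. Everything in it is a hypothetical, heavily
simplified stand-in: struct vma, struct desc, struct channel,
set_desc_from_vma(), apply_desc_to_vma(), ring_prepare() and outer_mmap()
are illustrative names only, not the kernel definitions, and the sketch
leaves out mmap actions and the vm_ops->mapped() hook entirely.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; NOT the kernel structures. */
struct vma {
	unsigned long start, end, pgoff, flags;
};

struct desc {
	unsigned long start, end, pgoff, flags;
};

struct channel {
	/* Nested hook, assumed already converted to take a descriptor. */
	int (*mmap_prepare_ring_buffer)(struct channel *ch, struct desc *d);
};

/* Step 1: zero the descriptor, then copy the relevant VMA fields. */
static void set_desc_from_vma(struct desc *d, const struct vma *v)
{
	memset(d, 0, sizeof(*d));
	d->start = v->start;
	d->end = v->end;
	d->pgoff = v->pgoff;
	d->flags = v->flags;
}

/* Step 3: write back whatever the prepare hook changed. */
static int apply_desc_to_vma(const struct desc *d, struct vma *v)
{
	v->pgoff = d->pgoff;
	v->flags = d->flags;
	return 0;
}

/* Example nested hook: rejects empty ranges, sets an illustrative flag. */
int ring_prepare(struct channel *ch, struct desc *d)
{
	(void)ch;
	if (d->end <= d->start)
		return -1;
	d->flags |= 0x4;
	return 0;
}

/* Outer hook that still receives a VMA, as an unconverted mmap() would. */
int outer_mmap(struct channel *ch, struct vma *v)
{
	struct desc d;
	int err;

	assert(ch && ch->mmap_prepare_ring_buffer);

	set_desc_from_vma(&d, v);
	err = ch->mmap_prepare_ring_buffer(ch, &d);	/* Step 2. */
	if (err)
		return err;

	return apply_desc_to_vma(&d, v);
}
```

In this sketch the VMA is only written to after the nested hook succeeds,
so a failing hook leaves it untouched. That ordering is a property of the
sketch and should not be read as a statement about the kernel helpers.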

Unfortunately, as part of this change, we need to be able to flexibly
assign to the VMA descriptor, so have to remove some of the const
declarations within the structure.

Also update the VMA tests to reflect the changes.

Signed-off-by: Lorenzo Stoakes (Oracle)
---
 include/linux/fs.h              |   3 +
 include/linux/mm_types.h        |   4 +-
 mm/util.c                       | 112 ++++++++++++++++++++++---------
 mm/vma.h                        |   2 +-
 tools/testing/vma/include/dup.h | 114 +++++++++++++++++++++-----------
 5 files changed, 162 insertions(+), 73 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index c390f5c667e3..0bdccfa70b44 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2058,6 +2058,9 @@ static inline bool can_mmap_file(struct file *file)
 	return true;
 }
 
+void compat_set_desc_from_vma(struct vm_area_desc *desc, const struct file *file,
+		const struct vm_area_struct *vma);
+int __compat_vma_mmap(struct vm_area_desc *desc, struct vm_area_struct *vma);
 int compat_vma_mmap(struct file *file, struct vm_area_struct *vma);
 int __vma_check_mmap_hook(struct vm_area_struct *vma);
 
diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 50685cf29792..7538d64f8848 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -891,8 +891,8 @@ static __always_inline bool vma_flags_empty(vma_flags_t *flags)
  */
 struct vm_area_desc {
 	/* Immutable state. */
-	const struct mm_struct *const mm;
-	struct file *const file; /* May vary from vm_file in stacked callers. */
+	struct mm_struct *mm;
+	struct file *file; /* May vary from vm_file in stacked callers. */
 	unsigned long start;
 	unsigned long end;
 
diff --git a/mm/util.c b/mm/util.c
index 879ba62b5f0c..8cf59267a9ac 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1163,34 +1163,40 @@ void flush_dcache_folio(struct folio *folio)
 EXPORT_SYMBOL(flush_dcache_folio);
 #endif
 
-static int __compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
+/**
+ * compat_set_desc_from_vma() - assigns VMA descriptor @desc fields from a VMA.
+ * @desc: A VMA descriptor whose fields need to be set.
+ * @file: The file object describing the file being mmap()'d.
+ * @vma: The VMA whose fields we wish to assign to @desc.
+ *
+ * This is a compatibility function to allow an mmap() hook to call
+ * mmap_prepare() hooks when drivers nest these. This function specifically
+ * allows the construction of a vm_area_desc value, @desc, from a VMA @vma for
+ * the purposes of doing this.
+ *
+ * Once the conversion of drivers is complete this function will no longer be
+ * required and will be removed.
+ */
+void compat_set_desc_from_vma(struct vm_area_desc *desc,
+		const struct file *file,
+		const struct vm_area_struct *vma)
 {
-	struct vm_area_desc desc = {
-		.mm = vma->vm_mm,
-		.file = file,
-		.start = vma->vm_start,
-		.end = vma->vm_end,
+	memset(desc, 0, sizeof(*desc));
 
-		.pgoff = vma->vm_pgoff,
-		.vm_file = vma->vm_file,
-		.vma_flags = vma->flags,
-		.page_prot = vma->vm_page_prot,
-
-		.action.type = MMAP_NOTHING, /* Default */
-	};
-	int err;
+	desc->mm = vma->vm_mm;
+	desc->file = (struct file *)file;
+	desc->start = vma->vm_start;
+	desc->end = vma->vm_end;
 
-	err = vfs_mmap_prepare(file, &desc);
-	if (err)
-		return err;
+	desc->pgoff = vma->vm_pgoff;
+	desc->vm_file = vma->vm_file;
+	desc->vma_flags = vma->flags;
+	desc->page_prot = vma->vm_page_prot;
 
-	err = mmap_action_prepare(&desc);
-	if (err)
-		return err;
-
-	set_vma_from_desc(vma, &desc);
-	return mmap_action_complete(vma, &desc.action, /*rmap_lock_held=*/false);
+	/* Default. */
+	desc->action.type = MMAP_NOTHING;
 }
+EXPORT_SYMBOL(compat_set_desc_from_vma);
 
 static int __compat_vma_mapped(struct file *file, struct vm_area_struct *vma)
 {
@@ -1211,6 +1217,50 @@ static int __compat_vma_mapped(struct file *file, struct vm_area_struct *vma)
 	return err;
 }
 
+/**
+ * __compat_vma_mmap() - Similar to compat_vma_mmap(), only it allows
+ * flexibility as to how the mmap_prepare callback is invoked, which is useful
+ * for drivers which invoke nested mmap_prepare callbacks in an mmap() hook.
+ * @desc: A VMA descriptor upon which an mmap_prepare() hook has already been
+ * executed.
+ * @vma: The VMA to which @desc should be applied.
+ *
+ * The function assumes that you have obtained a VMA descriptor @desc from
+ * compat_set_desc_from_vma(), and already executed the mmap_prepare() hook upon
+ * it.
+ *
+ * It then performs any specified mmap actions, and invokes the vm_ops->mapped()
+ * hook if one is present.
+ *
+ * See the description of compat_vma_mmap() for more details.
+ *
+ * Once the conversion of drivers is complete this function will no longer be
+ * required and will be removed.
+ *
+ * Returns: 0 on success or error.
+ */
+int __compat_vma_mmap(struct vm_area_desc *desc,
+		struct vm_area_struct *vma)
+{
+	int err;
+
+	/* Perform any preparatory tasks for mmap action. */
+	err = mmap_action_prepare(desc);
+	if (err)
+		return err;
+	/* Update the VMA from the descriptor. */
+	compat_set_vma_from_desc(vma, desc);
+	/* Complete any specified mmap actions. */
+	err = mmap_action_complete(vma, &desc->action,
+				   /*rmap_lock_held=*/false);
+	if (err)
+		return err;
+
+	/* Invoke vm_ops->mapped callback. */
+	return __compat_vma_mapped(desc->file, vma);
+}
+EXPORT_SYMBOL(__compat_vma_mmap);
+
 /**
  * compat_vma_mmap() - Apply the file's .mmap_prepare() hook to an
  * existing VMA and execute any requested actions.
@@ -1218,10 +1268,10 @@ static int __compat_vma_mapped(struct file *file, struct vm_area_struct *vma)
  * @vma: The VMA to apply the .mmap_prepare() hook to.
  *
  * Ordinarily, .mmap_prepare() is invoked directly upon mmap(). However, certain
- * stacked filesystems invoke a nested mmap hook of an underlying file.
+ * stacked drivers invoke a nested mmap hook of an underlying file.
  *
- * Until all filesystems are converted to use .mmap_prepare(), we must be
- * conservative and continue to invoke these stacked filesystems using the
+ * Until all drivers are converted to use .mmap_prepare(), we must be
+ * conservative and continue to invoke these stacked drivers using the
  * deprecated .mmap() hook.
  *
  * However we have a problem if the underlying file system possesses an
@@ -1232,20 +1282,22 @@ static int __compat_vma_mapped(struct file *file, struct vm_area_struct *vma)
  * establishes a struct vm_area_desc descriptor, passes to the underlying
  * .mmap_prepare() hook and applies any changes performed by it.
  *
- * Once the conversion of filesystems is complete this function will no longer
- * be required and will be removed.
+ * Once the conversion of drivers is complete this function will no longer be
+ * required and will be removed.
  *
  * Returns: 0 on success or error.
  */
 int compat_vma_mmap(struct file *file, struct vm_area_struct *vma)
 {
+	struct vm_area_desc desc;
 	int err;
 
-	err = __compat_vma_mmap(file, vma);
+	compat_set_desc_from_vma(&desc, file, vma);
+	err = vfs_mmap_prepare(file, &desc);
 	if (err)
 		return err;
 
-	return __compat_vma_mapped(file, vma);
+	return __compat_vma_mmap(&desc, vma);
 }
 EXPORT_SYMBOL(compat_vma_mmap);
 
diff --git a/mm/vma.h b/mm/vma.h
index adc18f7dd9f1..a76046c39b14 100644
--- a/mm/vma.h
+++ b/mm/vma.h
@@ -300,7 +300,7 @@ static inline int vma_iter_store_gfp(struct vma_iterator *vmi,
  * f_op->mmap() but which might have an underlying file system which implements
  * f_op->mmap_prepare().
  */
-static inline void set_vma_from_desc(struct vm_area_struct *vma,
+static inline void compat_set_vma_from_desc(struct vm_area_struct *vma,
 		struct vm_area_desc *desc)
 {
 	/*
diff --git a/tools/testing/vma/include/dup.h b/tools/testing/vma/include/dup.h
index 1b86c34e1158..1f123704078e 100644
--- a/tools/testing/vma/include/dup.h
+++ b/tools/testing/vma/include/dup.h
@@ -519,8 +519,8 @@ enum vma_operation {
  */
 struct vm_area_desc {
 	/* Immutable state. */
-	const struct mm_struct *const mm;
-	struct file *const file; /* May vary from vm_file in stacked callers. */
+	struct mm_struct *mm;
+	struct file *file; /* May vary from vm_file in stacked callers. */
 	unsigned long start;
 	unsigned long end;
 
@@ -1272,43 +1272,95 @@ static inline void vma_set_anonymous(struct vm_area_struct *vma)
 }
 
 /* Declared in vma.h. */
-static inline void set_vma_from_desc(struct vm_area_struct *vma,
+static inline void compat_set_vma_from_desc(struct vm_area_struct *vma,
 		struct vm_area_desc *desc);
 
-static inline int __compat_vma_mmap(const struct file_operations *f_op,
-		struct file *file, struct vm_area_struct *vma)
+static inline void compat_set_desc_from_vma(struct vm_area_desc *desc,
+		const struct file *file,
+		const struct vm_area_struct *vma)
 {
-	struct vm_area_desc desc = {
-		.mm = vma->vm_mm,
-		.file = file,
-		.start = vma->vm_start,
-		.end = vma->vm_end,
+	memset(desc, 0, sizeof(*desc));
 
-		.pgoff = vma->vm_pgoff,
-		.vm_file = vma->vm_file,
-		.vma_flags = vma->flags,
-		.page_prot = vma->vm_page_prot,
+	desc->mm = vma->vm_mm;
+	desc->file = (struct file *)file;
+	desc->start = vma->vm_start;
+	desc->end = vma->vm_end;
 
-		.action.type = MMAP_NOTHING, /* Default */
-	};
+	desc->pgoff = vma->vm_pgoff;
+	desc->vm_file = vma->vm_file;
+	desc->vma_flags = vma->flags;
+	desc->page_prot = vma->vm_page_prot;
+
+	/* Default. */
+	desc->action.type = MMAP_NOTHING;
+}
+
+static inline unsigned long vma_pages(const struct vm_area_struct *vma)
+{
+	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
+}
+
+static inline void unmap_vma_locked(struct vm_area_struct *vma)
+{
+	const size_t len = vma_pages(vma) << PAGE_SHIFT;
+
+	mmap_assert_write_locked(vma->vm_mm);
+	do_munmap(vma->vm_mm, vma->vm_start, len, NULL);
+}
+
+static inline int __compat_vma_mapped(struct file *file, struct vm_area_struct *vma)
+{
+	const struct vm_operations_struct *vm_ops = vma->vm_ops;
 	int err;
 
-	err = f_op->mmap_prepare(&desc);
+	if (!vm_ops->mapped)
+		return 0;
+
+	err = vm_ops->mapped(vma->vm_start, vma->vm_end, vma->vm_pgoff, file,
+			     &vma->vm_private_data);
 	if (err)
-		return err;
+		unmap_vma_locked(vma);
+	return err;
+}
 
-	err = mmap_action_prepare(&desc);
+static inline int __compat_vma_mmap(struct vm_area_desc *desc,
+		struct vm_area_struct *vma)
+{
+	int err;
+
+	/* Perform any preparatory tasks for mmap action. */
+	err = mmap_action_prepare(desc);
+	if (err)
+		return err;
+	/* Update the VMA from the descriptor. */
+	compat_set_vma_from_desc(vma, desc);
+	/* Complete any specified mmap actions. */
+	err = mmap_action_complete(vma, &desc->action,
+				   /*rmap_lock_held=*/false);
 	if (err)
 		return err;
 
-	set_vma_from_desc(vma, &desc);
-	return mmap_action_complete(vma, &desc.action, /*rmap_lock_held=*/false);
+	/* Invoke vm_ops->mapped callback. */
+	return __compat_vma_mapped(desc->file, vma);
+}
+
+static inline int vfs_mmap_prepare(struct file *file, struct vm_area_desc *desc)
+{
+	return file->f_op->mmap_prepare(desc);
 }
 
 static inline int compat_vma_mmap(struct file *file,
 		struct vm_area_struct *vma)
 {
-	return __compat_vma_mmap(file->f_op, file, vma);
+	struct vm_area_desc desc;
+	int err;
+
+	compat_set_desc_from_vma(&desc, file, vma);
+	err = vfs_mmap_prepare(file, &desc);
+	if (err)
+		return err;
+
+	return __compat_vma_mmap(&desc, vma);
 }
 
 
@@ -1318,11 +1370,6 @@ static inline void vma_iter_init(struct vma_iterator *vmi,
 	mas_init(&vmi->mas, &mm->mm_mt, addr);
 }
 
-static inline unsigned long vma_pages(struct vm_area_struct *vma)
-{
-	return (vma->vm_end - vma->vm_start) >> PAGE_SHIFT;
-}
-
 static inline void mmap_assert_locked(struct mm_struct *);
 static inline struct vm_area_struct *find_vma_intersection(struct mm_struct *mm,
 						unsigned long start_addr,
@@ -1492,11 +1539,6 @@ static inline int vfs_mmap(struct file *file, struct vm_area_struct *vma)
 	return file->f_op->mmap(file, vma);
 }
 
-static inline int vfs_mmap_prepare(struct file *file, struct vm_area_desc *desc)
-{
-	return file->f_op->mmap_prepare(desc);
-}
-
 static inline void vma_set_file(struct vm_area_struct *vma, struct file *file)
 {
 	/* Changing an anonymous vma with this is illegal */
@@ -1521,11 +1563,3 @@ static inline pgprot_t vma_get_page_prot(vma_flags_t vma_flags)
 
 	return vm_get_page_prot(vm_flags);
 }
-
-static inline void unmap_vma_locked(struct vm_area_struct *vma)
-{
-	const size_t len = vma_pages(vma) << PAGE_SHIFT;
-
-	mmap_assert_write_locked(vma->vm_mm);
-	do_munmap(vma->vm_mm, vma->vm_start, len, NULL);
-}
-- 
2.53.0