[PATCH 2/2] mm: memfd_luo: preserve file seals

Pratyush Yadav posted 2 patches 2 weeks ago
Posted by Pratyush Yadav 2 weeks ago
From: "Pratyush Yadav (Google)" <pratyush@kernel.org>

File seals are used on memfd for making shared memory communication with
untrusted peers safer and simpler. Seals provide a guarantee that
certain operations won't be allowed on the file such as writes or
truncations. Maintaining these guarantees across a live update will help
keep such use cases secure.

These guarantees will also be needed for IOMMUFD preservation with LUO.
Normally when IOMMUFD maps a memfd, it pins all its pages to make sure
any truncation operations on the memfd don't lead to IOMMUFD using freed
memory. This doesn't work with LUO since the preserved memfd might have
completely different pages after a live update, and mapping them back to
the IOMMUFD will cause all sorts of problems. Using and preserving the
seals allows IOMMUFD preservation logic to trust the memfd.

Preserve the seals by introducing a new 8-bit-wide bitfield. There are
currently only 6 possible seals but 2 extra bits are used to provide
room for future expansion. Since the seals are UAPI, it is safe to use
them directly in the ABI.

Back the 8-bit field with a u64, leaving 56 unused bits. This is done to
keep the struct nice and aligned. The unused bits can be used to add new
flags later, potentially without even needing to bump the version
number.

Since the serialization structure is changed, bump the version number to
"memfd-v2".

Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
---
 include/linux/kho/abi/memfd.h |  9 ++++++++-
 mm/memfd_luo.c                | 23 +++++++++++++++++++++--
 2 files changed, 29 insertions(+), 3 deletions(-)

diff --git a/include/linux/kho/abi/memfd.h b/include/linux/kho/abi/memfd.h
index 68cb6303b846..bd549c81f1d2 100644
--- a/include/linux/kho/abi/memfd.h
+++ b/include/linux/kho/abi/memfd.h
@@ -60,6 +60,11 @@ struct memfd_luo_folio_ser {
  * struct memfd_luo_ser - Main serialization structure for a memfd.
  * @pos:       The file's current position (f_pos).
  * @size:      The total size of the file in bytes (i_size).
+ * @seals:     The seals present on the memfd. The seals are UAPI so it is safe
+ *             to directly use them in the ABI. Note: currently there are 6
+ *             seals possible but this field is 8 bits to leave room for future
+ *             expansion.
+ * @__reserved: Reserved bits. May be used later to add more flags.
  * @nr_folios: Number of folios in the folios array.
  * @folios:    KHO vmalloc descriptor pointing to the array of
  *             struct memfd_luo_folio_ser.
@@ -67,11 +72,13 @@ struct memfd_luo_folio_ser {
 struct memfd_luo_ser {
 	u64 pos;
 	u64 size;
+	u64 seals:8;
+	u64 __reserved:56;
 	u64 nr_folios;
 	struct kho_vmalloc folios;
 } __packed;
 
 /* The compatibility string for memfd file handler */
-#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v1"
+#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v2"
 
 #endif /* _LINUX_KHO_ABI_MEMFD_H */
diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
index a34fccc23b6a..eb68e0b5457f 100644
--- a/mm/memfd_luo.c
+++ b/mm/memfd_luo.c
@@ -79,6 +79,8 @@
 #include <linux/shmem_fs.h>
 #include <linux/vmalloc.h>
 #include <linux/memfd.h>
+#include <uapi/linux/memfd.h>
+
 #include "internal.h"
 
 static int memfd_luo_preserve_folios(struct file *file,
@@ -222,7 +224,7 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
 	struct memfd_luo_folio_ser *folios_ser;
 	struct memfd_luo_ser *ser;
 	u64 nr_folios;
-	int err = 0;
+	int err = 0, seals;
 
 	inode_lock(inode);
 	shmem_freeze(inode, true);
@@ -234,8 +236,15 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
 		goto err_unlock;
 	}
 
+	seals = memfd_get_seals(args->file);
+	if (seals < 0) {
+		err = seals;
+		goto err_free_ser;
+	}
+
 	ser->pos = args->file->f_pos;
 	ser->size = i_size_read(inode);
+	ser->seals = seals;
 
 	err = memfd_luo_preserve_folios(args->file, &ser->folios,
 					&folios_ser, &nr_folios);
@@ -444,13 +453,23 @@ static int memfd_luo_retrieve(struct liveupdate_file_op_args *args)
 	if (!ser)
 		return -EINVAL;
 
-	file = memfd_alloc_file("", 0);
+	/*
+	 * The seals are preserved. Allow sealing here so they can be added
+	 * later.
+	 */
+	file = memfd_alloc_file("", MFD_ALLOW_SEALING);
 	if (IS_ERR(file)) {
 		pr_err("failed to setup file: %pe\n", file);
 		err = PTR_ERR(file);
 		goto free_ser;
 	}
 
+	err = memfd_add_seals(file, ser->seals);
+	if (err) {
+		pr_err("failed to add seals: %pe\n", ERR_PTR(err));
+		goto put_file;
+	}
+
 	vfs_setpos(file, ser->pos, MAX_LFS_FILESIZE);
 	file->f_inode->i_size = ser->size;
 
-- 
2.52.0.457.g6b5491de43-goog
Re: [PATCH 2/2] mm: memfd_luo: preserve file seals
Posted by Mike Rapoport 1 week, 5 days ago
On Fri, Jan 23, 2026 at 10:58:51AM +0100, Pratyush Yadav wrote:
> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
> 
> File seals are used on memfd for making shared memory communication with
> untrusted peers safer and simpler. Seals provide a guarantee that
> certain operations won't be allowed on the file such as writes or
> truncations. Maintaining these guarantees across a live update will help
> keeping such use cases secure.
> 
> These guarantees will also be needed for IOMMUFD preservation with LUO.
> Normally when IOMMUFD maps a memfd, it pins all its pages to make sure
> any truncation operations on the memfd don't lead to IOMMUFD using freed
> memory. This doesn't work with LUO since the preserved memfd might have
> completely different pages after a live update, and mapping them back to
> the IOMMUFD will cause all sorts of problems. Using and preserving the
> seals allows IOMMUFD preservation logic to trust the memfd.
> 
> Preserve the seals by introducing a new 8-bit-wide bitfield. There are
> currently only 6 possible seals but 2 extra bits are used to provide
> room for future expansion. Since the seals are UAPI, it is safe to use
> them directly in the ABI.
> 
> Back the 8-bit field with a u64, leaving 56 unused bits. This is done to
> keep the struct nice and aligned. The unused bits can be used to add new
> flags later, potentially without even needing to bump the version
> number.
> 
> Since the serialization structure is changed, bump the version number to
> "memfd-v2".
> 
> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
> ---
>  include/linux/kho/abi/memfd.h |  9 ++++++++-
>  mm/memfd_luo.c                | 23 +++++++++++++++++++++--
>  2 files changed, 29 insertions(+), 3 deletions(-)
> 
> diff --git a/include/linux/kho/abi/memfd.h b/include/linux/kho/abi/memfd.h
> index 68cb6303b846..bd549c81f1d2 100644
> --- a/include/linux/kho/abi/memfd.h
> +++ b/include/linux/kho/abi/memfd.h
> @@ -60,6 +60,11 @@ struct memfd_luo_folio_ser {
>   * struct memfd_luo_ser - Main serialization structure for a memfd.
>   * @pos:       The file's current position (f_pos).
>   * @size:      The total size of the file in bytes (i_size).
> + * @seals:     The seals present on the memfd. The seals are UAPI so it is safe
> + *             to directly use them in the ABI. Note: currently there are 6
> + *             seals possible but this field is 8 bits to leave room for future
> + *             expansion.
> + * @__reserved: Reserved bits. May be used later to add more flags.
>   * @nr_folios: Number of folios in the folios array.
>   * @folios:    KHO vmalloc descriptor pointing to the array of
>   *             struct memfd_luo_folio_ser.
> @@ -67,11 +72,13 @@ struct memfd_luo_folio_ser {
>  struct memfd_luo_ser {
>  	u64 pos;
>  	u64 size;
> +	u64 seals:8;

Kernel uABI defines seals as unsigned int, I think we can spare u32 for
them and reserve a u32 flags for other memfd flags (MFD_CLOEXEC,
MFD_HUGETLB etc).
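
For illustration only, the layout suggested above could be mirrored in user
space roughly as follows. This is a sketch and not part of the posted patch:
the struct name memfd_luo_ser_sketch is hypothetical, the trailing
struct kho_vmalloc member is left out because its layout is not shown in
this thread, and the fixed-width types stand in for the kernel's u32/u64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical user-space mirror of the leading fields of
 * struct memfd_luo_ser, with the 8-bit bitfield replaced by a
 * full u32 for seals and a u32 reserved for memfd creation flags.
 */
struct memfd_luo_ser_sketch {
	uint64_t pos;       /* file position (f_pos) */
	uint64_t size;      /* file size in bytes (i_size) */
	uint32_t seals;     /* F_SEAL_* bits, same width as the uAPI value */
	uint32_t flags;     /* reserved for MFD_* creation flags */
	uint64_t nr_folios; /* number of entries in the folios array */
} __attribute__((packed));

/* The two u32 fields share one 8-byte slot, so later fields stay aligned. */
static_assert(offsetof(struct memfd_luo_ser_sketch, seals) == 16,
	      "seals directly follows size");
static_assert(offsetof(struct memfd_luo_ser_sketch, flags) == 20,
	      "flags directly follows seals");
static_assert(offsetof(struct memfd_luo_ser_sketch, nr_folios) == 24,
	      "nr_folios stays 8-byte aligned");
```

Two plain u32 members also avoid relying on compiler-specific bitfield
ordering in a structure that is handed from one kernel to the next.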

> +	u64 __reserved:56;
>  	u64 nr_folios;
>  	struct kho_vmalloc folios;
>  } __packed;
>  
>  /* The compatibility string for memfd file handler */
> -#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v1"
> +#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v2"
>  
>  #endif /* _LINUX_KHO_ABI_MEMFD_H */
> diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
> index a34fccc23b6a..eb68e0b5457f 100644
> --- a/mm/memfd_luo.c
> +++ b/mm/memfd_luo.c
> @@ -79,6 +79,8 @@
>  #include <linux/shmem_fs.h>
>  #include <linux/vmalloc.h>
>  #include <linux/memfd.h>
> +#include <uapi/linux/memfd.h>
> +
>  #include "internal.h"
>  
>  static int memfd_luo_preserve_folios(struct file *file,
> @@ -222,7 +224,7 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
>  	struct memfd_luo_folio_ser *folios_ser;
>  	struct memfd_luo_ser *ser;
>  	u64 nr_folios;
> -	int err = 0;
> +	int err = 0, seals;
>  
>  	inode_lock(inode);
>  	shmem_freeze(inode, true);
> @@ -234,8 +236,15 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
>  		goto err_unlock;
>  	}
>  
> +	seals = memfd_get_seals(args->file);
> +	if (seals < 0) {
> +		err = seals;
> +		goto err_free_ser;
> +	}
> +
>  	ser->pos = args->file->f_pos;
>  	ser->size = i_size_read(inode);
> +	ser->seals = seals;
>  
>  	err = memfd_luo_preserve_folios(args->file, &ser->folios,
>  					&folios_ser, &nr_folios);
> @@ -444,13 +453,23 @@ static int memfd_luo_retrieve(struct liveupdate_file_op_args *args)
>  	if (!ser)
>  		return -EINVAL;
>  
> -	file = memfd_alloc_file("", 0);
> +	/*
> +	 * The seals are preserved. Allow sealing here so they can be added
> +	 * later.
> +	 */
> +	file = memfd_alloc_file("", MFD_ALLOW_SEALING);

I think we should select flags passed to memfd_alloc_file() based on
ser->seals (and later based on ser->seals and ser->flags).

>  	if (IS_ERR(file)) {
>  		pr_err("failed to setup file: %pe\n", file);
>  		err = PTR_ERR(file);
>  		goto free_ser;
>  	}
>  
> +	err = memfd_add_seals(file, ser->seals);

I'm not sure using MFD_ALLOW_SEALING is enough if there was F_SEAL_EXEC in
seals.

> +	if (err) {
> +		pr_err("failed to add seals: %pe\n", ERR_PTR(err));
> +		goto put_file;
> +	}
> +
>  	vfs_setpos(file, ser->pos, MAX_LFS_FILESIZE);
>  	file->f_inode->i_size = ser->size;
>  
> -- 
> 2.52.0.457.g6b5491de43-goog
> 

-- 
Sincerely yours,
Mike.
Re: [PATCH 2/2] mm: memfd_luo: preserve file seals
Posted by Jason Gunthorpe 1 week, 4 days ago
On Sun, Jan 25, 2026 at 02:03:29PM +0200, Mike Rapoport wrote:
> > @@ -67,11 +72,13 @@ struct memfd_luo_folio_ser {
> >  struct memfd_luo_ser {
> >  	u64 pos;
> >  	u64 size;
> > +	u64 seals:8;
> 
> Kernel uABI defines seals as unsigned int, I think we can spare u32 for
> them and reserve a u32 flags for other memfd flags (MFD_CLOEXEC,
> MFD_HUGETLB etc).

It is a bit worse than that, the "v2" version is only going to support
some set of seals (probably the set defined in v6.19) and if there are
new seals down the road then this needs a version bump.

So I'd check that only supported seals are set here:

> > +     seals = memfd_get_seals(args->file);
> > +     if (seals < 0) {
> > +             err = seals;
> > +             goto err_free_ser;
> > +     }
> > +
> >       ser->pos = args->file->f_pos;
> >       ser->size = i_size_read(inode);
> > +     ser->seals = seals;
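
A check along those lines might look like the user-space sketch below. It is
an assumption-laden illustration, not code from the patch: the SKETCH_*
constants restate the six F_SEAL_* values from the uAPI header, while
MEMFD_LUO_SUPPORTED_SEALS, memfd_luo_check_seals() and the choice of
-EOPNOTSUPP as the error code are hypothetical.

```c
#include <assert.h>
#include <errno.h>

/* The six seal bits currently defined in the uAPI (values restated here). */
#define SKETCH_F_SEAL_SEAL         0x0001
#define SKETCH_F_SEAL_SHRINK       0x0002
#define SKETCH_F_SEAL_GROW         0x0004
#define SKETCH_F_SEAL_WRITE        0x0008
#define SKETCH_F_SEAL_FUTURE_WRITE 0x0010
#define SKETCH_F_SEAL_EXEC         0x0020

/* Hypothetical: the set of seals this ABI version knows how to restore. */
#define MEMFD_LUO_SUPPORTED_SEALS \
	(SKETCH_F_SEAL_SEAL | SKETCH_F_SEAL_SHRINK | SKETCH_F_SEAL_GROW | \
	 SKETCH_F_SEAL_WRITE | SKETCH_F_SEAL_FUTURE_WRITE | SKETCH_F_SEAL_EXEC)

static_assert(MEMFD_LUO_SUPPORTED_SEALS == 0x3f, "six seal bits");

/*
 * Hypothetical helper: refuse to preserve a memfd that carries a seal
 * this ABI version does not know about, so that restoring the seals
 * on the retrieve side cannot hit an unknown bit.
 */
static int memfd_luo_check_seals(unsigned int seals)
{
	if (seals & ~MEMFD_LUO_SUPPORTED_SEALS)
		return -EOPNOTSUPP;
	return 0;
}
```

Rejecting unknown bits at preserve time moves the failure to before the
live update, where it can still be reported and handled.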

..

> > @@ -444,13 +453,23 @@ static int memfd_luo_retrieve(struct liveupdate_file_op_args *args)
> >  	if (!ser)
> >  		return -EINVAL;
> >  
> > -	file = memfd_alloc_file("", 0);
> > +	/*
> > +	 * The seals are preserved. Allow sealing here so they can be added
> > +	 * later.
> > +	 */
> > +	file = memfd_alloc_file("", MFD_ALLOW_SEALING);
> >  	if (IS_ERR(file)) {
> >  		pr_err("failed to setup file: %pe\n", file);
> >  		err = PTR_ERR(file);
> >  		goto free_ser;
> >  	}
> >  
> > +	err = memfd_add_seals(file, ser->seals);

Because we really don't want this to fail :\

Jason
Re: [PATCH 2/2] mm: memfd_luo: preserve file seals
Posted by Pratyush Yadav 1 week, 4 days ago
Hi Mike,

On Sun, Jan 25 2026, Mike Rapoport wrote:

> On Fri, Jan 23, 2026 at 10:58:51AM +0100, Pratyush Yadav wrote:
>> From: "Pratyush Yadav (Google)" <pratyush@kernel.org>
>> 
>> File seals are used on memfd for making shared memory communication with
>> untrusted peers safer and simpler. Seals provide a guarantee that
>> certain operations won't be allowed on the file such as writes or
>> truncations. Maintaining these guarantees across a live update will help
>> keeping such use cases secure.
>> 
>> These guarantees will also be needed for IOMMUFD preservation with LUO.
>> Normally when IOMMUFD maps a memfd, it pins all its pages to make sure
>> any truncation operations on the memfd don't lead to IOMMUFD using freed
>> memory. This doesn't work with LUO since the preserved memfd might have
>> completely different pages after a live update, and mapping them back to
>> the IOMMUFD will cause all sorts of problems. Using and preserving the
>> seals allows IOMMUFD preservation logic to trust the memfd.
>> 
>> Preserve the seals by introducing a new 8-bit-wide bitfield. There are
>> currently only 6 possible seals but 2 extra bits are used to provide
>> room for future expansion. Since the seals are UAPI, it is safe to use
>> them directly in the ABI.
>> 
>> Back the 8-bit field with a u64, leaving 56 unused bits. This is done to
>> keep the struct nice and aligned. The unused bits can be used to add new
>> flags later, potentially without even needing to bump the version
>> number.
>> 
>> Since the serialization structure is changed, bump the version number to
>> "memfd-v2".
>> 
>> Signed-off-by: Pratyush Yadav (Google) <pratyush@kernel.org>
>> ---
>>  include/linux/kho/abi/memfd.h |  9 ++++++++-
>>  mm/memfd_luo.c                | 23 +++++++++++++++++++++--
>>  2 files changed, 29 insertions(+), 3 deletions(-)
>> 
>> diff --git a/include/linux/kho/abi/memfd.h b/include/linux/kho/abi/memfd.h
>> index 68cb6303b846..bd549c81f1d2 100644
>> --- a/include/linux/kho/abi/memfd.h
>> +++ b/include/linux/kho/abi/memfd.h
>> @@ -60,6 +60,11 @@ struct memfd_luo_folio_ser {
>>   * struct memfd_luo_ser - Main serialization structure for a memfd.
>>   * @pos:       The file's current position (f_pos).
>>   * @size:      The total size of the file in bytes (i_size).
>> + * @seals:     The seals present on the memfd. The seals are UAPI so it is safe
>> + *             to directly use them in the ABI. Note: currently there are 6
>> + *             seals possible but this field is 8 bits to leave room for future
>> + *             expansion.
>> + * @__reserved: Reserved bits. May be used later to add more flags.
>>   * @nr_folios: Number of folios in the folios array.
>>   * @folios:    KHO vmalloc descriptor pointing to the array of
>>   *             struct memfd_luo_folio_ser.
>> @@ -67,11 +72,13 @@ struct memfd_luo_folio_ser {
>>  struct memfd_luo_ser {
>>  	u64 pos;
>>  	u64 size;
>> +	u64 seals:8;
>
> Kernel uABI defines seals as unsigned int, I think we can spare u32 for
> them and reserve a u32 flags for other memfd flags (MFD_CLOEXEC,
> MFD_HUGETLB etc).

Sure, will do.

>
>> +	u64 __reserved:56;
>>  	u64 nr_folios;
>>  	struct kho_vmalloc folios;
>>  } __packed;
>>  
>>  /* The compatibility string for memfd file handler */
>> -#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v1"
>> +#define MEMFD_LUO_FH_COMPATIBLE	"memfd-v2"
>>  
>>  #endif /* _LINUX_KHO_ABI_MEMFD_H */
>> diff --git a/mm/memfd_luo.c b/mm/memfd_luo.c
>> index a34fccc23b6a..eb68e0b5457f 100644
>> --- a/mm/memfd_luo.c
>> +++ b/mm/memfd_luo.c
>> @@ -79,6 +79,8 @@
>>  #include <linux/shmem_fs.h>
>>  #include <linux/vmalloc.h>
>>  #include <linux/memfd.h>
>> +#include <uapi/linux/memfd.h>
>> +
>>  #include "internal.h"
>>  
>>  static int memfd_luo_preserve_folios(struct file *file,
>> @@ -222,7 +224,7 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
>>  	struct memfd_luo_folio_ser *folios_ser;
>>  	struct memfd_luo_ser *ser;
>>  	u64 nr_folios;
>> -	int err = 0;
>> +	int err = 0, seals;
>>  
>>  	inode_lock(inode);
>>  	shmem_freeze(inode, true);
>> @@ -234,8 +236,15 @@ static int memfd_luo_preserve(struct liveupdate_file_op_args *args)
>>  		goto err_unlock;
>>  	}
>>  
>> +	seals = memfd_get_seals(args->file);
>> +	if (seals < 0) {
>> +		err = seals;
>> +		goto err_free_ser;
>> +	}
>> +
>>  	ser->pos = args->file->f_pos;
>>  	ser->size = i_size_read(inode);
>> +	ser->seals = seals;
>>  
>>  	err = memfd_luo_preserve_folios(args->file, &ser->folios,
>>  					&folios_ser, &nr_folios);
>> @@ -444,13 +453,23 @@ static int memfd_luo_retrieve(struct liveupdate_file_op_args *args)
>>  	if (!ser)
>>  		return -EINVAL;
>>  
>> -	file = memfd_alloc_file("", 0);
>> +	/*
>> +	 * The seals are preserved. Allow sealing here so they can be added
>> +	 * later.
>> +	 */
>> +	file = memfd_alloc_file("", MFD_ALLOW_SEALING);
>
> I think we should select flags passed to memfd_alloc_file() based on
> ser->seals (and later based on ser->seals and ser->flags).

Not sure what you mean.

I think the only seal we can set via memfd_alloc_file() flags is
MFD_NOEXEC_SEAL, which is really F_SEAL_EXEC plus a change of the
inode's mode. And now that I think of it, that is a valid use case that
we might as well support. But I think that should be done by preserving
the mode of the inode directly, and then copying the seals back. The
main reason for that is that the mode can be changed after the memfd is
created too.

Other than that, all other seals are set by fcntl (via
memfd_add_seals()), so I don't see what else we can pass to
memfd_alloc_file().

>
>>  	if (IS_ERR(file)) {
>>  		pr_err("failed to setup file: %pe\n", file);
>>  		err = PTR_ERR(file);
>>  		goto free_ser;
>>  	}
>>  
>> +	err = memfd_add_seals(file, ser->seals);
>
> I'm not sure using MFD_ALLOW_SEALING is enough if there was F_SEAL_EXEC in
> seals.

Why not? memfd_add_seals() can handle F_SEAL_EXEC as far as I can tell.

>
>> +	if (err) {
>> +		pr_err("failed to add seals: %pe\n", ERR_PTR(err));
>> +		goto put_file;
>> +	}
>> +
>>  	vfs_setpos(file, ser->pos, MAX_LFS_FILESIZE);
>>  	file->f_inode->i_size = ser->size;
>>  
>> -- 
>> 2.52.0.457.g6b5491de43-goog
>> 

-- 
Regards,
Pratyush Yadav
Re: [PATCH 2/2] mm: memfd_luo: preserve file seals
Posted by Mike Rapoport 1 week, 4 days ago
Hi Pratyush,

On Mon, Jan 26, 2026 at 01:47:21PM +0100, Pratyush Yadav wrote:
> Hi Mike,
> 
> On Sun, Jan 25 2026, Mike Rapoport wrote:
> > On Fri, Jan 23, 2026 at 10:58:51AM +0100, Pratyush Yadav wrote:

...

> >> -	file = memfd_alloc_file("", 0);
> >> +	/*
> >> +	 * The seals are preserved. Allow sealing here so they can be added
> >> +	 * later.
> >> +	 */
> >> +	file = memfd_alloc_file("", MFD_ALLOW_SEALING);
> >
> > I think we should select flags passed to memfd_alloc_file() based on
> > ser->seals (and later based on ser->seals and ser->flags).
> 
> Not sure what you mean.
> 
> I think the only seal we can set via memfd_alloc_file() flags is
> MFD_NOEXEC_SEAL, which is really a F_SEAL_EXEC and plus a change of the
> inode's mode. And now that I think of it, that is a valid use case that
> we might as well support. But I think that should be done by preserving
> the mode of the inode directly, and then copying the seals back. The
> main reason for that is that the mode can be changed after the memfd is
> created too.
>
> Other than that, all other seals are set by fcntl (via
> memfd_add_seals()), so I don't see what else we can pass to
> memfd_alloc_file().

Hmm, "using ser->seals" was bad phrasing :)

Now we add support for creating memfd with MFD_ALLOW_SEALING and at some
point we'd want MFD_HUGETLB and huge page size.
So I think we should have a field in ser that will define what flags should
be used for creation of memfd and based on the value of that field pass the
flags to memfd_alloc_file().

For seals support this field can be hardwired to MFD_ALLOW_SEALING at preserve
time.
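
As a rough user-space sketch of that idea (not code from the patch): the
struct ser_sketch type, the sketch_* helpers and the -EINVAL error code are
hypothetical, and the SKETCH_MFD_* constants restate the uAPI values of
MFD_ALLOW_SEALING and MFD_HUGETLB.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* uAPI memfd_create() flag values, restated for a standalone example. */
#define SKETCH_MFD_ALLOW_SEALING 0x0002U
#define SKETCH_MFD_HUGETLB       0x0004U

/* Hypothetical: creation flags this ABI version can recreate. */
#define SKETCH_SUPPORTED_MFD_FLAGS SKETCH_MFD_ALLOW_SEALING

static_assert(SKETCH_MFD_ALLOW_SEALING == 0x0002U, "matches uAPI value");

/* Hypothetical subset of the serialized state. */
struct ser_sketch {
	uint32_t seals;
	uint32_t flags;
};

/* Preserve side: for seals support the flags are simply hardwired. */
static void sketch_preserve_flags(struct ser_sketch *ser)
{
	ser->flags = SKETCH_MFD_ALLOW_SEALING;
}

/*
 * Retrieve side: validate the serialized flags and hand them back so
 * the caller can pass them on when allocating the new memfd.
 */
static int sketch_retrieve_flags(const struct ser_sketch *ser,
				 unsigned int *flags)
{
	if (ser->flags & ~SKETCH_SUPPORTED_MFD_FLAGS)
		return -EINVAL;
	*flags = ser->flags;
	return 0;
}
```

With this shape, adding MFD_HUGETLB later would mean widening the supported
mask and setting the bit at preserve time, without touching the retrieve
logic.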

> >>  	if (IS_ERR(file)) {
> >>  		pr_err("failed to setup file: %pe\n", file);
> >>  		err = PTR_ERR(file);
> >>  		goto free_ser;
> >>  	}
> >>  
> >> +	err = memfd_add_seals(file, ser->seals);
> >
> > I'm not sure using MFD_ALLOW_SEALING is enough if there was F_SEAL_EXEC in
> > seals.
> 
> Why not? memfd_add_seals() can handle F_SEAL_EXEC as far as I can tell.

I just noticed it behaved differently :)
Looks like memfd_add_seals() can indeed handle F_SEAL_EXEC.

> -- 
> Regards,
> Pratyush Yadav

-- 
Sincerely yours,
Mike.