[PATCH v3 1/4] mm: Introduce vm_uffd_ops API

Peter Xu posted 4 patches 5 days, 1 hour ago
Posted by Peter Xu 5 days, 1 hour ago
Currently, most of the userfaultfd features are implemented directly in the
core mm, which invokes VMA-specific functions whenever necessary.  So far
this has been fine because userfaultfd interacts almost only with shmem and
hugetlbfs.

Introduce a generic userfaultfd API extension for vm_operations_struct,
so that any code that implements vm_operations_struct (including kernel
modules that can be compiled separately from the kernel core) can support
userfaults without modifying the core files.

With this API applied, if a module wants to support userfaultfd, the
module should only need to properly define vm_uffd_ops and hook it to
vm_operations_struct, instead of changing anything in core mm.

This API will not work for anonymous memory. Handling of userfault
operations for anonymous memory remains unchanged in core mm.

Due to a security concern raised while reviewing older versions of this
series [1], uffd_copy() is temporarily removed.  IOW, so far MISSING-capable
memory types can only be hard-coded and implemented in mm/.  This also
affects UFFDIO_COPY and UFFDIO_ZEROPAGE.  Other functions can still be
provided from vm_uffd_ops.

Introduce the API only, so that existing userfaultfd users can be moved
over without breaking them.

[1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@redhat.com/

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 include/linux/mm.h            |  9 +++++++++
 include/linux/userfaultfd_k.h | 37 +++++++++++++++++++++++++++++++++++
 2 files changed, 46 insertions(+)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 6b6c6980f46c2..8afb93387e2c6 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -620,6 +620,8 @@ struct vm_fault {
 					 */
 };
 
+struct vm_uffd_ops;
+
 /*
  * These are the virtual MM functions - opening of an area, closing and
  * unmapping it (needed to keep files on disk up-to-date etc), pointer
@@ -705,6 +707,13 @@ struct vm_operations_struct {
 	struct page *(*find_normal_page)(struct vm_area_struct *vma,
 					 unsigned long addr);
 #endif /* CONFIG_FIND_NORMAL_PAGE */
+#ifdef CONFIG_USERFAULTFD
+	/*
+	 * Userfaultfd related ops.  Modules need to define this to support
+	 * userfaultfd.
+	 */
+	const struct vm_uffd_ops *userfaultfd_ops;
+#endif
 };
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
index c0e716aec26aa..b1949d8611238 100644
--- a/include/linux/userfaultfd_k.h
+++ b/include/linux/userfaultfd_k.h
@@ -92,6 +92,43 @@ enum mfill_atomic_mode {
 	NR_MFILL_ATOMIC_MODES,
 };
 
+/* VMA userfaultfd operations */
+struct vm_uffd_ops {
+	/**
+	 * @uffd_features: features supported in bitmask.
+	 *
+	 * When the ops is defined, the driver must set non-zero features
+	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
+	 *
+	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
+	 */
+	unsigned long uffd_features;
+	/**
+	 * @uffd_ioctls: ioctls supported in bitmask.
+	 *
+	 * Userfaultfd ioctls supported by the module.  Below will always
+	 * be supported by default whenever a module provides vm_uffd_ops:
+	 *
+	 *   _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
+	 *
+	 * The module needs to provide all the rest optionally supported
+	 * ioctls.  For example, when VM_UFFD_MINOR is supported,
+	 * _UFFDIO_CONTINUE must be supported as an ioctl.
+	 */
+	unsigned long uffd_ioctls;
+	/**
+	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
+	 *
+	 * @inode: the inode for folio lookup
+	 * @pgoff: the pgoff of the folio
+	 * @folio: returned folio pointer
+	 *
+	 * Return: zero if succeeded, negative for errors.
+	 */
+	int (*uffd_get_folio)(struct inode *inode, pgoff_t pgoff,
+			      struct folio **folio);
+};
+
 #define MFILL_ATOMIC_MODE_BITS (const_ilog2(NR_MFILL_ATOMIC_MODES - 1) + 1)
 #define MFILL_ATOMIC_BIT(nr) BIT(MFILL_ATOMIC_MODE_BITS + (nr))
 #define MFILL_ATOMIC_FLAG(nr) ((__force uffd_flags_t) MFILL_ATOMIC_BIT(nr))
-- 
2.50.1
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by David Hildenbrand 1 day, 12 hours ago
On 26.09.25 23:16, Peter Xu wrote:
> Currently, most of the userfaultfd features are implemented directly in the
> core mm.  It will invoke VMA specific functions whenever necessary.  So far
> it is fine because it almost only interacts with shmem and hugetlbfs.
> 
> Introduce a generic userfaultfd API extension for vm_operations_struct,
> so that any code that implements vm_operations_struct (including kernel
> modules that can be compiled separately from the kernel core) can support
> userfaults without modifying the core files.
> 
> With this API applied, if a module wants to support userfaultfd, the
> module should only need to properly define vm_uffd_ops and hook it to
> vm_operations_struct, instead of changing anything in core mm.
> 
> This API will not work for anonymous memory. Handling of userfault
> operations for anonymous memory remains unchanged in core mm.
> 
> Due to a security concern while reviewing older versions of this series
> [1], uffd_copy() will be temporarily removed.  IOW, so far MISSING-capable
> memory types can only be hard-coded and implemented in mm/.  It would also
> affect UFFDIO_COPY and UFFDIO_ZEROPAGE.  Other functions should still be
> able to be provided from vm_uffd_ops.
> 
> Introduces the API only so that existing userfaultfd users can be moved
> over without breaking them.
> 
> [1] https://lore.kernel.org/all/20250627154655.2085903-1-peterx@redhat.com/
> 

Looks much better with the uffdio_copy stuff removed for now.

> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   include/linux/mm.h            |  9 +++++++++
>   include/linux/userfaultfd_k.h | 37 +++++++++++++++++++++++++++++++++++
>   2 files changed, 46 insertions(+)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 6b6c6980f46c2..8afb93387e2c6 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -620,6 +620,8 @@ struct vm_fault {
>   					 */
>   };
>   
> +struct vm_uffd_ops;
> +
>   /*
>    * These are the virtual MM functions - opening of an area, closing and
>    * unmapping it (needed to keep files on disk up-to-date etc), pointer
> @@ -705,6 +707,13 @@ struct vm_operations_struct {
>   	struct page *(*find_normal_page)(struct vm_area_struct *vma,
>   					 unsigned long addr);
>   #endif /* CONFIG_FIND_NORMAL_PAGE */
> +#ifdef CONFIG_USERFAULTFD
> +	/*
> +	 * Userfaultfd related ops.  Modules need to define this to support
> +	 * userfaultfd.
> +	 */
> +	const struct vm_uffd_ops *userfaultfd_ops;
> +#endif
>   };
>   
>   #ifdef CONFIG_NUMA_BALANCING
> diff --git a/include/linux/userfaultfd_k.h b/include/linux/userfaultfd_k.h
> index c0e716aec26aa..b1949d8611238 100644
> --- a/include/linux/userfaultfd_k.h
> +++ b/include/linux/userfaultfd_k.h
> @@ -92,6 +92,43 @@ enum mfill_atomic_mode {
>   	NR_MFILL_ATOMIC_MODES,
>   };
>   
> +/* VMA userfaultfd operations */
> +struct vm_uffd_ops {
> +	/**
> +	 * @uffd_features: features supported in bitmask.
> +	 *
> +	 * When the ops is defined, the driver must set non-zero features
> +	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
> +	 *
> +	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
> +	 */
> +	unsigned long uffd_features;

This variable name is a bit confusing, because it's all about vma 
flags, not uffd features. Just reading the variable, I would rather 
connect it to things like UFFD_FEATURE_WP_UNPOPULATED.

As currently used for VM flags, maybe you should call this

	unsigned long uffd_vm_flags;

or something like that.

I briefly wondered whether we could use actual UFFD_FEATURE_* here, but 
they are rather unsuited for this case here (e.g., different feature 
flags for hugetlb support/shmem support etc).

But reading "uffd_ioctls" below, can't we derive the suitable vma flags 
from the supported ioctls?

_UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
_UFFDIO_WRITEPROTECT -> VM_UFFD_WP
_UFFDIO_CONTINUE -> VM_UFFD_MINOR
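
To illustrate, here is a minimal userspace sketch of that derivation.  The
DEMO_* names and bit values are stand-ins made up for this example, not the
kernel's definitions, and treating either COPY or ZEROPAGE as sufficient for
MISSING is an assumption:

```c
/* Stand-in VMA flag bits for this sketch only; not the kernel's values. */
#define DEMO_VM_UFFD_MISSING	(1UL << 0)
#define DEMO_VM_UFFD_WP		(1UL << 1)
#define DEMO_VM_UFFD_MINOR	(1UL << 2)

/* Stand-in "supported ioctl" bits for this sketch only. */
#define DEMO_IOCTL_COPY		(1UL << 0)
#define DEMO_IOCTL_ZEROPAGE	(1UL << 1)
#define DEMO_IOCTL_WRITEPROTECT	(1UL << 2)
#define DEMO_IOCTL_CONTINUE	(1UL << 3)

/* Derive the supported VMA flags from a bitmask of supported ioctls. */
unsigned long demo_vm_flags_from_ioctls(unsigned long ioctls)
{
	unsigned long flags = 0;

	/* Assumption: either COPY or ZEROPAGE alone implies MISSING. */
	if (ioctls & (DEMO_IOCTL_COPY | DEMO_IOCTL_ZEROPAGE))
		flags |= DEMO_VM_UFFD_MISSING;
	if (ioctls & DEMO_IOCTL_WRITEPROTECT)
		flags |= DEMO_VM_UFFD_WP;
	if (ioctls & DEMO_IOCTL_CONTINUE)
		flags |= DEMO_VM_UFFD_MINOR;

	return flags;
}
```

With this shape a memory type would declare only its ioctls, and the VMA
flags would follow from them.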

> +	/**
> +	 * @uffd_ioctls: ioctls supported in bitmask.
> +	 *
> +	 * Userfaultfd ioctls supported by the module.  Below will always
> +	 * be supported by default whenever a module provides vm_uffd_ops:
> +	 *
> +	 *   _UFFDIO_API, _UFFDIO_REGISTER, _UFFDIO_UNREGISTER, _UFFDIO_WAKE
> +	 *
> +	 * The module needs to provide all the rest optionally supported
> +	 * ioctls.  For example, when VM_UFFD_MINOR is supported,
> +	 * _UFFDIO_CONTINUE must be supported as an ioctl.
> +	 */
> +	unsigned long uffd_ioctls;
> +	/**
> +	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.

Just wondering if we could incorporate the "continue" / "minor" aspect 
into the callback name.

uffd_minor_get_folio / uffd_continue_get_folio

Or do you see use of that callback in the context of other uffd features?

-- 
Cheers

David / dhildenb
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by Peter Xu 1 day, 3 hours ago
On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
> > +/* VMA userfaultfd operations */
> > +struct vm_uffd_ops {
> > +	/**
> > +	 * @uffd_features: features supported in bitmask.
> > +	 *
> > +	 * When the ops is defined, the driver must set non-zero features
> > +	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
> > +	 *
> > +	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
> > +	 */
> > +	unsigned long uffd_features;
> 
> This variable name is a bit confusing , because it's all about vma flags,
> not uffd features. Just reading the variable, I would rather connect it to
> things like UFFD_FEATURE_WP_UNPOPULATED.
> 
> As currently used for VM flags, maybe you should call this
> 
> 	unsigned long uffd_vm_flags;
> 
> or sth like that.

Indeed it's slightly confusing.  However uffd_vm_flags is confusing in
another way, where it seems to imply some flags similar to vm_flags that are
prone to change.

How about uffd_vm_flags_supported / uffd_modes_supported?

> 
> I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
> are rather unsuited for this case here (e.g., different feature flags for
> hugetlb support/shmem support etc).
> 
> But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
> the supported ioctls?
> 
> _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
> _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
> _UFFDIO_CONTINUE -> VM_UFFD_MINOR

Yes we can deduce that, but it'll be unclear then when one stares at a
bunch of ioctls and cannot easily digest the modes the memory type
supports.  Here, the modes should be the most straightforward way to
describe the capability of a memory type.

If hugetlbfs supported ZEROPAGE, then we can deduce the ioctls the other
way round, and we can drop the uffd_ioctls.  However we need the ioctls now
for hugetlbfs to make everything generic.

Do you mind if I still keep it as-is?  So far that's still the clearest I can
think of.  It's only set when some support is added to a memory type, so
it's a one-time shot.

Thanks,

-- 
Peter Xu
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by David Hildenbrand 1 day, 3 hours ago
On 30.09.25 20:48, Peter Xu wrote:
> On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
>>> +/* VMA userfaultfd operations */
>>> +struct vm_uffd_ops {
>>> +	/**
>>> +	 * @uffd_features: features supported in bitmask.
>>> +	 *
>>> +	 * When the ops is defined, the driver must set non-zero features
>>> +	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
>>> +	 *
>>> +	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
>>> +	 */
>>> +	unsigned long uffd_features;
>>
>> This variable name is a bit confusing , because it's all about vma flags,
>> not uffd features. Just reading the variable, I would rather connect it to
>> things like UFFD_FEATURE_WP_UNPOPULATED.
>>
>> As currently used for VM flags, maybe you should call this
>>
>> 	unsigned long uffd_vm_flags;
>>
>> or sth like that.
> 
> Indeed it's slightly confusing.  However uffd_vm_flags is confusing in
> another way, where it seems to imply some flags similar to vm_flags that is
> prone to change.
> 
> How about uffd_vm_flags_supported / uffd_modes_supported?

The former would make things clearer when we are at least not talking 
about uffd features.

> 
>>
>> I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
>> are rather unsuited for this case here (e.g., different feature flags for
>> hugetlb support/shmem support etc).
>>
>> But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
>> the supported ioctls?
>>
>> _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
>> _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
>> _UFFDIO_CONTINUE -> VM_UFFD_MINOR
> 
> Yes we can deduce that, but it'll be unclear then when one stares at a
> bunch of ioctls and cannot easily digest the modes the memory type
> supports.  Here, the modes should be the most straightforward way to
> describe the capability of a memory type.

I rather dislike the current split approach between vm-flags and ioctls.

I briefly thought about abstracting it for internal purposes further and 
just have some internal backend ("memory type") flags.

UFFD_BACKEND_FEAT_MISSING -> _UFFDIO_COPY and VM_UFFD_MISSING
UFFD_BACKEND_FEAT_ZEROPAGE -> _UFFDIO_ZEROPAGE
UFFD_BACKEND_FEAT_WP -> _UFFDIO_WRITEPROTECT and VM_UFFD_WP
UFFD_BACKEND_FEAT_MINOR -> _UFFDIO_CONTINUE and VM_UFFD_MINOR
UFFD_BACKEND_FEAT_POISON -> _UFFDIO_POISON
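
A rough userspace sketch of such a mapping layer, assuming a simple lookup
table.  All DEMO_* names and bit values are hypothetical stand-ins for this
example; nothing like this exists in the posted series:

```c
#include <stddef.h>

/* Hypothetical backend feature bits, named after the list above. */
#define DEMO_BACKEND_FEAT_MISSING	(1UL << 0)
#define DEMO_BACKEND_FEAT_ZEROPAGE	(1UL << 1)
#define DEMO_BACKEND_FEAT_WP		(1UL << 2)
#define DEMO_BACKEND_FEAT_MINOR		(1UL << 3)
#define DEMO_BACKEND_FEAT_POISON	(1UL << 4)

/* Stand-in ioctl and VMA flag bits; not the kernel's values. */
#define DEMO_IOCTL_COPY		(1UL << 0)
#define DEMO_IOCTL_ZEROPAGE	(1UL << 1)
#define DEMO_IOCTL_WRITEPROTECT	(1UL << 2)
#define DEMO_IOCTL_CONTINUE	(1UL << 3)
#define DEMO_IOCTL_POISON	(1UL << 4)

#define DEMO_VM_UFFD_MISSING	(1UL << 0)
#define DEMO_VM_UFFD_WP		(1UL << 1)
#define DEMO_VM_UFFD_MINOR	(1UL << 2)

struct demo_feat_map {
	unsigned long feat;	/* one DEMO_BACKEND_FEAT_* bit */
	unsigned long ioctls;	/* ioctl bits that feature implies */
	unsigned long vm_flags;	/* VMA flag bits it implies, or 0 */
};

static const struct demo_feat_map demo_feat_table[] = {
	{ DEMO_BACKEND_FEAT_MISSING,  DEMO_IOCTL_COPY,         DEMO_VM_UFFD_MISSING },
	{ DEMO_BACKEND_FEAT_ZEROPAGE, DEMO_IOCTL_ZEROPAGE,     0 },
	{ DEMO_BACKEND_FEAT_WP,       DEMO_IOCTL_WRITEPROTECT, DEMO_VM_UFFD_WP },
	{ DEMO_BACKEND_FEAT_MINOR,    DEMO_IOCTL_CONTINUE,     DEMO_VM_UFFD_MINOR },
	{ DEMO_BACKEND_FEAT_POISON,   DEMO_IOCTL_POISON,       0 },
};

/* Expand one set of backend feature bits into ioctl bits and VMA flags. */
void demo_expand_feats(unsigned long feats, unsigned long *ioctls,
		       unsigned long *vm_flags)
{
	size_t i;

	*ioctls = 0;
	*vm_flags = 0;
	for (i = 0; i < sizeof(demo_feat_table) / sizeof(demo_feat_table[0]); i++) {
		if (feats & demo_feat_table[i].feat) {
			*ioctls |= demo_feat_table[i].ioctls;
			*vm_flags |= demo_feat_table[i].vm_flags;
		}
	}
}
```

The backend would then declare one bitmask, and the helper consuming it
would be the only place that knows about both ioctls and VM_ flags.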
> 
> If hugetlbfs supported ZEROPAGE, then we can deduce the ioctls the other
> way round, and we can drop the uffd_ioctls.  However we need the ioctls now
> for hugetlbfs to make everything generic.

POISON is not a VM_ flag, so that wouldn't work completely, right?

As a side note, hugetlbfs support for ZEROPAGE should be fairly easy: 
similar to shmem support, simply allocate a zeroed hugetlb folio.

> 
> Do you mind I still keep it as-is?

I would prefer if we find a way to not have this dependency between both 
feature/ioctl thingies. It just looks rather odd.
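
To make the dependency concrete, here is a userspace sketch of the kind of
consistency check that two interlinked bitmasks call for.  The struct and
the DEMO_* bits are stand-ins for this example.  The non-zero/subset rule
and the MINOR -> CONTINUE rule come from the comments in the patch; the
WP -> WRITEPROTECT rule is an assumption:

```c
#include <stdbool.h>

/* Stand-in VMA flag and ioctl bits; not the kernel's values. */
#define DEMO_VM_UFFD_MISSING	(1UL << 0)
#define DEMO_VM_UFFD_WP		(1UL << 1)
#define DEMO_VM_UFFD_MINOR	(1UL << 2)

#define DEMO_IOCTL_WRITEPROTECT	(1UL << 2)
#define DEMO_IOCTL_CONTINUE	(1UL << 3)

/* Userspace mock of the two bitmask fields in the proposed struct. */
struct demo_vm_uffd_ops {
	unsigned long uffd_features;	/* DEMO_VM_UFFD_* bits */
	unsigned long uffd_ioctls;	/* DEMO_IOCTL_* bits */
};

/* Check that the two bitmasks agree with each other. */
bool demo_uffd_ops_valid(const struct demo_vm_uffd_ops *ops)
{
	const unsigned long all = DEMO_VM_UFFD_MISSING | DEMO_VM_UFFD_WP |
				  DEMO_VM_UFFD_MINOR;

	/* Features must be non-zero and a subset of MISSING|WP|MINOR. */
	if (!ops->uffd_features || (ops->uffd_features & ~all))
		return false;
	/* From the patch comment: MINOR requires CONTINUE. */
	if ((ops->uffd_features & DEMO_VM_UFFD_MINOR) &&
	    !(ops->uffd_ioctls & DEMO_IOCTL_CONTINUE))
		return false;
	/* Assumption: WP likewise requires WRITEPROTECT. */
	if ((ops->uffd_features & DEMO_VM_UFFD_WP) &&
	    !(ops->uffd_ioctls & DEMO_IOCTL_WRITEPROTECT))
		return false;

	return true;
}
```

Every rule in that function exists only because the same capability is
described twice.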

But let's hear if there are other opinions.

-- 
Cheers

David / dhildenb
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by Peter Xu 1 day, 1 hour ago
On Tue, Sep 30, 2025 at 09:19:05PM +0200, David Hildenbrand wrote:
> On 30.09.25 20:48, Peter Xu wrote:
> > On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
> > > > +/* VMA userfaultfd operations */
> > > > +struct vm_uffd_ops {
> > > > +	/**
> > > > +	 * @uffd_features: features supported in bitmask.
> > > > +	 *
> > > > +	 * When the ops is defined, the driver must set non-zero features
> > > > +	 * to be a subset (or all) of: VM_UFFD_MISSING|WP|MINOR.
> > > > +	 *
> > > > +	 * NOTE: VM_UFFD_MISSING is still only supported under mm/ so far.
> > > > +	 */
> > > > +	unsigned long uffd_features;
> > > 
> > > This variable name is a bit confusing , because it's all about vma flags,
> > > not uffd features. Just reading the variable, I would rather connect it to
> > > things like UFFD_FEATURE_WP_UNPOPULATED.
> > > 
> > > As currently used for VM flags, maybe you should call this
> > > 
> > > 	unsigned long uffd_vm_flags;
> > > 
> > > or sth like that.
> > 
> > Indeed it's slightly confusing.  However uffd_vm_flags is confusing in
> > another way, where it seems to imply some flags similar to vm_flags that is
> > prone to change.
> > 
> > How about uffd_vm_flags_supported / uffd_modes_supported?
> 
> The former would make things clearer when we are at least not talking about
> uffd features.

I'll go with it.

> 
> > 
> > > 
> > > I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
> > > are rather unsuited for this case here (e.g., different feature flags for
> > > hugetlb support/shmem support etc).
> > > 
> > > But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
> > > the supported ioctls?
> > > 
> > > _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
> > > _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
> > > _UFFDIO_CONTINUE -> VM_UFFD_MINOR
> > 
> > Yes we can deduce that, but it'll be unclear then when one stares at a
> > bunch of ioctls and cannot easily digest the modes the memory type
> > supports.  Here, the modes should be the most straightforward way to
> > describe the capability of a memory type.
> 
> I rather dislike the current split approach between vm-flags and ioctls.
> 
> I briefly thought about abstracting it for internal purposes further and
> just have some internal backend ("memory type") flags.
> 
> UFFD_BACKEND_FEAT_MISSING -> _UFFDIO_COPY and VM_UFFD_MISSING
> UFFD_BACKEND_FEAT_ZEROPAGE -> _UFFDIO_ZEROPAGE
> UFFD_BACKEND_FEAT_WP -> _UFFDIO_WRITEPROTECT and VM_UFFD_WP
> UFFD_BACKEND_FEAT_MINOR -> _UFFDIO_CONTINUE and VM_UFFD_MINOR
> UFFD_BACKEND_FEAT_POISON -> _UFFDIO_POISON

This layer of mapping can be helpful to some, but maybe confusing to
others who are familiar with existing userfaultfd definitions.

> > 
> > If hugetlbfs supported ZEROPAGE, then we can deduce the ioctls the other
> > way round, and we can drop the uffd_ioctls.  However we need the ioctls now
> > for hugetlbfs to make everything generic.
> 
> POISON is not a VM_ flag, so that wouldn't work completely, right?

Logically speaking, POISON should be meaningful if MISSING|MINOR is
supported.  However, in reality, POISON should always be supported across
all types..

> 
> As a side note, hugetlbfs support for ZEROPAGE should be fairly easy:
> similar to shmem support, simply allocate a zeroed hugetlb folio.

IMHO it'll be good if we do not introduce ZEROPAGE only because we want to
remove some flags.  We could be introducing dead code that nobody uses.

I think it'll be good if we put that as a separate discussion, and define
the vm_uffd_ops based on the current situation.

> 
> > 
> > Do you mind I still keep it as-is?
> 
> I would prefer if we find a way to not have this dependency between both
> feature/ioctl thingies. It just looks rather odd.
> 
> But let's hear if there are other opinions.

Sure.

-- 
Peter Xu
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by David Hildenbrand 8 hours ago
>>>> I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
>>>> are rather unsuited for this case here (e.g., different feature flags for
>>>> hugetlb support/shmem support etc).
>>>>
>>>> But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
>>>> the supported ioctls?
>>>>
>>>> _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
>>>> _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
>>>> _UFFDIO_CONTINUE -> VM_UFFD_MINOR
>>>
>>> Yes we can deduce that, but it'll be unclear then when one stares at a
>>> bunch of ioctls and cannot easily digest the modes the memory type
>>> supports.  Here, the modes should be the most straightforward way to
>>> describe the capability of a memory type.
>>
>> I rather dislike the current split approach between vm-flags and ioctls.
>>
>> I briefly thought about abstracting it for internal purposes further and
>> just have some internal backend ("memory type") flags.
>>
>> UFFD_BACKEND_FEAT_MISSING -> _UFFDIO_COPY and VM_UFFD_MISSING
>> UFFD_BACKEND_FEAT_ZEROPAGE -> _UFFDIO_ZEROPAGE
>> UFFD_BACKEND_FEAT_WP -> _UFFDIO_WRITEPROTECT and VM_UFFD_WP
>> UFFD_BACKEND_FEAT_MINOR -> _UFFDIO_CONTINUE and VM_UFFD_MINOR
>> UFFD_BACKEND_FEAT_POISON -> _UFFDIO_POISON
> 
> This layer of mapping can be helpful to some, but maybe confusing to
> others.. who is familiar with existing userfaultfd definitions.
> 

Just wondering, is this confusing to you, and if so, which part?

To me it makes perfect sense: it cleans up this API so that we do not have 
two sets of flags that are somehow interlinked.

>>>
>>> If hugetlbfs supported ZEROPAGE, then we can deduce the ioctls the other
>>> way round, and we can drop the uffd_ioctls.  However we need the ioctls now
>>> for hugetlbfs to make everything generic.
>>
>> POISON is not a VM_ flag, so that wouldn't work completely, right?
> 
> Logically speaking, POISON should be meaningful if MISSING|MINOR is
> supported.  However, in reality, POISON should always be supported across
> all types..

Do you know what the plans are with guest_memfd?

> 
>>
>> As a side note, hugetlbfs support for ZEROPAGE should be fairly easy:
>> similar to shmem support, simply allocate a zeroed hugetlb folio.
> 
> IMHO it'll be good if we do not introduce ZEROPAGE only because we want to
> remove some flags.. We could be introducing dead codes that nobody uses.
> 
> I think it'll be good if we put that as a separate discussion, and define
> the vm_uffd_ops based on the current situation.

Right. I'd vote for an abstraction along the lines of what I proposed 
above. It doesn't have to be the terminology I used above, but some simple 
single set of flags that we can map to the underlying details.

But again, hoping to hear other opinions on this topic.

-- 
Cheers

David / dhildenb
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by Peter Xu 7 hours ago
On Wed, Oct 01, 2025 at 03:58:14PM +0200, David Hildenbrand wrote:
> > > > > I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
> > > > > are rather unsuited for this case here (e.g., different feature flags for
> > > > > hugetlb support/shmem support etc).
> > > > > 
> > > > > But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
> > > > > the supported ioctls?
> > > > > 
> > > > > _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
> > > > > _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
> > > > > _UFFDIO_CONTINUE -> VM_UFFD_MINOR
> > > > 
> > > > Yes we can deduce that, but it'll be unclear then when one stares at a
> > > > bunch of ioctls and cannot easily digest the modes the memory type
> > > > supports.  Here, the modes should be the most straightforward way to
> > > > describe the capability of a memory type.
> > > 
> > > I rather dislike the current split approach between vm-flags and ioctls.
> > > 
> > > I briefly thought about abstracting it for internal purposes further and
> > > just have some internal backend ("memory type") flags.
> > > 
> > > UFFD_BACKEND_FEAT_MISSING -> _UFFDIO_COPY and VM_UFFD_MISSING
> > > > UFFD_BACKEND_FEAT_ZEROPAGE -> _UFFDIO_ZEROPAGE
> > > UFFD_BACKEND_FEAT_WP -> _UFFDIO_WRITEPROTECT and VM_UFFD_WP
> > > UFFD_BACKEND_FEAT_MINOR -> _UFFDIO_CONTINUE and VM_UFFD_MINOR
> > > UFFD_BACKEND_FEAT_POISON -> _UFFDIO_POISON
> > 
> > This layer of mapping can be helpful to some, but maybe confusing to
> > others.. who is familiar with existing userfaultfd definitions.
> > 
> 
> Just wondering, is this confusing to you, and if so, which part?
> 
> To me it makes perfect sense and cleans up this API and not have to sets of
> flags that are somehow interlinked.

It adds the extra layer of mapping that will only be used in vm_uffd_ops
and the helper that will consume it.

But I confess this might be subjective.

> 
> > > > 
> > > > If hugetlbfs supported ZEROPAGE, then we can deduce the ioctls the other
> > > > way round, and we can drop the uffd_ioctls.  However we need the ioctls now
> > > > for hugetlbfs to make everything generic.
> > > 
> > > POISON is not a VM_ flag, so that wouldn't work completely, right?
> > 
> > Logically speaking, POISON should be meaningful if MISSING|MINOR is
> > supported.  However, in reality, POISON should always be supported across
> > all types..
> 
> Do you know what the plans are with guest_memfd?

I am not aware of anyone discussing this yet, but IMHO we need to support
it at least for the !CoCo use cases.

I do not know how CoCo manages poisoned pages, e.g. whether they are kept
encrypted even when corrupted.

Thanks,

-- 
Peter Xu
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by David Hildenbrand 7 hours ago
On 01.10.25 16:35, Peter Xu wrote:
> On Wed, Oct 01, 2025 at 03:58:14PM +0200, David Hildenbrand wrote:
>>>>>> I briefly wondered whether we could use actual UFFD_FEATURE_* here, but they
>>>>>> are rather unsuited for this case here (e.g., different feature flags for
>>>>>> hugetlb support/shmem support etc).
>>>>>>
>>>>>> But reading "uffd_ioctls" below, can't we derive the suitable vma flags from
>>>>>> the supported ioctls?
>>>>>>
>>>>>> _UFFDIO_COPY | _UFFDIO_ZEROPAGE -> VM_UFFD_MISSING
>>>>>> _UFFDIO_WRITEPROTECT -> VM_UFFD_WP
>>>>>> _UFFDIO_CONTINUE -> VM_UFFD_MINOR
>>>>>
>>>>> Yes we can deduce that, but it'll be unclear then when one stares at a
>>>>> bunch of ioctls and cannot easily digest the modes the memory type
>>>>> supports.  Here, the modes should be the most straightforward way to
>>>>> describe the capability of a memory type.
>>>>
>>>> I rather dislike the current split approach between vm-flags and ioctls.
>>>>
>>>> I briefly thought about abstracting it for internal purposes further and
>>>> just have some internal backend ("memory type") flags.
>>>>
>>>> UFFD_BACKEND_FEAT_MISSING -> _UFFDIO_COPY and VM_UFFD_MISSING
>>>> UFFD_BACKEND_FEAT_ZEROPAGE -> _UFFDIO_ZEROPAGE
>>>> UFFD_BACKEND_FEAT_WP -> _UFFDIO_WRITEPROTECT and VM_UFFD_WP
>>>> UFFD_BACKEND_FEAT_MINOR -> _UFFDIO_CONTINUE and VM_UFFD_MINOR
>>>> UFFD_BACKEND_FEAT_POISON -> _UFFDIO_POISON
>>>
>>> This layer of mapping can be helpful to some, but maybe confusing to
>>> others.. who is familiar with existing userfaultfd definitions.
>>>
>>
>> Just wondering, is this confusing to you, and if so, which part?
>>
>> To me it makes perfect sense and cleans up this API and not have to sets of
>> flags that are somehow interlinked.
> 
> It adds the extra layer of mapping that will only be used in vm_uffd_ops
> and the helper that will consume it.

Agreed, while making the API cleaner. I don't easily see what's 
confusing about that, though.

I think it can be done with a handful of LOC and avoid having to use VM_ 
flags in this API.

-- 
Cheers

David / dhildenb
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by Mike Rapoport 1 day, 12 hours ago
On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
> On 26.09.25 23:16, Peter Xu wrote:
> > +	/**
> > +	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
> 
> Just wondering if we could incorporate the "continue" / "minor" aspect into
> the callback name.
> 
> uffd_minor_get_folio / uffd_continue_get_folio
> 
> Or do you see use of that callback in the context of other uffd features?

If someone picks up the gauntlet of refactoring the loop in mcopy_atomic()
we'd need a similar callback for uffd copy. And as I see it it would be
different enough to warrant emphasizing minor/continue in the name here.

I also think we can drop uffd_ prefix for the callback, as it's called as
uffd_ops->get_folio() or whatever it'll be called.
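
For illustration only, a userspace mock of how the call site might read
with the prefix dropped and the minor/continue aspect in the name.  Every
demo_* type and function here is made up for the example, and
minor_get_folio is just one possible spelling, not what the series uses:

```c
#include <errno.h>
#include <stddef.h>

/* Userspace stand-ins for struct folio and struct inode. */
struct demo_folio {
	unsigned long index;
};

struct demo_inode {
	struct demo_folio *cache;	/* pretend page cache */
	size_t nr;			/* number of entries in cache */
};

/* Mock of the ops table, with the uffd_ prefix dropped from the member. */
struct demo_uffd_ops {
	int (*minor_get_folio)(struct demo_inode *inode, unsigned long pgoff,
			       struct demo_folio **folio);
};

/* Mock of vm_operations_struct carrying the ops pointer. */
struct demo_vm_ops {
	const struct demo_uffd_ops *userfaultfd_ops;
};

/* Example backend: look the folio up in the pretend page cache. */
int demo_cache_lookup(struct demo_inode *inode, unsigned long pgoff,
		      struct demo_folio **folio)
{
	if (pgoff >= inode->nr)
		return -ENOENT;
	*folio = &inode->cache[pgoff];
	return 0;
}

const struct demo_uffd_ops demo_ops = {
	.minor_get_folio = demo_cache_lookup,
};

/* Caller side: the ops pointer already says "uffd" at the call site. */
int demo_resolve_continue(const struct demo_vm_ops *vm_ops,
			  struct demo_inode *inode, unsigned long pgoff,
			  struct demo_folio **folio)
{
	const struct demo_uffd_ops *uffd_ops =
		vm_ops ? vm_ops->userfaultfd_ops : NULL;

	if (!uffd_ops || !uffd_ops->minor_get_folio)
		return -EOPNOTSUPP;
	return uffd_ops->minor_get_folio(inode, pgoff, folio);
}
```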
 
> -- 
> Cheers
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by David Hildenbrand 1 day, 12 hours ago
On 30.09.25 12:07, Mike Rapoport wrote:
> On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
>> On 26.09.25 23:16, Peter Xu wrote:
>>> +	/**
>>> +	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
>>
>> Just wondering if we could incorporate the "continue" / "minor" aspect into
>> the callback name.
>>
>> uffd_minor_get_folio / uffd_continue_get_folio
>>
>> Or do you see use of that callback in the context of other uffd features?
> 
> If someone picks the gauntlet of refactoring the loop in mcopy_atomic()
> we'd need a similar callback for uffd copy. And as I see it it would be
> different enough to warrant emphasizing minor/continue in the name here.
> 
> I also think we can drop uffd_ prefix for the callback, as it's called as
> uffd_ops->get_folio() or whatever it's be called.

Agreed. I got annoyed yesterday when typing vma->vm_mm often enough 
(vma->mm! ).

-- 
Cheers

David / dhildenb
Re: [PATCH v3 1/4] mm: Introduce vm_uffd_ops API
Posted by Peter Xu 1 day, 3 hours ago
On Tue, Sep 30, 2025 at 12:18:37PM +0200, David Hildenbrand wrote:
> On 30.09.25 12:07, Mike Rapoport wrote:
> > On Tue, Sep 30, 2025 at 11:36:53AM +0200, David Hildenbrand wrote:
> > > On 26.09.25 23:16, Peter Xu wrote:
> > > > +	/**
> > > > +	 * uffd_get_folio: Handler to resolve UFFDIO_CONTINUE request.
> > > 
> > > Just wondering if we could incorporate the "continue" / "minor" aspect into
> > > the callback name.
> > > 
> > > uffd_minor_get_folio / uffd_continue_get_folio
> > > 
> > > Or do you see use of that callback in the context of other uffd features?
> > 
> > If someone picks the gauntlet of refactoring the loop in mcopy_atomic()
> > we'd need a similar callback for uffd copy. And as I see it it would be
> > different enough to warrant emphasizing minor/continue in the name here.

Sure, I can go with uffd_minor_get_folio when I repost.

> > 
> > I also think we can drop uffd_ prefix for the callback, as it's called as
> > uffd_ops->get_folio() or whatever it's be called.
> 
> Agreed. I got annoyed yesterday when typing vma->vm_mm often enough
> (vma->mm! ).

That's also why I kept uffd_ because that's the tradition mm/ uses in many
important data structures like vma and mm.  It helps most tagging systems
that most Linux developers use to avoid global name collisions.

So I tend to keep the prefix for now, until we want to switch away from
Hungarian-like notations completely. But let me know if anyone has strong
feelings.

Thanks,

-- 
Peter Xu