[PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort

Lorenzo Stoakes posted 16 patches 1 day, 14 hours ago
[PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort
Posted by Lorenzo Stoakes 1 day, 13 hours ago
resctl uses remap_pfn_range(), but holds a mutex over the
operation. Therefore, establish the mutex in mmap_prepare(), release it in
mmap_complete() and release it in mmap_abort() should the operation fail.

Otherwise, we simply make use of the remap_pfn_range_[prepare/complete]()
remap PFN range variants in an ordinary way.

Signed-off-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
---
 fs/resctrl/pseudo_lock.c | 56 +++++++++++++++++++++++++++++++---------
 1 file changed, 44 insertions(+), 12 deletions(-)

diff --git a/fs/resctrl/pseudo_lock.c b/fs/resctrl/pseudo_lock.c
index 87bbc2605de1..6d18ffde6a94 100644
--- a/fs/resctrl/pseudo_lock.c
+++ b/fs/resctrl/pseudo_lock.c
@@ -995,7 +995,8 @@ static const struct vm_operations_struct pseudo_mmap_ops = {
 	.mremap = pseudo_lock_dev_mremap,
 };
 
-static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
+static int pseudo_lock_dev_mmap_complete(struct file *filp, struct vm_area_struct *vma,
+					 const void *context)
 {
 	unsigned long vsize = vma->vm_end - vma->vm_start;
 	unsigned long off = vma->vm_pgoff << PAGE_SHIFT;
@@ -1004,6 +1005,40 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 	unsigned long physical;
 	unsigned long psize;
 
+	rdtgrp = filp->private_data;
+	plr = rdtgrp->plr;
+
+	physical = __pa(plr->kmem) >> PAGE_SHIFT;
+	psize = plr->size - off;
+
+	memset(plr->kmem + off, 0, vsize);
+
+	if (remap_pfn_range_complete(vma, vma->vm_start, physical + vma->vm_pgoff,
+			    vsize, vma->vm_page_prot)) {
+		mutex_unlock(&rdtgroup_mutex);
+		return -EAGAIN;
+	}
+
+	mutex_unlock(&rdtgroup_mutex);
+	return 0;
+}
+
+static void pseudo_lock_dev_mmap_abort(const struct file *filp,
+				       const void *vm_private_data,
+				       const void *context)
+{
+	mutex_unlock(&rdtgroup_mutex);
+}
+
+static int pseudo_lock_dev_mmap_prepare(struct vm_area_desc *desc)
+{
+	unsigned long vsize = vma_desc_size(desc);
+	unsigned long off = desc->pgoff << PAGE_SHIFT;
+	struct file *filp = desc->file;
+	struct pseudo_lock_region *plr;
+	struct rdtgroup *rdtgrp;
+	unsigned long psize;
+
 	mutex_lock(&rdtgroup_mutex);
 
 	rdtgrp = filp->private_data;
@@ -1031,7 +1066,6 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 		return -EINVAL;
 	}
 
-	physical = __pa(plr->kmem) >> PAGE_SHIFT;
 	psize = plr->size - off;
 
 	if (off > plr->size) {
@@ -1043,7 +1077,7 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 	 * Ensure changes are carried directly to the memory being mapped,
 	 * do not allow copy-on-write mapping.
 	 */
-	if (!(vma->vm_flags & VM_SHARED)) {
+	if (!(desc->vm_flags & VM_SHARED)) {
 		mutex_unlock(&rdtgroup_mutex);
 		return -EINVAL;
 	}
@@ -1053,15 +1087,11 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
 		return -ENOSPC;
 	}
 
-	memset(plr->kmem + off, 0, vsize);
+	/* No CoW allowed so don't need to specify pfn. */
+	remap_pfn_range_prepare(desc, 0);
+	desc->vm_ops = &pseudo_mmap_ops;
 
-	if (remap_pfn_range(vma, vma->vm_start, physical + vma->vm_pgoff,
-			    vsize, vma->vm_page_prot)) {
-		mutex_unlock(&rdtgroup_mutex);
-		return -EAGAIN;
-	}
-	vma->vm_ops = &pseudo_mmap_ops;
-	mutex_unlock(&rdtgroup_mutex);
+	/* mutex will be release in mmap_complete or mmap_abort. */
 	return 0;
 }
 
@@ -1071,7 +1101,9 @@ static const struct file_operations pseudo_lock_dev_fops = {
 	.write =	NULL,
 	.open =		pseudo_lock_dev_open,
 	.release =	pseudo_lock_dev_release,
-	.mmap =		pseudo_lock_dev_mmap,
+	.mmap_prepare =	pseudo_lock_dev_mmap_prepare,
+	.mmap_complete = pseudo_lock_dev_mmap_complete,
+	.mmap_abort =	pseudo_lock_dev_mmap_abort,
 };
 
 int rdt_pseudo_lock_init(void)
-- 
2.51.0
Re: [PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort
Posted by Jason Gunthorpe 1 day, 11 hours ago
On Mon, Sep 08, 2025 at 12:10:43PM +0100, Lorenzo Stoakes wrote:
> resctl uses remap_pfn_range(), but holds a mutex over the
> operation. Therefore, establish the mutex in mmap_prepare(), release it in
> mmap_complete() and release it in mmap_abort() should the operation fail.

The mutex can't do anything relative to remap_pfn, no reason to hold it.

> @@ -1053,15 +1087,11 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
>  		return -ENOSPC;
>  	}
>  
> -	memset(plr->kmem + off, 0, vsize);
> +	/* No CoW allowed so don't need to specify pfn. */
> +	remap_pfn_range_prepare(desc, 0);

This would be a good place to make a more generic helper..

 ret = remap_pfn_no_cow(desc, phys);

And it can consistently check for !shared internally.

Store phys in the desc and use common code to trigger the PTE population
during complete.

Jason
Re: [PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort
Posted by Lorenzo Stoakes 1 day, 10 hours ago
On Mon, Sep 08, 2025 at 10:24:47AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 12:10:43PM +0100, Lorenzo Stoakes wrote:
> > resctl uses remap_pfn_range(), but holds a mutex over the
> > operation. Therefore, establish the mutex in mmap_prepare(), release it in
> > mmap_complete() and release it in mmap_abort() should the operation fail.
>
> The mutex can't do anything relative to remap_pfn, no reason to hold it.

Sorry I missed this bit before...

Yeah I guess my concern was that the original code very intentionally holds the
mutex _over the remap operation_.

But I guess given we release the lock on failure this isn't necessary, and of
course obviously the lock has no bearing ont he actual remap.

Will drop it and drop mmap_abort for now as it's not yet needed.

Cheers, Lorenzo
Re: [PATCH 12/16] mm: update resctl to use mmap_prepare, mmap_complete, mmap_abort
Posted by Lorenzo Stoakes 1 day, 11 hours ago
On Mon, Sep 08, 2025 at 10:24:47AM -0300, Jason Gunthorpe wrote:
> On Mon, Sep 08, 2025 at 12:10:43PM +0100, Lorenzo Stoakes wrote:
> > resctl uses remap_pfn_range(), but holds a mutex over the
> > operation. Therefore, establish the mutex in mmap_prepare(), release it in
> > mmap_complete() and release it in mmap_abort() should the operation fail.
>
> The mutex can't do anything relative to remap_pfn, no reason to hold it.
>
> > @@ -1053,15 +1087,11 @@ static int pseudo_lock_dev_mmap(struct file *filp, struct vm_area_struct *vma)
> >  		return -ENOSPC;
> >  	}
> >
> > -	memset(plr->kmem + off, 0, vsize);
> > +	/* No CoW allowed so don't need to specify pfn. */
> > +	remap_pfn_range_prepare(desc, 0);
>
> This would be a good place to make a more generic helper..
>
>  ret = remap_pfn_no_cow(desc, phys);

Ha, funny I suggested a _no_cow() thing earlier :) seems we are agreed on that
then!

Presumably you mean remap_pfn_no_cow_prepare()?

>
> And it can consistently check for !shared internally.
>
> Store phys in the desc and use common code to trigger the PTE population
> during complete.

We can use mmap_context for this, I guess it's not a terrible idea to set .pfn
but I just don't want to add any confusion as to what doing that means in
the non-generic mmap_complete case.

>
> Jason

Cheers, Lorenzo