[PATCH v2 0/6] x86/mem-paging: misc cleanup
Posted by Jan Beulich 3 years, 11 months ago
Repeatedly looking at varying parts of this code has led to me
accumulating a few adjustments.

1: fold p2m_mem_paging_prep()'s main if()-s
2: correct p2m_mem_paging_prep()'s error handling
3: use guest handle for XENMEM_paging_op_prep
4: add minimal lock order enforcement to p2m_mem_paging_prep()
5: move code to its dedicated source file
6: consistently use gfn_t

Jan

[PATCH v2 1/6] x86/mem-paging: fold p2m_mem_paging_prep()'s main if()-s
Posted by Jan Beulich 3 years, 11 months ago
The condition of the second can be true only if the condition of the
first was met; the second half of the condition of the second then also
is redundant with an earlier check. Combine them, drop a pointless
local variable, and re-flow the affected gdprintk().

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1808,6 +1808,8 @@ int p2m_mem_paging_prep(struct domain *d
     /* Allocate a page if the gfn does not have one yet */
     if ( !mfn_valid(mfn) )
     {
+        void *guest_map;
+
         /* If the user did not provide a buffer, we disallow */
         ret = -EINVAL;
         if ( unlikely(user_ptr == NULL) )
@@ -1819,22 +1821,16 @@ int p2m_mem_paging_prep(struct domain *d
             goto out;
         mfn = page_to_mfn(page);
         page_extant = 0;
-    }
-
-    /* If we were given a buffer, now is the time to use it */
-    if ( !page_extant && user_ptr )
-    {
-        void *guest_map;
-        int rc;
 
         ASSERT( mfn_valid(mfn) );
         guest_map = map_domain_page(mfn);
-        rc = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
+        ret = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
         unmap_domain_page(guest_map);
-        if ( rc )
+        if ( ret )
         {
-            gdprintk(XENLOG_ERR, "Failed to load paging-in gfn %lx domain %u "
-                                 "bytes left %d\n", gfn_l, d->domain_id, rc);
+            gdprintk(XENLOG_ERR,
+                     "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
+                     gfn_l, d->domain_id, ret);
             ret = -EFAULT;
             put_page(page); /* Don't leak pages */
             goto out;            
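The fold above can be illustrated in isolation. The sketch below is a hypothetical miniature of the control flow only: `flow_before()` mirrors the two separate if()-s with the `page_extant` flag, `flow_after()` the combined form; the names, the `-22` (-EINVAL stand-in), and the return values are illustrative, not Xen's.

```c
#include <stdbool.h>

/* Before: two if()-s plus a page_extant flag.  "mfn_ok" stands for
 * mfn_valid(mfn), "user" for user_ptr being non-NULL. */
static int flow_before(bool mfn_ok, bool user)
{
    int page_extant = 1;

    if ( !mfn_ok )
    {
        if ( !user )
            return -22;       /* -EINVAL: no buffer provided */
        page_extant = 0;
    }

    /* Second if(): only reachable with page_extant == 0 when the first
     * branch was taken, and "user" was already checked there. */
    if ( !page_extant && user )
        return 1;             /* buffer gets consumed */

    return 0;
}

/* After: one if(), flag and redundant half-condition dropped. */
static int flow_after(bool mfn_ok, bool user)
{
    if ( !mfn_ok )
    {
        if ( !user )
            return -22;
        return 1;             /* buffer gets consumed */
    }
    return 0;
}
```

Exhaustively checking the four input combinations shows the two flows agree, which is the whole point of the fold.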


Re: [PATCH v2 1/6] x86/mem-paging: fold p2m_mem_paging_prep()'s main if()-s
Posted by Andrew Cooper 3 years, 10 months ago
On 23/04/2020 09:37, Jan Beulich wrote:
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1819,22 +1821,16 @@ int p2m_mem_paging_prep(struct domain *d
>              goto out;
>          mfn = page_to_mfn(page);
>          page_extant = 0;
> -    }
> -
> -    /* If we were given a buffer, now is the time to use it */
> -    if ( !page_extant && user_ptr )
> -    {
> -        void *guest_map;
> -        int rc;
>  
>          ASSERT( mfn_valid(mfn) );
>          guest_map = map_domain_page(mfn);
> -        rc = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
> +        ret = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
>          unmap_domain_page(guest_map);
> -        if ( rc )
> +        if ( ret )
>          {
> -            gdprintk(XENLOG_ERR, "Failed to load paging-in gfn %lx domain %u "
> -                                 "bytes left %d\n", gfn_l, d->domain_id, rc);
> +            gdprintk(XENLOG_ERR,
> +                     "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
> +                     gfn_l, d->domain_id, ret);

%pd, and "%pd gfn %lx" would be a more natural way to phrase it.

That said - I'm not sure how useful the information is.  We don't
normally print any diagnostics on -EFAULT and I don't see why this case
is special.

With at least %pd fixed, but preferably with the printk() dropped,
Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Re: [PATCH v2 1/6] x86/mem-paging: fold p2m_mem_paging_prep()'s main if()-s
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:37:06AM +0200, Jan Beulich wrote:
> The condition of the second can be true only if the condition of the
> first was met; the second half of the condition of the second then also
> is redundant with an earlier check. Combine them, drop a pointless
> local variable, and re-flow the affected gdprintk().
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> 
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1808,6 +1808,8 @@ int p2m_mem_paging_prep(struct domain *d
>      /* Allocate a page if the gfn does not have one yet */
>      if ( !mfn_valid(mfn) )
>      {
> +        void *guest_map;
> +
>          /* If the user did not provide a buffer, we disallow */
>          ret = -EINVAL;
>          if ( unlikely(user_ptr == NULL) )
> @@ -1819,22 +1821,16 @@ int p2m_mem_paging_prep(struct domain *d
>              goto out;
>          mfn = page_to_mfn(page);
>          page_extant = 0;
> -    }
> -
> -    /* If we were given a buffer, now is the time to use it */
> -    if ( !page_extant && user_ptr )
> -    {
> -        void *guest_map;
> -        int rc;
>  
>          ASSERT( mfn_valid(mfn) );

I would be tempted to remove this assert also, since you just
successfully allocated the page at this point.

Thanks, Roger.

Re: [PATCH v2 1/6] x86/mem-paging: fold p2m_mem_paging_prep()'s main if()-s
Posted by Jan Beulich 3 years, 10 months ago
On 14.05.2020 17:45, Roger Pau Monné wrote:
> On Thu, Apr 23, 2020 at 10:37:06AM +0200, Jan Beulich wrote:
>> The condition of the second can be true only if the condition of the
>> first was met; the second half of the condition of the second then also
>> is redundant with an earlier check. Combine them, drop a pointless
>> local variable, and re-flow the affected gdprintk().
>>
>> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

>> --- a/xen/arch/x86/mm/p2m.c
>> +++ b/xen/arch/x86/mm/p2m.c
>> @@ -1808,6 +1808,8 @@ int p2m_mem_paging_prep(struct domain *d
>>      /* Allocate a page if the gfn does not have one yet */
>>      if ( !mfn_valid(mfn) )
>>      {
>> +        void *guest_map;
>> +
>>          /* If the user did not provide a buffer, we disallow */
>>          ret = -EINVAL;
>>          if ( unlikely(user_ptr == NULL) )
>> @@ -1819,22 +1821,16 @@ int p2m_mem_paging_prep(struct domain *d
>>              goto out;
>>          mfn = page_to_mfn(page);
>>          page_extant = 0;
>> -    }
>> -
>> -    /* If we were given a buffer, now is the time to use it */
>> -    if ( !page_extant && user_ptr )
>> -    {
>> -        void *guest_map;
>> -        int rc;
>>  
>>          ASSERT( mfn_valid(mfn) );
> 
> I would be tempted to remove this assert also, since you just
> successfully allocated the page at this point.

Oh, indeed, good point.

Jan

[PATCH v2 2/6] x86/mem-paging: correct p2m_mem_paging_prep()'s error handling
Posted by Jan Beulich 3 years, 11 months ago
Communicating errors from p2m_set_entry() to the caller is not enough:
Neither the M2P nor the stats updates should occur in such a case.
Instead the allocated page needs to be freed again; for cleanliness
reasons also properly take into account _PGC_allocated there.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1781,7 +1781,7 @@ void p2m_mem_paging_populate(struct doma
  */
 int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l, uint64_t buffer)
 {
-    struct page_info *page;
+    struct page_info *page = NULL;
     p2m_type_t p2mt;
     p2m_access_t a;
     gfn_t gfn = _gfn(gfn_l);
@@ -1816,9 +1816,19 @@ int p2m_mem_paging_prep(struct domain *d
             goto out;
         /* Get a free page */
         ret = -ENOMEM;
-        page = alloc_domheap_page(p2m->domain, 0);
+        page = alloc_domheap_page(d, 0);
         if ( unlikely(page == NULL) )
             goto out;
+        if ( unlikely(!get_page(page, d)) )
+        {
+            /*
+             * The domain can't possibly know about this page yet, so failure
+             * here is a clear indication of something fishy going on.
+             */
+            domain_crash(d);
+            page = NULL;
+            goto out;
+        }
         mfn = page_to_mfn(page);
         page_extant = 0;
 
@@ -1832,7 +1842,6 @@ int p2m_mem_paging_prep(struct domain *d
                      "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
                      gfn_l, d->domain_id, ret);
             ret = -EFAULT;
-            put_page(page); /* Don't leak pages */
             goto out;            
         }
     }
@@ -1843,13 +1852,24 @@ int p2m_mem_paging_prep(struct domain *d
     ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
                         paging_mode_log_dirty(d) ? p2m_ram_logdirty
                                                  : p2m_ram_rw, a);
-    set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
+    if ( !ret )
+    {
+        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
 
-    if ( !page_extant )
-        atomic_dec(&d->paged_pages);
+        if ( !page_extant )
+            atomic_dec(&d->paged_pages);
+    }
 
  out:
     gfn_unlock(p2m, gfn, 0);
+
+    if ( page )
+    {
+        if ( ret )
+            put_page_alloc_ref(page);
+        put_page(page);
+    }
+
     return ret;
 }
 


Re: [PATCH v2 2/6] x86/mem-paging: correct p2m_mem_paging_prep()'s error handling
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:37:46AM +0200, Jan Beulich wrote:
> Communicating errors from p2m_set_entry() to the caller is not enough:
> Neither the M2P nor the stats updates should occur in such a case.
> Instead the allocated page needs to be freed again; for cleanliness
> reasons also properly take into account _PGC_allocated there.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> 
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1781,7 +1781,7 @@ void p2m_mem_paging_populate(struct doma
>   */
>  int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l, uint64_t buffer)
>  {
> -    struct page_info *page;
> +    struct page_info *page = NULL;
>      p2m_type_t p2mt;
>      p2m_access_t a;
>      gfn_t gfn = _gfn(gfn_l);
> @@ -1816,9 +1816,19 @@ int p2m_mem_paging_prep(struct domain *d
>              goto out;
>          /* Get a free page */
>          ret = -ENOMEM;
> -        page = alloc_domheap_page(p2m->domain, 0);
> +        page = alloc_domheap_page(d, 0);
>          if ( unlikely(page == NULL) )
>              goto out;
> +        if ( unlikely(!get_page(page, d)) )
> +        {
> +            /*
> +             * The domain can't possibly know about this page yet, so failure
> +             * here is a clear indication of something fishy going on.
> +             */
> +            domain_crash(d);
> +            page = NULL;
> +            goto out;
> +        }
>          mfn = page_to_mfn(page);
>          page_extant = 0;
>  
> @@ -1832,7 +1842,6 @@ int p2m_mem_paging_prep(struct domain *d
>                       "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
>                       gfn_l, d->domain_id, ret);
>              ret = -EFAULT;
> -            put_page(page); /* Don't leak pages */
>              goto out;            
>          }
>      }
> @@ -1843,13 +1852,24 @@ int p2m_mem_paging_prep(struct domain *d
>      ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>                          paging_mode_log_dirty(d) ? p2m_ram_logdirty
>                                                   : p2m_ram_rw, a);
> -    set_gpfn_from_mfn(mfn_x(mfn), gfn_l);

I would instead do `if ( ret ) goto out`;

And won't touch the code afterwards.

Thanks, Roger.

Re: [PATCH v2 2/6] x86/mem-paging: correct p2m_mem_paging_prep()'s error handling
Posted by Andrew Cooper 3 years, 10 months ago
On 23/04/2020 09:37, Jan Beulich wrote:
> Communicating errors from p2m_set_entry() to the caller is not enough:
> Neither the M2P nor the stats updates should occur in such a case.

"neither".  Following a colon/semicolon isn't the start of a new sentence.

> Instead the allocated page needs to be freed again; for cleanliness
> reasons also properly take into account _PGC_allocated there.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
>
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1781,7 +1781,7 @@ void p2m_mem_paging_populate(struct doma
>   */
>  int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l, uint64_t buffer)
>  {
> -    struct page_info *page;
> +    struct page_info *page = NULL;
>      p2m_type_t p2mt;
>      p2m_access_t a;
>      gfn_t gfn = _gfn(gfn_l);
> @@ -1816,9 +1816,19 @@ int p2m_mem_paging_prep(struct domain *d
>              goto out;
>          /* Get a free page */
>          ret = -ENOMEM;
> -        page = alloc_domheap_page(p2m->domain, 0);
> +        page = alloc_domheap_page(d, 0);
>          if ( unlikely(page == NULL) )
>              goto out;
> +        if ( unlikely(!get_page(page, d)) )
> +        {
> +            /*
> +             * The domain can't possibly know about this page yet, so failure
> +             * here is a clear indication of something fishy going on.
> +             */

This needs a gprintk(XENLOG_ERR, ...) of some form.  (which also reminds
me that I *still* need to finish my patch forcing everyone to provide
something to qualify the domain crash, so release builds of Xen get some
hint as to what went on or why.)

> +            domain_crash(d);
> +            page = NULL;
> +            goto out;
> +        }
>          mfn = page_to_mfn(page);
>          page_extant = 0;
>  
> @@ -1832,7 +1842,6 @@ int p2m_mem_paging_prep(struct domain *d
>                       "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
>                       gfn_l, d->domain_id, ret);
>              ret = -EFAULT;
> -            put_page(page); /* Don't leak pages */
>              goto out;            
>          }
>      }
> @@ -1843,13 +1852,24 @@ int p2m_mem_paging_prep(struct domain *d
>      ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>                          paging_mode_log_dirty(d) ? p2m_ram_logdirty
>                                                   : p2m_ram_rw, a);
> -    set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
> +    if ( !ret )
> +    {
> +        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
>  
> -    if ( !page_extant )
> -        atomic_dec(&d->paged_pages);
> +        if ( !page_extant )
> +            atomic_dec(&d->paged_pages);
> +    }
>  
>   out:
>      gfn_unlock(p2m, gfn, 0);
> +
> +    if ( page )
> +    {
> +        if ( ret )
> +            put_page_alloc_ref(page);
> +        put_page(page);

This is a very long way from clear enough to follow, and buggy if anyone
inserts a new goto out path.

~Andrew

> +    }
> +
>      return ret;
>  }
>  
>


Re: [PATCH v2 2/6] x86/mem-paging: correct p2m_mem_paging_prep()'s error handling
Posted by Jan Beulich 3 years, 10 months ago
On 15.05.2020 16:40, Andrew Cooper wrote:
> On 23/04/2020 09:37, Jan Beulich wrote:
>> @@ -1816,9 +1816,19 @@ int p2m_mem_paging_prep(struct domain *d
>>              goto out;
>>          /* Get a free page */
>>          ret = -ENOMEM;
>> -        page = alloc_domheap_page(p2m->domain, 0);
>> +        page = alloc_domheap_page(d, 0);
>>          if ( unlikely(page == NULL) )
>>              goto out;
>> +        if ( unlikely(!get_page(page, d)) )
>> +        {
>> +            /*
>> +             * The domain can't possibly know about this page yet, so failure
>> +             * here is a clear indication of something fishy going on.
>> +             */
> 
> This needs a gprintk(XENLOG_ERR, ...) of some form.  (which also reminds
> me that I *still* need to finish my patch forcing everyone to provide
> something to qualify the domain crash, so release builds of Xen get some
> hint as to what went on or why.)

First of all - I've committed the patch earlier today, with Roger's
R-b, and considering that it had been pending for quite some time.

As to the specific aspect:

>> +            domain_crash(d);

This already leaves a file/line combination as a (minimal) hint. I
can make a patch to add a gprintk() as you ask for, but I'm not
sure it's worth it for this almost dead code.

>> @@ -1843,13 +1852,24 @@ int p2m_mem_paging_prep(struct domain *d
>>      ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>                          paging_mode_log_dirty(d) ? p2m_ram_logdirty
>>                                                   : p2m_ram_rw, a);
>> -    set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
>> +    if ( !ret )
>> +    {
>> +        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
>>  
>> -    if ( !page_extant )
>> -        atomic_dec(&d->paged_pages);
>> +        if ( !page_extant )
>> +            atomic_dec(&d->paged_pages);
>> +    }
>>  
>>   out:
>>      gfn_unlock(p2m, gfn, 0);
>> +
>> +    if ( page )
>> +    {
>> +        if ( ret )
>> +            put_page_alloc_ref(page);
>> +        put_page(page);
> 
> This is a very long way from clear enough to follow, and buggy if anyone
> inserts a new goto out path.

What alternatives do you see?

Jan

Re: [PATCH v2 2/6] x86/mem-paging: correct p2m_mem_paging_prep()'s error handling
Posted by Andrew Cooper 3 years, 10 months ago
On 15/05/2020 16:15, Jan Beulich wrote:
>>> +            domain_crash(d);
> This already leaves a file/line combination as a (minimal) hint.

First, that is still tantamount to useless in logs from a user.

Second, the use of __LINE__ is why it breaks livepatching, and people
using livepatching are still carrying an out-of-tree patch to unbreak it.

> I can make a patch to add a gprintk() as you ask for, but I'm not
> sure it's worth it for this almost dead code.

"page in unexpected state" would be better than nothing, but given the
comment, it might also be better as ASSERT_UNREACHABLE(), and we now
have a lot of cases where we declare unreachable, and kill the domain in
release builds.

>
>>> @@ -1843,13 +1852,24 @@ int p2m_mem_paging_prep(struct domain *d
>>>      ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
>>>                          paging_mode_log_dirty(d) ? p2m_ram_logdirty
>>>                                                   : p2m_ram_rw, a);
>>> -    set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
>>> +    if ( !ret )
>>> +    {
>>> +        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
>>>  
>>> -    if ( !page_extant )
>>> -        atomic_dec(&d->paged_pages);
>>> +        if ( !page_extant )
>>> +            atomic_dec(&d->paged_pages);
>>> +    }
>>>  
>>>   out:
>>>      gfn_unlock(p2m, gfn, 0);
>>> +
>>> +    if ( page )
>>> +    {
>>> +        if ( ret )
>>> +            put_page_alloc_ref(page);
>>> +        put_page(page);
>> This is a very long way from clear enough to follow, and buggy if anyone
>> inserts a new goto out path.
> What alternatives do you see?

/* Fully free the page on error.  Drop our temporary reference in all
 * cases. */

would at least help someone trying to figure out what is going on here,
especially as put_page_alloc_ref() is not the obvious freeing function
for alloc_domheap_page().

~Andrew
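The exit-path pattern under discussion can be modelled with a toy refcount. This is an illustrative sketch only: `toy_page`, the stand-in for `_PGC_allocated`, and the starting count of 2 (one allocation reference plus one `get_page()` reference taken right after allocation) are assumptions mirroring the patch, not Xen's real `struct page_info` accounting.

```c
/* Toy page with two reference sources: the allocation reference
 * (stand-in for _PGC_allocated) and general references via get_page(). */
struct toy_page {
    int refs;       /* general reference count                   */
    int alloc_ref;  /* 1 while the allocation reference is held  */
};

static void toy_put_page(struct toy_page *pg) { pg->refs--; }

/* Analogue of put_page_alloc_ref(): drop the allocation reference. */
static void toy_put_page_alloc_ref(struct toy_page *pg)
{
    pg->alloc_ref = 0;
    pg->refs--;
}

/* The patch's exit path: fully free the page on error (drop the
 * allocation reference too); in all cases drop the temporary
 * reference taken right after allocation. */
static void exit_path(struct toy_page *pg, int ret)
{
    if ( pg )
    {
        if ( ret )
            toy_put_page_alloc_ref(pg);
        toy_put_page(pg);
    }
}

/* Final general refcount for a page that started with the allocation
 * reference plus one get_page() reference. */
static int final_refs(int ret)
{
    struct toy_page pg = { .refs = 2, .alloc_ref = 1 };
    exit_path(&pg, ret);
    return pg.refs;
}
```

On error the count drops to zero (page freed); on success one reference survives, now owned through the p2m entry.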

[PATCH v2 3/6] x86/mem-paging: use guest handle for XENMEM_paging_op_prep
Posted by Jan Beulich 3 years, 11 months ago
While it should have been this way from the beginning, not doing so will
become an actual problem with PVH Dom0. The interface change is binary
compatible, but requires tools side producers to be re-built.

Drop the bogus/unnecessary page alignment restriction on the input
buffer at the same time.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Use HANDLE_64() instead of HANDLE_PARAM() for function parameter.
---
Is there really no way to avoid the buffer copying in libxc?

--- a/tools/libxc/xc_mem_paging.c
+++ b/tools/libxc/xc_mem_paging.c
@@ -26,15 +26,33 @@ static int xc_mem_paging_memop(xc_interf
                                unsigned int op, uint64_t gfn, void *buffer)
 {
     xen_mem_paging_op_t mpo;
+    DECLARE_HYPERCALL_BOUNCE(buffer, XC_PAGE_SIZE,
+                             XC_HYPERCALL_BUFFER_BOUNCE_IN);
+    int rc;
 
     memset(&mpo, 0, sizeof(mpo));
 
     mpo.op      = op;
     mpo.domain  = domain_id;
     mpo.gfn     = gfn;
-    mpo.buffer  = (unsigned long) buffer;
 
-    return do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
+    if ( buffer )
+    {
+        if ( xc_hypercall_bounce_pre(xch, buffer) )
+        {
+            PERROR("Could not bounce memory for XENMEM_paging_op %u", op);
+            return -1;
+        }
+
+        set_xen_guest_handle(mpo.buffer, buffer);
+    }
+
+    rc = do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
+
+    if ( buffer )
+        xc_hypercall_bounce_post(xch, buffer);
+
+    return rc;
 }
 
 int xc_mem_paging_enable(xc_interface *xch, uint32_t domain_id,
@@ -92,28 +110,13 @@ int xc_mem_paging_prep(xc_interface *xch
 int xc_mem_paging_load(xc_interface *xch, uint32_t domain_id,
                        uint64_t gfn, void *buffer)
 {
-    int rc, old_errno;
-
     errno = EINVAL;
 
     if ( !buffer )
         return -1;
 
-    if ( ((unsigned long) buffer) & (XC_PAGE_SIZE - 1) )
-        return -1;
-
-    if ( mlock(buffer, XC_PAGE_SIZE) )
-        return -1;
-
-    rc = xc_mem_paging_memop(xch, domain_id,
-                             XENMEM_paging_op_prep,
-                             gfn, buffer);
-
-    old_errno = errno;
-    munlock(buffer, XC_PAGE_SIZE);
-    errno = old_errno;
-
-    return rc;
+    return xc_mem_paging_memop(xch, domain_id, XENMEM_paging_op_prep,
+                               gfn, buffer);
 }
 
 
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1779,7 +1779,8 @@ void p2m_mem_paging_populate(struct doma
  * mfn if populate was called for  gfn which was nominated but not evicted. In
  * this case only the p2mt needs to be forwarded.
  */
-int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l, uint64_t buffer)
+int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l,
+                        XEN_GUEST_HANDLE_64(const_uint8) buffer)
 {
     struct page_info *page = NULL;
     p2m_type_t p2mt;
@@ -1788,13 +1789,9 @@ int p2m_mem_paging_prep(struct domain *d
     mfn_t mfn;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret, page_extant = 1;
-    const void *user_ptr = (const void *) buffer;
 
-    if ( user_ptr )
-        /* Sanity check the buffer and bail out early if trouble */
-        if ( (buffer & (PAGE_SIZE - 1)) || 
-             (!access_ok(user_ptr, PAGE_SIZE)) )
-            return -EINVAL;
+    if ( !guest_handle_okay(buffer, PAGE_SIZE) )
+        return -EINVAL;
 
     gfn_lock(p2m, gfn, 0);
 
@@ -1812,7 +1809,7 @@ int p2m_mem_paging_prep(struct domain *d
 
         /* If the user did not provide a buffer, we disallow */
         ret = -EINVAL;
-        if ( unlikely(user_ptr == NULL) )
+        if ( unlikely(guest_handle_is_null(buffer)) )
             goto out;
         /* Get a free page */
         ret = -ENOMEM;
@@ -1834,7 +1831,7 @@ int p2m_mem_paging_prep(struct domain *d
 
         ASSERT( mfn_valid(mfn) );
         guest_map = map_domain_page(mfn);
-        ret = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
+        ret = copy_from_guest(guest_map, buffer, PAGE_SIZE);
         unmap_domain_page(guest_map);
         if ( ret )
         {
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -741,7 +741,8 @@ void p2m_mem_paging_drop_page(struct dom
 /* Start populating a paged out frame */
 void p2m_mem_paging_populate(struct domain *d, unsigned long gfn);
 /* Prepare the p2m for paging a frame in */
-int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer);
+int p2m_mem_paging_prep(struct domain *d, unsigned long gfn,
+                        XEN_GUEST_HANDLE_64(const_uint8) buffer);
 /* Resume normal operation (in case a domain was paused) */
 struct vm_event_st;
 void p2m_mem_paging_resume(struct domain *d, struct vm_event_st *rsp);
--- a/xen/include/public/memory.h
+++ b/xen/include/public/memory.h
@@ -396,10 +396,10 @@ struct xen_mem_paging_op {
     uint8_t     op;         /* XENMEM_paging_op_* */
     domid_t     domain;
 
-    /* PAGING_PREP IN: buffer to immediately fill page in */
-    uint64_aligned_t    buffer;
-    /* Other OPs */
-    uint64_aligned_t    gfn;           /* IN:  gfn of page being operated on */
+    /* IN: (XENMEM_paging_op_prep) buffer to immediately fill page from */
+    XEN_GUEST_HANDLE_64(const_uint8) buffer;
+    /* IN:  gfn of page being operated on */
+    uint64_aligned_t    gfn;
 };
 typedef struct xen_mem_paging_op xen_mem_paging_op_t;
 DEFINE_XEN_GUEST_HANDLE(xen_mem_paging_op_t);


Re: [PATCH v2 3/6] x86/mem-paging: use guest handle for XENMEM_paging_op_prep
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:38:18AM +0200, Jan Beulich wrote:
> While it should have been this way from the beginning, not doing so will
> become an actual problem with PVH Dom0. The interface change is binary
> compatible, but requires tools side producers to be re-built.
> 
> Drop the bogus/unnecessary page alignment restriction on the input
> buffer at the same time.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

> ---
> v2: Use HANDLE_64() instead of HANDLE_PARAM() for function parameter.
> ---
> Is there really no way to avoid the buffer copying in libxc?
> 
> --- a/tools/libxc/xc_mem_paging.c
> +++ b/tools/libxc/xc_mem_paging.c
> @@ -26,15 +26,33 @@ static int xc_mem_paging_memop(xc_interf
>                                 unsigned int op, uint64_t gfn, void *buffer)
>  {
>      xen_mem_paging_op_t mpo;
> +    DECLARE_HYPERCALL_BOUNCE(buffer, XC_PAGE_SIZE,
> +                             XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    int rc;
>  
>      memset(&mpo, 0, sizeof(mpo));
>  
>      mpo.op      = op;
>      mpo.domain  = domain_id;
>      mpo.gfn     = gfn;
> -    mpo.buffer  = (unsigned long) buffer;
>  
> -    return do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
> +    if ( buffer )
> +    {
> +        if ( xc_hypercall_bounce_pre(xch, buffer) )
> +        {
> +            PERROR("Could not bounce memory for XENMEM_paging_op %u", op);
> +            return -1;
> +        }
> +
> +        set_xen_guest_handle(mpo.buffer, buffer);
> +    }
> +
> +    rc = do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
> +
> +    if ( buffer )
> +        xc_hypercall_bounce_post(xch, buffer);
> +
> +    return rc;
>  }
>  
>  int xc_mem_paging_enable(xc_interface *xch, uint32_t domain_id,
> @@ -92,28 +110,13 @@ int xc_mem_paging_prep(xc_interface *xch
>  int xc_mem_paging_load(xc_interface *xch, uint32_t domain_id,
>                         uint64_t gfn, void *buffer)
>  {
> -    int rc, old_errno;
> -
>      errno = EINVAL;
>  
>      if ( !buffer )
>          return -1;
>  
> -    if ( ((unsigned long) buffer) & (XC_PAGE_SIZE - 1) )
> -        return -1;
> -
> -    if ( mlock(buffer, XC_PAGE_SIZE) )
> -        return -1;
> -
> -    rc = xc_mem_paging_memop(xch, domain_id,
> -                             XENMEM_paging_op_prep,
> -                             gfn, buffer);
> -
> -    old_errno = errno;
> -    munlock(buffer, XC_PAGE_SIZE);
> -    errno = old_errno;
> -
> -    return rc;
> +    return xc_mem_paging_memop(xch, domain_id, XENMEM_paging_op_prep,
> +                               gfn, buffer);

Sadly this function seems to still return -1/-errnoval, which is
weird; the same applies to xc_mem_paging_memop. Not that you should
fix it here, just noticed.

Thanks, Roger.

Re: [PATCH v2 3/6] x86/mem-paging: use guest handle for XENMEM_paging_op_prep
Posted by Wei Liu 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:38:18AM +0200, Jan Beulich wrote:
> While it should have been this way from the beginning, not doing so will
> become an actual problem with PVH Dom0. The interface change is binary
> compatible, but requires tools side producers to be re-built.
> 
> Drop the bogus/unnecessary page alignment restriction on the input
> buffer at the same time.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Wei Liu <wl@xen.org>

> ---
> v2: Use HANDLE_64() instead of HANDLE_PARAM() for function parameter.
> ---
> Is there really no way to avoid the buffer copying in libxc?

The buffer is managed by the caller. There is no good way for libxc to
know its properties as far as I can tell.

Wei.

Re: [PATCH v2 3/6] x86/mem-paging: use guest handle for XENMEM_paging_op_prep
Posted by Andrew Cooper 3 years, 10 months ago
On 23/04/2020 09:38, Jan Beulich wrote:
> While it should have been this way from the beginning, not doing so will
> become an actual problem with PVH Dom0. The interface change is binary
> compatible, but requires tools side producers to be re-built.
>
> Drop the bogus/unnecessary page alignment restriction on the input
> buffer at the same time.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> ---
> v2: Use HANDLE_64() instead of HANDLE_PARAM() for function parameter.
> ---
> Is there really no way to avoid the buffer copying in libxc?

Not currently no.

Since we now have access to regular kernel memory via mmap() on
/dev/xen/hypercall, we are now in a position where someone could hook up
implement a small memory allocator backed exclusively by mmap()'d pages,
and this would remove all bounce buffering in userspace.

~Andrew
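The bounce-in pattern that the libxc hunk relies on (and that the mmap() idea above would make unnecessary) can be sketched in plain C. Everything here is a hypothetical miniature: `toy_hypercall`, `toy_memop`, the 16-byte alignment, and `TOY_PAGE_SIZE` stand in for the real hypercall, `xc_hypercall_bounce_pre()`/`_post()`, and XC_PAGE_SIZE.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define TOY_PAGE_SIZE 64

/* Stand-in for a hypercall that reads from a suitably aligned buffer;
 * here it just checks alignment and copies the data out. */
static int toy_hypercall(const void *bounce, char *dst)
{
    if ( ((uintptr_t)bounce % 16) != 0 )
        return -1;
    memcpy(dst, bounce, TOY_PAGE_SIZE);
    return 0;
}

/* Bounce-in: stage the caller's buffer (which need not be aligned)
 * into an aligned allocation before the call, free it afterwards --
 * the shape of bounce_pre / do_memory_op / bounce_post. */
static int toy_memop(const char *user_buf, char *dst)
{
    void *bounce = aligned_alloc(16, TOY_PAGE_SIZE);
    int rc;

    if ( !bounce )
        return -1;
    memcpy(bounce, user_buf, TOY_PAGE_SIZE);   /* bounce_pre */
    rc = toy_hypercall(bounce, dst);
    free(bounce);                              /* bounce_post */
    return rc;
}

/* Round-trip a buffer through the bounce path; the caller's pointer
 * is deliberately offset so no alignment can be assumed of it. */
static int toy_roundtrip_ok(void)
{
    char src[TOY_PAGE_SIZE + 1], dst[TOY_PAGE_SIZE];

    memset(src, 'x', sizeof(src));
    return toy_memop(src + 1, dst) == 0 && dst[0] == 'x' &&
           dst[TOY_PAGE_SIZE - 1] == 'x';
}
```

This also shows why Wei's point holds: the caller's buffer properties are unknown to the library, so without kernel-backed mappings a copy is unavoidable.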

Ping: [PATCH v2 3/6] x86/mem-paging: use guest handle for XENMEM_paging_op_prep
Posted by Jan Beulich 3 years, 10 months ago
On 23.04.2020 10:38, Jan Beulich wrote:
> While it should have been this way from the beginning, not doing so will
> become an actual problem with PVH Dom0. The interface change is binary
> compatible, but requires tools side producers to be re-built.
> 
> Drop the bogus/unnecessary page alignment restriction on the input
> buffer at the same time.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

May I ask for a libxc side ack or otherwise, please?

Thanks, Jan

> --- a/tools/libxc/xc_mem_paging.c
> +++ b/tools/libxc/xc_mem_paging.c
> @@ -26,15 +26,33 @@ static int xc_mem_paging_memop(xc_interf
>                                 unsigned int op, uint64_t gfn, void *buffer)
>  {
>      xen_mem_paging_op_t mpo;
> +    DECLARE_HYPERCALL_BOUNCE(buffer, XC_PAGE_SIZE,
> +                             XC_HYPERCALL_BUFFER_BOUNCE_IN);
> +    int rc;
>  
>      memset(&mpo, 0, sizeof(mpo));
>  
>      mpo.op      = op;
>      mpo.domain  = domain_id;
>      mpo.gfn     = gfn;
> -    mpo.buffer  = (unsigned long) buffer;
>  
> -    return do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
> +    if ( buffer )
> +    {
> +        if ( xc_hypercall_bounce_pre(xch, buffer) )
> +        {
> +            PERROR("Could not bounce memory for XENMEM_paging_op %u", op);
> +            return -1;
> +        }
> +
> +        set_xen_guest_handle(mpo.buffer, buffer);
> +    }
> +
> +    rc = do_memory_op(xch, XENMEM_paging_op, &mpo, sizeof(mpo));
> +
> +    if ( buffer )
> +        xc_hypercall_bounce_post(xch, buffer);
> +
> +    return rc;
>  }
>  
>  int xc_mem_paging_enable(xc_interface *xch, uint32_t domain_id,
> @@ -92,28 +110,13 @@ int xc_mem_paging_prep(xc_interface *xch
>  int xc_mem_paging_load(xc_interface *xch, uint32_t domain_id,
>                         uint64_t gfn, void *buffer)
>  {
> -    int rc, old_errno;
> -
>      errno = EINVAL;
>  
>      if ( !buffer )
>          return -1;
>  
> -    if ( ((unsigned long) buffer) & (XC_PAGE_SIZE - 1) )
> -        return -1;
> -
> -    if ( mlock(buffer, XC_PAGE_SIZE) )
> -        return -1;
> -
> -    rc = xc_mem_paging_memop(xch, domain_id,
> -                             XENMEM_paging_op_prep,
> -                             gfn, buffer);
> -
> -    old_errno = errno;
> -    munlock(buffer, XC_PAGE_SIZE);
> -    errno = old_errno;
> -
> -    return rc;
> +    return xc_mem_paging_memop(xch, domain_id, XENMEM_paging_op_prep,
> +                               gfn, buffer);
>  }
>  
>  
> --- a/xen/arch/x86/mm/p2m.c
> +++ b/xen/arch/x86/mm/p2m.c
> @@ -1779,7 +1779,8 @@ void p2m_mem_paging_populate(struct doma
>   * mfn if populate was called for  gfn which was nominated but not evicted. In
>   * this case only the p2mt needs to be forwarded.
>   */
> -int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l, uint64_t buffer)
> +int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l,
> +                        XEN_GUEST_HANDLE_64(const_uint8) buffer)
>  {
>      struct page_info *page = NULL;
>      p2m_type_t p2mt;
> @@ -1788,13 +1789,9 @@ int p2m_mem_paging_prep(struct domain *d
>      mfn_t mfn;
>      struct p2m_domain *p2m = p2m_get_hostp2m(d);
>      int ret, page_extant = 1;
> -    const void *user_ptr = (const void *) buffer;
>  
> -    if ( user_ptr )
> -        /* Sanity check the buffer and bail out early if trouble */
> -        if ( (buffer & (PAGE_SIZE - 1)) || 
> -             (!access_ok(user_ptr, PAGE_SIZE)) )
> -            return -EINVAL;
> +    if ( !guest_handle_okay(buffer, PAGE_SIZE) )
> +        return -EINVAL;
>  
>      gfn_lock(p2m, gfn, 0);
>  
> @@ -1812,7 +1809,7 @@ int p2m_mem_paging_prep(struct domain *d
>  
>          /* If the user did not provide a buffer, we disallow */
>          ret = -EINVAL;
> -        if ( unlikely(user_ptr == NULL) )
> +        if ( unlikely(guest_handle_is_null(buffer)) )
>              goto out;
>          /* Get a free page */
>          ret = -ENOMEM;
> @@ -1834,7 +1831,7 @@ int p2m_mem_paging_prep(struct domain *d
>  
>          ASSERT( mfn_valid(mfn) );
>          guest_map = map_domain_page(mfn);
> -        ret = copy_from_user(guest_map, user_ptr, PAGE_SIZE);
> +        ret = copy_from_guest(guest_map, buffer, PAGE_SIZE);
>          unmap_domain_page(guest_map);
>          if ( ret )
>          {
> --- a/xen/include/asm-x86/p2m.h
> +++ b/xen/include/asm-x86/p2m.h
> @@ -741,7 +741,8 @@ void p2m_mem_paging_drop_page(struct dom
>  /* Start populating a paged out frame */
>  void p2m_mem_paging_populate(struct domain *d, unsigned long gfn);
>  /* Prepare the p2m for paging a frame in */
> -int p2m_mem_paging_prep(struct domain *d, unsigned long gfn, uint64_t buffer);
> +int p2m_mem_paging_prep(struct domain *d, unsigned long gfn,
> +                        XEN_GUEST_HANDLE_64(const_uint8) buffer);
>  /* Resume normal operation (in case a domain was paused) */
>  struct vm_event_st;
>  void p2m_mem_paging_resume(struct domain *d, struct vm_event_st *rsp);
> --- a/xen/include/public/memory.h
> +++ b/xen/include/public/memory.h
> @@ -396,10 +396,10 @@ struct xen_mem_paging_op {
>      uint8_t     op;         /* XENMEM_paging_op_* */
>      domid_t     domain;
>  
> -    /* PAGING_PREP IN: buffer to immediately fill page in */
> -    uint64_aligned_t    buffer;
> -    /* Other OPs */
> -    uint64_aligned_t    gfn;           /* IN:  gfn of page being operated on */
> +    /* IN: (XENMEM_paging_op_prep) buffer to immediately fill page from */
> +    XEN_GUEST_HANDLE_64(const_uint8) buffer;
> +    /* IN:  gfn of page being operated on */
> +    uint64_aligned_t    gfn;
>  };
>  typedef struct xen_mem_paging_op xen_mem_paging_op_t;
>  DEFINE_XEN_GUEST_HANDLE(xen_mem_paging_op_t);
> 
> 


[PATCH v2 4/6] x86/mem-paging: add minimal lock order enforcement to p2m_mem_paging_prep()
Posted by Jan Beulich 3 years, 11 months ago
While full checking is impossible (as the lock is being acquired/
released down the call tree), perform at least a lock level check.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1813,6 +1813,7 @@ int p2m_mem_paging_prep(struct domain *d
             goto out;
         /* Get a free page */
         ret = -ENOMEM;
+        page_alloc_mm_pre_lock(d);
         page = alloc_domheap_page(d, 0);
         if ( unlikely(page == NULL) )
             goto out;


Re: [PATCH v2 4/6] x86/mem-paging: add minimal lock order enforcement to p2m_mem_paging_prep()
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:38:44AM +0200, Jan Beulich wrote:
> While full checking is impossible (as the lock is being acquired/
> released down the call tree), perform at least a lock level check.

I'm slightly confused, doesn't alloc_domheap_page already have its
own lock order checking?

Thanks, Roger.

Re: [PATCH v2 4/6] x86/mem-paging: add minimal lock order enforcement to p2m_mem_paging_prep()
Posted by Jan Beulich 3 years, 10 months ago
On 14.05.2020 18:25, Roger Pau Monné wrote:
> On Thu, Apr 23, 2020 at 10:38:44AM +0200, Jan Beulich wrote:
>> While full checking is impossible (as the lock is being acquired/
>> released down the call tree), perform at least a lock level check.
> 
> I'm slightly confused, doesn't alloc_domheap_page already have its
> own lock order checking?

I don't see how it would, as it doesn't (and can't legitimately)
include arch/x86/mm/mm-locks.h. Also maybe this comment in the
header clarifies it:

/* Page alloc lock (per-domain)
 *
 * This is an external lock, not represented by an mm_lock_t. However,
 * pod code uses it in conjunction with the p2m lock, and expecting
 * the ordering which we enforce here.
 * The lock is not recursive. */

Jan

Re: [PATCH v2 4/6] x86/mem-paging: add minimal lock order enforcement to p2m_mem_paging_prep()
Posted by Roger Pau Monné 3 years, 10 months ago
On Fri, May 15, 2020 at 11:46:23AM +0200, Jan Beulich wrote:
> On 14.05.2020 18:25, Roger Pau Monné wrote:
> > On Thu, Apr 23, 2020 at 10:38:44AM +0200, Jan Beulich wrote:
> >> While full checking is impossible (as the lock is being acquired/
> >> released down the call tree), perform at least a lock level check.
> > 
> > I'm slightly confused, doesn't alloc_domheap_page already have its
> > own lock order checking?
> 
> I don't see how it would, as it doesn't (and can't legitimately)
> include arch/x86/mm/mm-locks.h. Also maybe this comment in the
> header clarifies it:
> 
> /* Page alloc lock (per-domain)
>  *
>  * This is an external lock, not represented by an mm_lock_t. However,
>  * pod code uses it in conjunction with the p2m lock, and expecting
>  * the ordering which we enforce here.
>  * The lock is not recursive. */


Thanks.

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

[PATCH v2 5/6] x86/mem-paging: move code to its dedicated source file
Posted by Jan Beulich 3 years, 11 months ago
Do a little bit of style adjustment along the way, and drop the
"p2m_mem_paging_" prefixes from the now static functions.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm/mem_paging.c
+++ b/xen/arch/x86/mm/mem_paging.c
@@ -25,6 +25,421 @@
 #include <xen/vm_event.h>
 #include <xsm/xsm.h>
 
+#include "mm-locks.h"
+
+/*
+ * p2m_mem_paging_drop_page - Tell pager to drop its reference to a paged page
+ * @d: guest domain
+ * @gfn: guest page to drop
+ *
+ * p2m_mem_paging_drop_page() will notify the pager that a paged-out gfn was
+ * released by the guest. The pager is supposed to drop its reference of the
+ * gfn.
+ */
+void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn,
+                              p2m_type_t p2mt)
+{
+    vm_event_request_t req = {
+        .reason = VM_EVENT_REASON_MEM_PAGING,
+        .u.mem_paging.gfn = gfn
+    };
+
+    /*
+     * We allow no ring in this unique case, because it won't affect
+     * correctness of the guest execution at this point.  If this is the only
+     * page that happens to be paged-out, we'll be okay..  but it's likely the
+     * guest will crash shortly anyways.
+     */
+    int rc = vm_event_claim_slot(d, d->vm_event_paging);
+
+    if ( rc < 0 )
+        return;
+
+    /* Send release notification to pager */
+    req.u.mem_paging.flags = MEM_PAGING_DROP_PAGE;
+
+    /* Update stats unless the page hasn't yet been evicted */
+    if ( p2mt != p2m_ram_paging_out )
+        atomic_dec(&d->paged_pages);
+    else
+        /* Evict will fail now, tag this request for pager */
+        req.u.mem_paging.flags |= MEM_PAGING_EVICT_FAIL;
+
+    vm_event_put_request(d, d->vm_event_paging, &req);
+}
+
+/*
+ * p2m_mem_paging_populate - Tell pager to populate a paged page
+ * @d: guest domain
+ * @gfn: guest page in paging state
+ *
+ * p2m_mem_paging_populate() will notify the pager that a page in any of the
+ * paging states needs to be written back into the guest.
+ * This function needs to be called whenever gfn_to_mfn() returns any of the p2m
+ * paging types because the gfn may not be backed by a mfn.
+ *
+ * The gfn can be in any of the paging states, but the pager needs only be
+ * notified when the gfn is in the paging-out path (paging_out or paged).  This
+ * function may be called more than once from several vcpus. If the vcpu belongs
+ * to the guest, the vcpu must be stopped and the pager notified that the vcpu
+ * was stopped. The pager needs to handle several requests for the same gfn.
+ *
+ * If the gfn is not in the paging-out path and the vcpu does not belong to the
+ * guest, nothing needs to be done and the function assumes that a request was
+ * already sent to the pager. In this case the caller has to try again until the
+ * gfn is fully paged in again.
+ */
+void p2m_mem_paging_populate(struct domain *d, unsigned long gfn_l)
+{
+    struct vcpu *v = current;
+    vm_event_request_t req = {
+        .reason = VM_EVENT_REASON_MEM_PAGING,
+        .u.mem_paging.gfn = gfn_l
+    };
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    gfn_t gfn = _gfn(gfn_l);
+    mfn_t mfn;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int rc = vm_event_claim_slot(d, d->vm_event_paging);
+
+    /* We're paging. There should be a ring. */
+    if ( rc == -EOPNOTSUPP )
+    {
+        gdprintk(XENLOG_ERR, "Dom%d paging gfn %lx yet no ring in place\n",
+                 d->domain_id, gfn_l);
+        /* Prevent the vcpu from faulting repeatedly on the same gfn */
+        if ( v->domain == d )
+            vcpu_pause_nosync(v);
+        domain_crash(d);
+        return;
+    }
+    else if ( rc < 0 )
+        return;
+
+    /* Fix p2m mapping */
+    gfn_lock(p2m, gfn, 0);
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
+    /* Allow only nominated or evicted pages to enter page-in path */
+    if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
+    {
+        /* Evict will fail now, tag this request for pager */
+        if ( p2mt == p2m_ram_paging_out )
+            req.u.mem_paging.flags |= MEM_PAGING_EVICT_FAIL;
+
+        rc = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a);
+    }
+    gfn_unlock(p2m, gfn, 0);
+    if ( rc < 0 )
+        goto out_cancel;
+
+    /* Pause domain if request came from guest and gfn has paging type */
+    if ( p2m_is_paging(p2mt) && v->domain == d )
+    {
+        vm_event_vcpu_pause(v);
+        req.flags |= VM_EVENT_FLAG_VCPU_PAUSED;
+    }
+    /* No need to inform pager if the gfn is not in the page-out path */
+    else if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
+    {
+        /* gfn is already on its way back and vcpu is not paused */
+    out_cancel:
+        vm_event_cancel_slot(d, d->vm_event_paging);
+        return;
+    }
+
+    /* Send request to pager */
+    req.u.mem_paging.p2mt = p2mt;
+    req.vcpu_id = v->vcpu_id;
+
+    vm_event_put_request(d, d->vm_event_paging, &req);
+}
+
+/*
+ * p2m_mem_paging_resume - Resume guest gfn
+ * @d: guest domain
+ * @rsp: vm_event response received
+ *
+ * p2m_mem_paging_resume() will forward the p2mt of a gfn to ram_rw. It is
+ * called by the pager.
+ *
+ * The gfn was previously either evicted and populated, or nominated and
+ * populated. If the page was evicted the p2mt will be p2m_ram_paging_in. If
+ * the page was just nominated the p2mt will be p2m_ram_paging_in_start because
+ * the pager did not call prepare().
+ *
+ * If the gfn was dropped the vcpu needs to be unpaused.
+ */
+void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
+{
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    mfn_t mfn;
+
+    /* Fix p2m entry if the page was not dropped */
+    if ( !(rsp->u.mem_paging.flags & MEM_PAGING_DROP_PAGE) )
+    {
+        gfn_t gfn = _gfn(rsp->u.mem_access.gfn);
+
+        gfn_lock(p2m, gfn, 0);
+        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
+        /*
+         * Allow only pages which were prepared properly, or pages which
+         * were nominated but not evicted.
+         */
+        if ( mfn_valid(mfn) && (p2mt == p2m_ram_paging_in) )
+        {
+            int rc = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
+                                   paging_mode_log_dirty(d) ? p2m_ram_logdirty
+                                                            : p2m_ram_rw, a);
+
+            if ( !rc )
+                set_gpfn_from_mfn(mfn_x(mfn), gfn_x(gfn));
+        }
+        gfn_unlock(p2m, gfn, 0);
+    }
+}
+
+/*
+ * nominate - Mark a guest page as to-be-paged-out
+ * @d: guest domain
+ * @gfn: guest page to nominate
+ *
+ * Returns 0 for success or negative errno values if gfn is not pageable.
+ *
+ * nominate() is called by the pager and checks if a guest page can be paged
+ * out. If the following conditions are met the p2mt will be changed:
+ * - the gfn is backed by a mfn
+ * - the p2mt of the gfn is pageable
+ * - the mfn is not used for IO
+ * - the mfn has exactly one user and has no special meaning
+ *
+ * Once the p2mt is changed the page is readonly for the guest.  On success the
+ * pager can write the page contents to disk and later evict the page.
+ */
+static int nominate(struct domain *d, unsigned long gfn_l)
+{
+    struct page_info *page;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    gfn_t gfn = _gfn(gfn_l);
+    mfn_t mfn;
+    int ret = -EBUSY;
+
+    gfn_lock(p2m, gfn, 0);
+
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
+
+    /* Check if mfn is valid */
+    if ( !mfn_valid(mfn) )
+        goto out;
+
+    /* Check p2m type */
+    if ( !p2m_is_pageable(p2mt) )
+        goto out;
+
+    /* Check for io memory page */
+    if ( is_iomem_page(mfn) )
+        goto out;
+
+    /* Check page count and type */
+    page = mfn_to_page(mfn);
+    if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
+         (1 | PGC_allocated) )
+        goto out;
+
+    if ( (page->u.inuse.type_info & PGT_count_mask) != 0 )
+        goto out;
+
+    /* Fix p2m entry */
+    ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_out, a);
+
+ out:
+    gfn_unlock(p2m, gfn, 0);
+    return ret;
+}
+
+/*
+ * evict - Mark a guest page as paged-out
+ * @d: guest domain
+ * @gfn: guest page to evict
+ *
+ * Returns 0 for success or negative errno values if eviction is not possible.
+ *
+ * evict() is called by the pager and will free a guest page and release it
+ * back to Xen. If the following conditions are met the page can be freed:
+ * - the gfn is backed by a mfn
+ * - the gfn was nominated
+ * - the mfn has still exactly one user and has no special meaning
+ *
+ * After successful nomination some other process could have mapped the page. In
+ * this case eviction can not be done. If the gfn was populated before the pager
+ * could evict it, eviction can not be done either. In this case the gfn is
+ * still backed by a mfn.
+ */
+static int evict(struct domain *d, unsigned long gfn_l)
+{
+    struct page_info *page;
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    gfn_t gfn = _gfn(gfn_l);
+    mfn_t mfn;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int ret = -EBUSY;
+
+    gfn_lock(p2m, gfn, 0);
+
+    /* Get mfn */
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
+    if ( unlikely(!mfn_valid(mfn)) )
+        goto out;
+
+    /* Allow only nominated pages */
+    if ( p2mt != p2m_ram_paging_out )
+        goto out;
+
+    /* Get the page so it doesn't get modified under Xen's feet */
+    page = mfn_to_page(mfn);
+    if ( unlikely(!get_page(page, d)) )
+        goto out;
+
+    /* Check page count and type once more */
+    if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
+         (2 | PGC_allocated) )
+        goto out_put;
+
+    if ( (page->u.inuse.type_info & PGT_count_mask) != 0 )
+        goto out_put;
+
+    /* Decrement guest domain's ref count of the page */
+    put_page_alloc_ref(page);
+
+    /* Remove mapping from p2m table */
+    ret = p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_4K,
+                        p2m_ram_paged, a);
+
+    /* Clear content before returning the page to Xen */
+    scrub_one_page(page);
+
+    /* Track number of paged gfns */
+    atomic_inc(&d->paged_pages);
+
+ out_put:
+    /* Put the page back so it gets freed */
+    put_page(page);
+
+ out:
+    gfn_unlock(p2m, gfn, 0);
+    return ret;
+}
+
+/*
+ * prepare - Allocate a new page for the guest
+ * @d: guest domain
+ * @gfn: guest page in paging state
+ *
+ * prepare() will allocate a new page for the guest if the gfn is not backed
+ * by a mfn. It is called by the pager.
+ * It is required that the gfn was already populated. The gfn may already have a
+ * mfn if populate was called for  gfn which was nominated but not evicted. In
+ * this case only the p2mt needs to be forwarded.
+ */
+static int prepare(struct domain *d, unsigned long gfn_l,
+                   XEN_GUEST_HANDLE_64(const_uint8) buffer)
+{
+    struct page_info *page = NULL;
+    p2m_type_t p2mt;
+    p2m_access_t a;
+    gfn_t gfn = _gfn(gfn_l);
+    mfn_t mfn;
+    struct p2m_domain *p2m = p2m_get_hostp2m(d);
+    int ret, page_extant = 1;
+
+    if ( !guest_handle_okay(buffer, PAGE_SIZE) )
+        return -EINVAL;
+
+    gfn_lock(p2m, gfn, 0);
+
+    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
+
+    ret = -ENOENT;
+    /* Allow missing pages */
+    if ( (p2mt != p2m_ram_paging_in) && (p2mt != p2m_ram_paged) )
+        goto out;
+
+    /* Allocate a page if the gfn does not have one yet */
+    if ( !mfn_valid(mfn) )
+    {
+        void *guest_map;
+
+        /* If the user did not provide a buffer, we disallow */
+        ret = -EINVAL;
+        if ( unlikely(guest_handle_is_null(buffer)) )
+            goto out;
+        /* Get a free page */
+        ret = -ENOMEM;
+        page_alloc_mm_pre_lock(d);
+        page = alloc_domheap_page(d, 0);
+        if ( unlikely(page == NULL) )
+            goto out;
+        if ( unlikely(!get_page(page, d)) )
+        {
+            /*
+             * The domain can't possibly know about this page yet, so failure
+             * here is a clear indication of something fishy going on.
+             */
+            domain_crash(d);
+            page = NULL;
+            goto out;
+        }
+        mfn = page_to_mfn(page);
+        page_extant = 0;
+
+        ASSERT( mfn_valid(mfn) );
+        guest_map = map_domain_page(mfn);
+        ret = copy_from_guest(guest_map, buffer, PAGE_SIZE);
+        unmap_domain_page(guest_map);
+        if ( ret )
+        {
+            gdprintk(XENLOG_ERR,
+                     "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
+                     gfn_l, d->domain_id, ret);
+            ret = -EFAULT;
+            goto out;
+        }
+    }
+
+    /*
+     * Make the page already guest-accessible. If the pager still has a
+     * pending resume operation, it will be idempotent p2m entry-wise, but
+     * will unpause the vcpu.
+     */
+    ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
+                        paging_mode_log_dirty(d) ? p2m_ram_logdirty
+                                                 : p2m_ram_rw, a);
+    if ( !ret )
+    {
+        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
+
+        if ( !page_extant )
+            atomic_dec(&d->paged_pages);
+    }
+
+ out:
+    gfn_unlock(p2m, gfn, 0);
+
+    if ( page )
+    {
+        if ( ret )
+            put_page_alloc_ref(page);
+        put_page(page);
+    }
+
+    return ret;
+}
+
 int mem_paging_memop(XEN_GUEST_HANDLE_PARAM(xen_mem_paging_op_t) arg)
 {
     int rc;
@@ -50,15 +465,15 @@ int mem_paging_memop(XEN_GUEST_HANDLE_PA
     switch( mpo.op )
     {
     case XENMEM_paging_op_nominate:
-        rc = p2m_mem_paging_nominate(d, mpo.gfn);
+        rc = nominate(d, mpo.gfn);
         break;
 
     case XENMEM_paging_op_evict:
-        rc = p2m_mem_paging_evict(d, mpo.gfn);
+        rc = evict(d, mpo.gfn);
         break;
 
     case XENMEM_paging_op_prep:
-        rc = p2m_mem_paging_prep(d, mpo.gfn, mpo.buffer);
+        rc = prepare(d, mpo.gfn, mpo.buffer);
         if ( !rc )
             copyback = 1;
         break;
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -23,7 +23,6 @@
  * along with this program; If not, see <http://www.gnu.org/licenses/>.
  */
 
-#include <xen/guest_access.h> /* copy_from_guest() */
 #include <xen/iommu.h>
 #include <xen/mem_access.h>
 #include <xen/vm_event.h>
@@ -1506,418 +1505,6 @@ int set_shared_p2m_entry(struct domain *
     return rc;
 }
 
-/**
- * p2m_mem_paging_nominate - Mark a guest page as to-be-paged-out
- * @d: guest domain
- * @gfn: guest page to nominate
- *
- * Returns 0 for success or negative errno values if gfn is not pageable.
- *
- * p2m_mem_paging_nominate() is called by the pager and checks if a guest page
- * can be paged out. If the following conditions are met the p2mt will be
- * changed:
- * - the gfn is backed by a mfn
- * - the p2mt of the gfn is pageable
- * - the mfn is not used for IO
- * - the mfn has exactly one user and has no special meaning
- *
- * Once the p2mt is changed the page is readonly for the guest.  On success the
- * pager can write the page contents to disk and later evict the page.
- */
-int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn_l)
-{
-    struct page_info *page;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    p2m_type_t p2mt;
-    p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
-    mfn_t mfn;
-    int ret = -EBUSY;
-
-    gfn_lock(p2m, gfn, 0);
-
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
-
-    /* Check if mfn is valid */
-    if ( !mfn_valid(mfn) )
-        goto out;
-
-    /* Check p2m type */
-    if ( !p2m_is_pageable(p2mt) )
-        goto out;
-
-    /* Check for io memory page */
-    if ( is_iomem_page(mfn) )
-        goto out;
-
-    /* Check page count and type */
-    page = mfn_to_page(mfn);
-    if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
-         (1 | PGC_allocated) )
-        goto out;
-
-    if ( (page->u.inuse.type_info & PGT_count_mask) != 0 )
-        goto out;
-
-    /* Fix p2m entry */
-    ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_out, a);
-
- out:
-    gfn_unlock(p2m, gfn, 0);
-    return ret;
-}
-
-/**
- * p2m_mem_paging_evict - Mark a guest page as paged-out
- * @d: guest domain
- * @gfn: guest page to evict
- *
- * Returns 0 for success or negative errno values if eviction is not possible.
- *
- * p2m_mem_paging_evict() is called by the pager and will free a guest page and
- * release it back to Xen. If the following conditions are met the page can be
- * freed:
- * - the gfn is backed by a mfn
- * - the gfn was nominated
- * - the mfn has still exactly one user and has no special meaning
- *
- * After successful nomination some other process could have mapped the page. In
- * this case eviction can not be done. If the gfn was populated before the pager
- * could evict it, eviction can not be done either. In this case the gfn is
- * still backed by a mfn.
- */
-int p2m_mem_paging_evict(struct domain *d, unsigned long gfn_l)
-{
-    struct page_info *page;
-    p2m_type_t p2mt;
-    p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
-    mfn_t mfn;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    int ret = -EBUSY;
-
-    gfn_lock(p2m, gfn, 0);
-
-    /* Get mfn */
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
-    if ( unlikely(!mfn_valid(mfn)) )
-        goto out;
-
-    /* Allow only nominated pages */
-    if ( p2mt != p2m_ram_paging_out )
-        goto out;
-
-    /* Get the page so it doesn't get modified under Xen's feet */
-    page = mfn_to_page(mfn);
-    if ( unlikely(!get_page(page, d)) )
-        goto out;
-
-    /* Check page count and type once more */
-    if ( (page->count_info & (PGC_count_mask | PGC_allocated)) !=
-         (2 | PGC_allocated) )
-        goto out_put;
-
-    if ( (page->u.inuse.type_info & PGT_count_mask) != 0 )
-        goto out_put;
-
-    /* Decrement guest domain's ref count of the page */
-    put_page_alloc_ref(page);
-
-    /* Remove mapping from p2m table */
-    ret = p2m_set_entry(p2m, gfn, INVALID_MFN, PAGE_ORDER_4K,
-                        p2m_ram_paged, a);
-
-    /* Clear content before returning the page to Xen */
-    scrub_one_page(page);
-
-    /* Track number of paged gfns */
-    atomic_inc(&d->paged_pages);
-
- out_put:
-    /* Put the page back so it gets freed */
-    put_page(page);
-
- out:
-    gfn_unlock(p2m, gfn, 0);
-    return ret;
-}
-
-/**
- * p2m_mem_paging_drop_page - Tell pager to drop its reference to a paged page
- * @d: guest domain
- * @gfn: guest page to drop
- *
- * p2m_mem_paging_drop_page() will notify the pager that a paged-out gfn was
- * released by the guest. The pager is supposed to drop its reference of the
- * gfn.
- */
-void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn,
-                                p2m_type_t p2mt)
-{
-    vm_event_request_t req = {
-        .reason = VM_EVENT_REASON_MEM_PAGING,
-        .u.mem_paging.gfn = gfn
-    };
-
-    /* We allow no ring in this unique case, because it won't affect
-     * correctness of the guest execution at this point.  If this is the only
-     * page that happens to be paged-out, we'll be okay..  but it's likely the
-     * guest will crash shortly anyways. */
-    int rc = vm_event_claim_slot(d, d->vm_event_paging);
-    if ( rc < 0 )
-        return;
-
-    /* Send release notification to pager */
-    req.u.mem_paging.flags = MEM_PAGING_DROP_PAGE;
-
-    /* Update stats unless the page hasn't yet been evicted */
-    if ( p2mt != p2m_ram_paging_out )
-        atomic_dec(&d->paged_pages);
-    else
-        /* Evict will fail now, tag this request for pager */
-        req.u.mem_paging.flags |= MEM_PAGING_EVICT_FAIL;
-
-    vm_event_put_request(d, d->vm_event_paging, &req);
-}
-
-/**
- * p2m_mem_paging_populate - Tell pager to populate a paged page
- * @d: guest domain
- * @gfn: guest page in paging state
- *
- * p2m_mem_paging_populate() will notify the pager that a page in any of the
- * paging states needs to be written back into the guest.
- * This function needs to be called whenever gfn_to_mfn() returns any of the p2m
- * paging types because the gfn may not be backed by a mfn.
- *
- * The gfn can be in any of the paging states, but the pager needs only be
- * notified when the gfn is in the paging-out path (paging_out or paged).  This
- * function may be called more than once from several vcpus. If the vcpu belongs
- * to the guest, the vcpu must be stopped and the pager notified that the vcpu
- * was stopped. The pager needs to handle several requests for the same gfn.
- *
- * If the gfn is not in the paging-out path and the vcpu does not belong to the
- * guest, nothing needs to be done and the function assumes that a request was
- * already sent to the pager. In this case the caller has to try again until the
- * gfn is fully paged in again.
- */
-void p2m_mem_paging_populate(struct domain *d, unsigned long gfn_l)
-{
-    struct vcpu *v = current;
-    vm_event_request_t req = {
-        .reason = VM_EVENT_REASON_MEM_PAGING,
-        .u.mem_paging.gfn = gfn_l
-    };
-    p2m_type_t p2mt;
-    p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
-    mfn_t mfn;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-
-    /* We're paging. There should be a ring */
-    int rc = vm_event_claim_slot(d, d->vm_event_paging);
-
-    if ( rc == -EOPNOTSUPP )
-    {
-        gdprintk(XENLOG_ERR, "Domain %hu paging gfn %lx yet no ring "
-                             "in place\n", d->domain_id, gfn_l);
-        /* Prevent the vcpu from faulting repeatedly on the same gfn */
-        if ( v->domain == d )
-            vcpu_pause_nosync(v);
-        domain_crash(d);
-        return;
-    }
-    else if ( rc < 0 )
-        return;
-
-    /* Fix p2m mapping */
-    gfn_lock(p2m, gfn, 0);
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
-    /* Allow only nominated or evicted pages to enter page-in path */
-    if ( p2mt == p2m_ram_paging_out || p2mt == p2m_ram_paged )
-    {
-        /* Evict will fail now, tag this request for pager */
-        if ( p2mt == p2m_ram_paging_out )
-            req.u.mem_paging.flags |= MEM_PAGING_EVICT_FAIL;
-
-        rc = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K, p2m_ram_paging_in, a);
-    }
-    gfn_unlock(p2m, gfn, 0);
-    if ( rc < 0 )
-        goto out_cancel;
-
-    /* Pause domain if request came from guest and gfn has paging type */
-    if ( p2m_is_paging(p2mt) && v->domain == d )
-    {
-        vm_event_vcpu_pause(v);
-        req.flags |= VM_EVENT_FLAG_VCPU_PAUSED;
-    }
-    /* No need to inform pager if the gfn is not in the page-out path */
-    else if ( p2mt != p2m_ram_paging_out && p2mt != p2m_ram_paged )
-    {
-        /* gfn is already on its way back and vcpu is not paused */
-    out_cancel:
-        vm_event_cancel_slot(d, d->vm_event_paging);
-        return;
-    }
-
-    /* Send request to pager */
-    req.u.mem_paging.p2mt = p2mt;
-    req.vcpu_id = v->vcpu_id;
-
-    vm_event_put_request(d, d->vm_event_paging, &req);
-}
-
-/**
- * p2m_mem_paging_prep - Allocate a new page for the guest
- * @d: guest domain
- * @gfn: guest page in paging state
- *
- * p2m_mem_paging_prep() will allocate a new page for the guest if the gfn is
- * not backed by a mfn. It is called by the pager.
- * It is required that the gfn was already populated. The gfn may already have a
- * mfn if populate was called for  gfn which was nominated but not evicted. In
- * this case only the p2mt needs to be forwarded.
- */
-int p2m_mem_paging_prep(struct domain *d, unsigned long gfn_l,
-                        XEN_GUEST_HANDLE_64(const_uint8) buffer)
-{
-    struct page_info *page = NULL;
-    p2m_type_t p2mt;
-    p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
-    mfn_t mfn;
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    int ret, page_extant = 1;
-
-    if ( !guest_handle_okay(buffer, PAGE_SIZE) )
-        return -EINVAL;
-
-    gfn_lock(p2m, gfn, 0);
-
-    mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
-
-    ret = -ENOENT;
-    /* Allow missing pages */
-    if ( (p2mt != p2m_ram_paging_in) && (p2mt != p2m_ram_paged) )
-        goto out;
-
-    /* Allocate a page if the gfn does not have one yet */
-    if ( !mfn_valid(mfn) )
-    {
-        void *guest_map;
-
-        /* If the user did not provide a buffer, we disallow */
-        ret = -EINVAL;
-        if ( unlikely(guest_handle_is_null(buffer)) )
-            goto out;
-        /* Get a free page */
-        ret = -ENOMEM;
-        page_alloc_mm_pre_lock(d);
-        page = alloc_domheap_page(d, 0);
-        if ( unlikely(page == NULL) )
-            goto out;
-        if ( unlikely(!get_page(page, d)) )
-        {
-            /*
-             * The domain can't possibly know about this page yet, so failure
-             * here is a clear indication of something fishy going on.
-             */
-            domain_crash(d);
-            page = NULL;
-            goto out;
-        }
-        mfn = page_to_mfn(page);
-        page_extant = 0;
-
-        ASSERT( mfn_valid(mfn) );
-        guest_map = map_domain_page(mfn);
-        ret = copy_from_guest(guest_map, buffer, PAGE_SIZE);
-        unmap_domain_page(guest_map);
-        if ( ret )
-        {
-            gdprintk(XENLOG_ERR,
-                     "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
-                     gfn_l, d->domain_id, ret);
-            ret = -EFAULT;
-            goto out;            
-        }
-    }
-
-    /* Make the page already guest-accessible. If the pager still has a
-     * pending resume operation, it will be idempotent p2m entry-wise,
-     * but will unpause the vcpu */
-    ret = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                        paging_mode_log_dirty(d) ? p2m_ram_logdirty
-                                                 : p2m_ram_rw, a);
-    if ( !ret )
-    {
-        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
-
-        if ( !page_extant )
-            atomic_dec(&d->paged_pages);
-    }
-
- out:
-    gfn_unlock(p2m, gfn, 0);
-
-    if ( page )
-    {
-        if ( ret )
-            put_page_alloc_ref(page);
-        put_page(page);
-    }
-
-    return ret;
-}
-
-/**
- * p2m_mem_paging_resume - Resume guest gfn
- * @d: guest domain
- * @rsp: vm_event response received
- *
- * p2m_mem_paging_resume() will forward the p2mt of a gfn to ram_rw. It is
- * called by the pager.
- *
- * The gfn was previously either evicted and populated, or nominated and
- * populated. If the page was evicted the p2mt will be p2m_ram_paging_in. If
- * the page was just nominated the p2mt will be p2m_ram_paging_in_start because
- * the pager did not call p2m_mem_paging_prep().
- *
- * If the gfn was dropped the vcpu needs to be unpaused.
- */
-
-void p2m_mem_paging_resume(struct domain *d, vm_event_response_t *rsp)
-{
-    struct p2m_domain *p2m = p2m_get_hostp2m(d);
-    p2m_type_t p2mt;
-    p2m_access_t a;
-    mfn_t mfn;
-
-    /* Fix p2m entry if the page was not dropped */
-    if ( !(rsp->u.mem_paging.flags & MEM_PAGING_DROP_PAGE) )
-    {
-        gfn_t gfn = _gfn(rsp->u.mem_access.gfn);
-
-        gfn_lock(p2m, gfn, 0);
-        mfn = p2m->get_entry(p2m, gfn, &p2mt, &a, 0, NULL, NULL);
-        /*
-         * Allow only pages which were prepared properly, or pages which
-         * were nominated but not evicted.
-         */
-        if ( mfn_valid(mfn) && (p2mt == p2m_ram_paging_in) )
-        {
-            int rc = p2m_set_entry(p2m, gfn, mfn, PAGE_ORDER_4K,
-                                   paging_mode_log_dirty(d) ? p2m_ram_logdirty :
-                                   p2m_ram_rw, a);
-
-            if ( !rc )
-                set_gpfn_from_mfn(mfn_x(mfn), gfn_x(gfn));
-        }
-        gfn_unlock(p2m, gfn, 0);
-    }
-}
-
 #ifdef CONFIG_HVM
 static struct p2m_domain *
 p2m_getlru_nestedp2m(struct domain *d, struct p2m_domain *p2m)
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -731,18 +731,11 @@ static inline void p2m_pod_init(struct p
 /* Modify p2m table for shared gfn */
 int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 
-/* Check if a nominated gfn is valid to be paged out */
-int p2m_mem_paging_nominate(struct domain *d, unsigned long gfn);
-/* Evict a frame */
-int p2m_mem_paging_evict(struct domain *d, unsigned long gfn);
 /* Tell xenpaging to drop a paged out frame */
 void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn, 
                                 p2m_type_t p2mt);
 /* Start populating a paged out frame */
 void p2m_mem_paging_populate(struct domain *d, unsigned long gfn);
-/* Prepare the p2m for paging a frame in */
-int p2m_mem_paging_prep(struct domain *d, unsigned long gfn,
-                        XEN_GUEST_HANDLE_64(const_uint8) buffer);
 /* Resume normal operation (in case a domain was paused) */
 struct vm_event_st;
 void p2m_mem_paging_resume(struct domain *d, struct vm_event_st *rsp);


Re: [PATCH v2 5/6] x86/mem-paging: move code to its dedicated source file
Posted by Andrew Cooper 3 years, 10 months ago
On 23/04/2020 09:39, Jan Beulich wrote:
> Do a little bit of style adjustment along the way, and drop the
> "p2m_mem_paging_" prefixes from the now static functions.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Re: [PATCH v2 5/6] x86/mem-paging: move code to its dedicated source file
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:39:17AM +0200, Jan Beulich wrote:
> Do a little bit of style adjustment along the way, and drop the
> "p2m_mem_paging_" prefixes from the now static functions.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.

[PATCH v2 6/6] x86/mem-paging: consistently use gfn_t
Posted by Jan Beulich 3 years, 11 months ago
Where gdprintk()s get touched anyway to switch to PRI_gfn, also switch
to %pd for the domain logged.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v2: Use gfn_t for local var in mod_l1_entry(). Use PRI_gfn when printk()
    arg is gfn_x(). Re-base.

--- a/xen/arch/x86/hvm/dm.c
+++ b/xen/arch/x86/hvm/dm.c
@@ -278,7 +278,7 @@ static int set_mem_type(struct domain *d
         if ( p2m_is_paging(t) )
         {
             put_gfn(d, pfn);
-            p2m_mem_paging_populate(d, pfn);
+            p2m_mem_paging_populate(d, _gfn(pfn));
             return -EAGAIN;
         }
 
--- a/xen/arch/x86/hvm/hvm.c
+++ b/xen/arch/x86/hvm/hvm.c
@@ -1988,7 +1988,7 @@ int hvm_hap_nested_page_fault(paddr_t gp
      * locks in such circumstance.
      */
     if ( paged )
-        p2m_mem_paging_populate(currd, gfn);
+        p2m_mem_paging_populate(currd, _gfn(gfn));
 
     if ( sharing_enomem )
     {
@@ -3222,7 +3222,7 @@ enum hvm_translation_result hvm_translat
     if ( p2m_is_paging(p2mt) )
     {
         put_page(page);
-        p2m_mem_paging_populate(v->domain, gfn_x(gfn));
+        p2m_mem_paging_populate(v->domain, gfn);
         return HVMTRANS_gfn_paged_out;
     }
     if ( p2m_is_shared(p2mt) )
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2120,16 +2120,17 @@ static int mod_l1_entry(l1_pgentry_t *pl
              paging_mode_translate(pg_dom) )
         {
             p2m_type_t p2mt;
+            gfn_t gfn = _gfn(l1e_get_pfn(nl1e));
             p2m_query_t q = l1e_get_flags(nl1e) & _PAGE_RW ?
                             P2M_ALLOC | P2M_UNSHARE : P2M_ALLOC;
 
-            page = get_page_from_gfn(pg_dom, l1e_get_pfn(nl1e), &p2mt, q);
+            page = get_page_from_gfn(pg_dom, gfn_x(gfn), &p2mt, q);
 
             if ( p2m_is_paged(p2mt) )
             {
                 if ( page )
                     put_page(page);
-                p2m_mem_paging_populate(pg_dom, l1e_get_pfn(nl1e));
+                p2m_mem_paging_populate(pg_dom, gfn);
                 return -ENOENT;
             }
 
@@ -3951,7 +3952,7 @@ long do_mmu_update(
                     put_page(page);
                 if ( p2m_is_paged(p2mt) )
                 {
-                    p2m_mem_paging_populate(pt_owner, gmfn);
+                    p2m_mem_paging_populate(pt_owner, _gfn(gmfn));
                     rc = -ENOENT;
                 }
                 else
--- a/xen/arch/x86/mm/hap/guest_walk.c
+++ b/xen/arch/x86/mm/hap/guest_walk.c
@@ -68,7 +68,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
         *pfec = PFEC_page_paged;
         if ( top_page )
             put_page(top_page);
-        p2m_mem_paging_populate(p2m->domain, cr3 >> PAGE_SHIFT);
+        p2m_mem_paging_populate(p2m->domain, _gfn(PFN_DOWN(cr3)));
         return gfn_x(INVALID_GFN);
     }
     if ( p2m_is_shared(p2mt) )
@@ -110,7 +110,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
         {
             ASSERT(p2m_is_hostp2m(p2m));
             *pfec = PFEC_page_paged;
-            p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
+            p2m_mem_paging_populate(p2m->domain, gfn);
             return gfn_x(INVALID_GFN);
         }
         if ( p2m_is_shared(p2mt) )
--- a/xen/arch/x86/mm/mem_paging.c
+++ b/xen/arch/x86/mm/mem_paging.c
@@ -36,12 +36,11 @@
  * released by the guest. The pager is supposed to drop its reference of the
  * gfn.
  */
-void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn,
-                              p2m_type_t p2mt)
+void p2m_mem_paging_drop_page(struct domain *d, gfn_t gfn, p2m_type_t p2mt)
 {
     vm_event_request_t req = {
         .reason = VM_EVENT_REASON_MEM_PAGING,
-        .u.mem_paging.gfn = gfn
+        .u.mem_paging.gfn = gfn_x(gfn)
     };
 
     /*
@@ -89,16 +88,15 @@ void p2m_mem_paging_drop_page(struct dom
  * already sent to the pager. In this case the caller has to try again until the
  * gfn is fully paged in again.
  */
-void p2m_mem_paging_populate(struct domain *d, unsigned long gfn_l)
+void p2m_mem_paging_populate(struct domain *d, gfn_t gfn)
 {
     struct vcpu *v = current;
     vm_event_request_t req = {
         .reason = VM_EVENT_REASON_MEM_PAGING,
-        .u.mem_paging.gfn = gfn_l
+        .u.mem_paging.gfn = gfn_x(gfn)
     };
     p2m_type_t p2mt;
     p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
     mfn_t mfn;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int rc = vm_event_claim_slot(d, d->vm_event_paging);
@@ -106,8 +104,8 @@ void p2m_mem_paging_populate(struct doma
     /* We're paging. There should be a ring. */
     if ( rc == -EOPNOTSUPP )
     {
-        gdprintk(XENLOG_ERR, "Dom%d paging gfn %lx yet no ring in place\n",
-                 d->domain_id, gfn_l);
+        gdprintk(XENLOG_ERR, "%pd paging gfn %"PRI_gfn" yet no ring in place\n",
+                 d, gfn_x(gfn));
         /* Prevent the vcpu from faulting repeatedly on the same gfn */
         if ( v->domain == d )
             vcpu_pause_nosync(v);
@@ -218,13 +216,12 @@ void p2m_mem_paging_resume(struct domain
  * Once the p2mt is changed the page is readonly for the guest.  On success the
  * pager can write the page contents to disk and later evict the page.
  */
-static int nominate(struct domain *d, unsigned long gfn_l)
+static int nominate(struct domain *d, gfn_t gfn)
 {
     struct page_info *page;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     p2m_type_t p2mt;
     p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
     mfn_t mfn;
     int ret = -EBUSY;
 
@@ -279,12 +276,11 @@ static int nominate(struct domain *d, un
  * could evict it, eviction can not be done either. In this case the gfn is
  * still backed by a mfn.
  */
-static int evict(struct domain *d, unsigned long gfn_l)
+static int evict(struct domain *d, gfn_t gfn)
 {
     struct page_info *page;
     p2m_type_t p2mt;
     p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
     mfn_t mfn;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret = -EBUSY;
@@ -346,13 +342,12 @@ static int evict(struct domain *d, unsig
  * mfn if populate was called for  gfn which was nominated but not evicted. In
  * this case only the p2mt needs to be forwarded.
  */
-static int prepare(struct domain *d, unsigned long gfn_l,
+static int prepare(struct domain *d, gfn_t gfn,
                    XEN_GUEST_HANDLE_64(const_uint8) buffer)
 {
     struct page_info *page = NULL;
     p2m_type_t p2mt;
     p2m_access_t a;
-    gfn_t gfn = _gfn(gfn_l);
     mfn_t mfn;
     struct p2m_domain *p2m = p2m_get_hostp2m(d);
     int ret, page_extant = 1;
@@ -404,8 +399,8 @@ static int prepare(struct domain *d, uns
         if ( ret )
         {
             gdprintk(XENLOG_ERR,
-                     "Failed to load paging-in gfn %lx Dom%d bytes left %d\n",
-                     gfn_l, d->domain_id, ret);
+                     "Failed to load paging-in %pd gfn %"PRI_gfn" bytes left %d\n",
+                     d, gfn_x(gfn), ret);
             ret = -EFAULT;
             goto out;
         }
@@ -421,7 +416,7 @@ static int prepare(struct domain *d, uns
                                                  : p2m_ram_rw, a);
     if ( !ret )
     {
-        set_gpfn_from_mfn(mfn_x(mfn), gfn_l);
+        set_gpfn_from_mfn(mfn_x(mfn), gfn_x(gfn));
 
         if ( !page_extant )
             atomic_dec(&d->paged_pages);
@@ -465,15 +460,15 @@ int mem_paging_memop(XEN_GUEST_HANDLE_PA
     switch( mpo.op )
     {
     case XENMEM_paging_op_nominate:
-        rc = nominate(d, mpo.gfn);
+        rc = nominate(d, _gfn(mpo.gfn));
         break;
 
     case XENMEM_paging_op_evict:
-        rc = evict(d, mpo.gfn);
+        rc = evict(d, _gfn(mpo.gfn));
         break;
 
     case XENMEM_paging_op_prep:
-        rc = prepare(d, mpo.gfn, mpo.buffer);
+        rc = prepare(d, _gfn(mpo.gfn), mpo.buffer);
         if ( !rc )
             copyback = 1;
         break;
--- a/xen/arch/x86/mm/p2m.c
+++ b/xen/arch/x86/mm/p2m.c
@@ -1835,7 +1835,7 @@ void *map_domain_gfn(struct p2m_domain *
         ASSERT(p2m_is_hostp2m(p2m));
         if ( page )
             put_page(page);
-        p2m_mem_paging_populate(p2m->domain, gfn_x(gfn));
+        p2m_mem_paging_populate(p2m->domain, gfn);
         *pfec = PFEC_page_paged;
         return NULL;
     }
--- a/xen/arch/x86/traps.c
+++ b/xen/arch/x86/traps.c
@@ -848,7 +848,7 @@ int guest_wrmsr_xen(struct vcpu *v, uint
 
             if ( p2m_is_paging(t) )
             {
-                p2m_mem_paging_populate(d, gmfn);
+                p2m_mem_paging_populate(d, _gfn(gmfn));
                 return X86EMUL_RETRY;
             }
 
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -322,7 +322,7 @@ int guest_remove_page(struct domain *d,
 
         put_gfn(d, gmfn);
 
-        p2m_mem_paging_drop_page(d, gmfn, p2mt);
+        p2m_mem_paging_drop_page(d, _gfn(gmfn), p2mt);
 
         return 0;
     }
@@ -1667,7 +1667,7 @@ int check_get_page_from_gfn(struct domai
         if ( page )
             put_page(page);
 
-        p2m_mem_paging_populate(d, gfn_x(gfn));
+        p2m_mem_paging_populate(d, gfn);
         return -EAGAIN;
     }
 #endif
--- a/xen/include/asm-x86/p2m.h
+++ b/xen/include/asm-x86/p2m.h
@@ -732,10 +732,9 @@ static inline void p2m_pod_init(struct p
 int set_shared_p2m_entry(struct domain *d, unsigned long gfn, mfn_t mfn);
 
 /* Tell xenpaging to drop a paged out frame */
-void p2m_mem_paging_drop_page(struct domain *d, unsigned long gfn, 
-                                p2m_type_t p2mt);
+void p2m_mem_paging_drop_page(struct domain *d, gfn_t gfn, p2m_type_t p2mt);
 /* Start populating a paged out frame */
-void p2m_mem_paging_populate(struct domain *d, unsigned long gfn);
+void p2m_mem_paging_populate(struct domain *d, gfn_t gfn);
 /* Resume normal operation (in case a domain was paused) */
 struct vm_event_st;
 void p2m_mem_paging_resume(struct domain *d, struct vm_event_st *rsp);


Re: [PATCH v2 6/6] x86/mem-paging: consistently use gfn_t
Posted by Andrew Cooper 3 years, 10 months ago
On 23/04/2020 09:39, Jan Beulich wrote:
> --- a/xen/arch/x86/mm/hap/guest_walk.c
> +++ b/xen/arch/x86/mm/hap/guest_walk.c
> @@ -68,7 +68,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
>          *pfec = PFEC_page_paged;
>          if ( top_page )
>              put_page(top_page);
> -        p2m_mem_paging_populate(p2m->domain, cr3 >> PAGE_SHIFT);
> +        p2m_mem_paging_populate(p2m->domain, _gfn(PFN_DOWN(cr3)));

addr_to_gfn()

Otherwise, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>

Re: [PATCH v2 6/6] x86/mem-paging: consistently use gfn_t
Posted by Andrew Cooper 3 years, 10 months ago
On 15/05/2020 16:49, Andrew Cooper wrote:
> On 23/04/2020 09:39, Jan Beulich wrote:
>> --- a/xen/arch/x86/mm/hap/guest_walk.c
>> +++ b/xen/arch/x86/mm/hap/guest_walk.c
>> @@ -68,7 +68,7 @@ unsigned long hap_p2m_ga_to_gfn(GUEST_PA
>>          *pfec = PFEC_page_paged;
>>          if ( top_page )
>>              put_page(top_page);
>> -        p2m_mem_paging_populate(p2m->domain, cr3 >> PAGE_SHIFT);
>> +        p2m_mem_paging_populate(p2m->domain, _gfn(PFN_DOWN(cr3)));
> addr_to_gfn()

Sorry - gaddr_to_gfn()

~Andrew

>
> Otherwise, Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>
>


Re: [PATCH v2 6/6] x86/mem-paging: consistently use gfn_t
Posted by Roger Pau Monné 3 years, 10 months ago
On Thu, Apr 23, 2020 at 10:39:39AM +0200, Jan Beulich wrote:
> Where gdprintk()s get touched anyway to switch to PRI_gfn, also switch
> to %pd for the domain logged.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Reviewed-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks, Roger.