[Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()

Paul Durrant posted 1 patch 4 years, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/xen tags/patchew/20190730164401.34097-1-paul.durrant@citrix.com
xen/common/grant_table.c | 9 +++++++++
1 file changed, 9 insertions(+)
[Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()
Posted by Paul Durrant 4 years, 9 months ago
Since commit ec83f825627 "mm.h: add helper function to test-and-clear
_PGC_allocated" (and subsequent fix-up 44a887d021d "mm.h: fix BUG_ON()
condition in put_page_alloc_ref()") setting grant table version from 2
back to 1 has been vulnerable to hitting the BUG_ON in put_page_alloc_ref()
during gnttab_unpopulate_status_frames() because that function does not
acquire a local page reference.

This patch fixes the problem by first acquiring a local page reference on a
status frame (which should always succeed and so a failure results in a
domain_crash()) before attempting to 'unassign' it from the guest by
dropping the allocation reference. The local reference can then be dropped.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wl@xen.org>
---
 xen/common/grant_table.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 97695a221a..b9ca388051 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
         struct page_info *pg = virt_to_page(gt->status[i]);
         gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
 
+        if ( !get_page(pg, d) )
+        {
+            gprintk(XENLOG_ERR,
+                    "Could not get a reference to status frame %u\n", i);
+            domain_crash(d);
+            return -EINVAL;
+        }
+
         /*
          * For translated domains, recovering from failure after partial
          * changes were made is more complicated than it seems worth
@@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
 
         BUG_ON(page_get_owner(pg) != d);
         put_page_alloc_ref(pg);
+        put_page(pg);
 
         if ( pg->count_info & ~PGC_xen_heap )
         {
-- 
2.20.1.2.gb21ebb671


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()
Posted by Jan Beulich 4 years, 8 months ago
On 30.07.2019 18:44, Paul Durrant wrote:
> --- a/xen/common/grant_table.c
> +++ b/xen/common/grant_table.c
> @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
>           struct page_info *pg = virt_to_page(gt->status[i]);
>           gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
>   
> +        if ( !get_page(pg, d) )
> +        {
> +            gprintk(XENLOG_ERR,
> +                    "Could not get a reference to status frame %u\n", i);
> +            domain_crash(d);
> +            return -EINVAL;
> +        }
> +
>           /*
>            * For translated domains, recovering from failure after partial
>            * changes were made is more complicated than it seems worth
> @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
>   
>           BUG_ON(page_get_owner(pg) != d);
>           put_page_alloc_ref(pg);
> +        put_page(pg);
>   
>           if ( pg->count_info & ~PGC_xen_heap )
>           {
> 

I dislike this approach, and not chosing the alternative of excluding
xenheap pages in the check in put_page_alloc_ref() (as I had recommended
elsewhere) should at least be discussed in the description. It is the
very nature of xenheap pages that they won't get freed, and hence don't
need this extra ref to be held for clearing PGC_allocated.

Jan
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()
Posted by George Dunlap 4 years, 8 months ago
On 8/2/19 3:44 PM, Jan Beulich wrote:
> On 30.07.2019 18:44, Paul Durrant wrote:
>> --- a/xen/common/grant_table.c
>> +++ b/xen/common/grant_table.c
>> @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
>>           struct page_info *pg = virt_to_page(gt->status[i]);
>>           gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
>>   
>> +        if ( !get_page(pg, d) )
>> +        {
>> +            gprintk(XENLOG_ERR,
>> +                    "Could not get a reference to status frame %u\n", i);
>> +            domain_crash(d);
>> +            return -EINVAL;
>> +        }
>> +
>>           /*
>>            * For translated domains, recovering from failure after partial
>>            * changes were made is more complicated than it seems worth
>> @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
>>   
>>           BUG_ON(page_get_owner(pg) != d);
>>           put_page_alloc_ref(pg);
>> +        put_page(pg);
>>   
>>           if ( pg->count_info & ~PGC_xen_heap )
>>           {
>>
> 
> I dislike this approach, and not chosing the alternative of excluding
> xenheap pages in the check in put_page_alloc_ref() (as I had recommended
> elsewhere) should at least be discussed in the description. It is the
> very nature of xenheap pages that they won't get freed, and hence don't
> need this extra ref to be held for clearing PGC_allocated.

Also, it looks like there are other places where the BUG_ON() may /
should be firing:  namely, vmx_free_vlapic_mapping() and
unshare_xenoprof_page_with_guest().  Teaching put_page_alloc_ref() that
dropping PGC_allocated when PGC_xen_heap is set is safe would fix all
three at once.

Possibly more importantly, suppose that the first time
gnttab_unpopulate_status_frames() comes around, gt->status[1] is still
mapped by the guest.  Then gt->status[0] will have its refcount reduced
to 0 (but not freed), but gt->status[1] will be restored to its previous
state.  If the guest unmaps gt->status[1] and
gnttab_unpopulate_status_frames() is called again, then the
get_page(gt->status[0]) will fail (since its refcount is 0), causing the
guest to be crashed instead.

Not terrible for such a wonkily-behaving guest; but I think I'd rather
go with the "special-case xenheap pages" option.

 -George


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()
Posted by Paul Durrant 4 years, 8 months ago
> -----Original Message-----
> From: George Dunlap <george.dunlap@citrix.com>
> Sent: 02 August 2019 16:28
> To: Jan Beulich <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com>
> Cc: xen-devel@lists.xenproject.org; Julien Grall <julien.grall@arm.com>; Andrew Cooper
> <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson
> <Ian.Jackson@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; Konrad Rzeszutek Wilk
> <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>; Wei Liu <wl@xen.org>
> Subject: Re: [PATCH] fix BUG in gnttab_unpopulate_status_frames()
> 
> On 8/2/19 3:44 PM, Jan Beulich wrote:
> > On 30.07.2019 18:44, Paul Durrant wrote:
> >> --- a/xen/common/grant_table.c
> >> +++ b/xen/common/grant_table.c
> >> @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
> >>           struct page_info *pg = virt_to_page(gt->status[i]);
> >>           gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
> >>
> >> +        if ( !get_page(pg, d) )
> >> +        {
> >> +            gprintk(XENLOG_ERR,
> >> +                    "Could not get a reference to status frame %u\n", i);
> >> +            domain_crash(d);
> >> +            return -EINVAL;
> >> +        }
> >> +
> >>           /*
> >>            * For translated domains, recovering from failure after partial
> >>            * changes were made is more complicated than it seems worth
> >> @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
> >>
> >>           BUG_ON(page_get_owner(pg) != d);
> >>           put_page_alloc_ref(pg);
> >> +        put_page(pg);
> >>
> >>           if ( pg->count_info & ~PGC_xen_heap )
> >>           {
> >>
> >
> > I dislike this approach, and not chosing the alternative of excluding
> > xenheap pages in the check in put_page_alloc_ref() (as I had recommended
> > elsewhere) should at least be discussed in the description. It is the
> > very nature of xenheap pages that they won't get freed, and hence don't
> > need this extra ref to be held for clearing PGC_allocated.
> 
> Also, it looks like there are other places where the BUG_ON() may /
> should be firing:  namely, vmx_free_vlapic_mapping() and
> unshare_xenoprof_page_with_guest().  Teaching put_page_alloc_ref() that
> dropping PGC_allocated when PGC_xen_heap is set is safe would fix all
> three at once.
> 
> Possibly more importantly, suppose that the first time
> gnttab_unpopulate_status_frames() comes around, gt->status[1] is still
> mapped by the guest.  Then gt->status[0] will have its refcount reduced
> to 0 (but not freed), but gt->status[1] will be restored to its previous
> state.  If the guest unmaps gt->status[1] and
> gnttab_unpopulate_status_frames() is called again, then the
> get_page(gt->status[0]) will fail (since its refcount is 0), causing the
> guest to be crashed instead.
> 
> Not terrible for such a wonkily-behaving guest; but I think I'd rather
> go with the "special-case xenheap pages" option.

Ok, I see you've committed a patch to that effect while I was away.

  Paul

> 
>  -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
Re: [Xen-devel] [PATCH] fix BUG in gnttab_unpopulate_status_frames()
Posted by George Dunlap 4 years, 8 months ago
On 8/12/19 11:38 AM, Paul Durrant wrote:
>> -----Original Message-----
>> From: George Dunlap <george.dunlap@citrix.com>
>> Not terrible for such a wonkily-behaving guest; but I think I'd rather
>> go with the "special-case xenheap pages" option.
> 
> Ok, I see you've committed a patch to that effect while I was away.

Yes; Andy wanted to get tests passing, and asked me to submit a change
along the lines of what I was proposing.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel