Since commit ec83f825627 "mm.h: add helper function to test-and-clear
_PGC_allocated" (and subsequent fix-up 44a887d021d "mm.h: fix BUG_ON()
condition in put_page_alloc_ref()") setting grant table version from 2
back to 1 has been vulnerable to hitting the BUG_ON in put_page_alloc_ref()
during gnttab_unpopulate_status_frames() because that function does not
acquire a local page reference.
This patch fixes the problem by first acquiring a local page reference on a
status frame (which should always succeed and so a failure results in a
domain_crash()) before attempting to 'unassign' it from the guest by
dropping the allocation reference. The local reference can then be dropped.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
---
Cc: Andrew Cooper <andrew.cooper3@citrix.com>
Cc: George Dunlap <George.Dunlap@eu.citrix.com>
Cc: Ian Jackson <ian.jackson@eu.citrix.com>
Cc: Jan Beulich <jbeulich@suse.com>
Cc: Julien Grall <julien.grall@arm.com>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Tim Deegan <tim@xen.org>
Cc: Wei Liu <wl@xen.org>
---
xen/common/grant_table.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index 97695a221a..b9ca388051 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
struct page_info *pg = virt_to_page(gt->status[i]);
gfn_t gfn = gnttab_get_frame_gfn(gt, true, i);
+ if ( !get_page(pg, d) )
+ {
+ gprintk(XENLOG_ERR,
+ "Could not get a reference to status frame %u\n", i);
+ domain_crash(d);
+ return -EINVAL;
+ }
+
/*
* For translated domains, recovering from failure after partial
* changes were made is more complicated than it seems worth
@@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
BUG_ON(page_get_owner(pg) != d);
put_page_alloc_ref(pg);
+ put_page(pg);
if ( pg->count_info & ~PGC_xen_heap )
{
--
2.20.1.2.gb21ebb671
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
On 30.07.2019 18:44, Paul Durrant wrote: > --- a/xen/common/grant_table.c > +++ b/xen/common/grant_table.c > @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) > struct page_info *pg = virt_to_page(gt->status[i]); > gfn_t gfn = gnttab_get_frame_gfn(gt, true, i); > > + if ( !get_page(pg, d) ) > + { > + gprintk(XENLOG_ERR, > + "Could not get a reference to status frame %u\n", i); > + domain_crash(d); > + return -EINVAL; > + } > + > /* > * For translated domains, recovering from failure after partial > * changes were made is more complicated than it seems worth > @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) > > BUG_ON(page_get_owner(pg) != d); > put_page_alloc_ref(pg); > + put_page(pg); > > if ( pg->count_info & ~PGC_xen_heap ) > { > I dislike this approach, and not chosing the alternative of excluding xenheap pages in the check in put_page_alloc_ref() (as I had recommended elsewhere) should at least be discussed in the description. It is the very nature of xenheap pages that they won't get freed, and hence don't need this extra ref to be held for clearing PGC_allocated. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 8/2/19 3:44 PM, Jan Beulich wrote: > On 30.07.2019 18:44, Paul Durrant wrote: >> --- a/xen/common/grant_table.c >> +++ b/xen/common/grant_table.c >> @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) >> struct page_info *pg = virt_to_page(gt->status[i]); >> gfn_t gfn = gnttab_get_frame_gfn(gt, true, i); >> >> + if ( !get_page(pg, d) ) >> + { >> + gprintk(XENLOG_ERR, >> + "Could not get a reference to status frame %u\n", i); >> + domain_crash(d); >> + return -EINVAL; >> + } >> + >> /* >> * For translated domains, recovering from failure after partial >> * changes were made is more complicated than it seems worth >> @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) >> >> BUG_ON(page_get_owner(pg) != d); >> put_page_alloc_ref(pg); >> + put_page(pg); >> >> if ( pg->count_info & ~PGC_xen_heap ) >> { >> > > I dislike this approach, and not chosing the alternative of excluding > xenheap pages in the check in put_page_alloc_ref() (as I had recommended > elsewhere) should at least be discussed in the description. It is the > very nature of xenheap pages that they won't get freed, and hence don't > need this extra ref to be held for clearing PGC_allocated. Also, it looks like there are other places where the BUG_ON() may / should be firing: namely, vmx_free_vlapic_mapping() and unshare_xenoprof_page_with_guest(). Teaching put_page_alloc_ref() that dropping PGC_allocated when PGC_xen_heap is set is safe would fix all three at once. Possibly more importantly, suppose that the first time gnttab_unpopulate_status_frames() comes around, gt->status[1] is still mapped by the guest. Then gt->status[0] will have its refcount reduced to 0 (but not freed), but gt->status[1] will be restored to its previous state. If the guest unmaps gt->status[1] and gnttab_unpopulate_status_frames() is called again, then the get_page(gt->status[0]) will fail (since its refcount is 0), causing the guest to be crashed instead. Not terrible for such a wonkily-behaving guest; but I think I'd rather go with the "special-case xenheap pages" option. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
> -----Original Message----- > From: George Dunlap <george.dunlap@citrix.com> > Sent: 02 August 2019 16:28 > To: Jan Beulich <JBeulich@suse.com>; Paul Durrant <Paul.Durrant@citrix.com> > Cc: xen-devel@lists.xenproject.org; Julien Grall <julien.grall@arm.com>; Andrew Cooper > <Andrew.Cooper3@citrix.com>; George Dunlap <George.Dunlap@citrix.com>; Ian Jackson > <Ian.Jackson@citrix.com>; Stefano Stabellini <sstabellini@kernel.org>; Konrad Rzeszutek Wilk > <konrad.wilk@oracle.com>; Tim (Xen.org) <tim@xen.org>; Wei Liu <wl@xen.org> > Subject: Re: [PATCH] fix BUG in gnttab_unpopulate_status_frames() > > On 8/2/19 3:44 PM, Jan Beulich wrote: > > On 30.07.2019 18:44, Paul Durrant wrote: > >> --- a/xen/common/grant_table.c > >> +++ b/xen/common/grant_table.c > >> @@ -1682,6 +1682,14 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) > >> struct page_info *pg = virt_to_page(gt->status[i]); > >> gfn_t gfn = gnttab_get_frame_gfn(gt, true, i); > >> > >> + if ( !get_page(pg, d) ) > >> + { > >> + gprintk(XENLOG_ERR, > >> + "Could not get a reference to status frame %u\n", i); > >> + domain_crash(d); > >> + return -EINVAL; > >> + } > >> + > >> /* > >> * For translated domains, recovering from failure after partial > >> * changes were made is more complicated than it seems worth > >> @@ -1708,6 +1716,7 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt) > >> > >> BUG_ON(page_get_owner(pg) != d); > >> put_page_alloc_ref(pg); > >> + put_page(pg); > >> > >> if ( pg->count_info & ~PGC_xen_heap ) > >> { > >> > > > > I dislike this approach, and not chosing the alternative of excluding > > xenheap pages in the check in put_page_alloc_ref() (as I had recommended > > elsewhere) should at least be discussed in the description. It is the > > very nature of xenheap pages that they won't get freed, and hence don't > > need this extra ref to be held for clearing PGC_allocated. > > Also, it looks like there are other places where the BUG_ON() may / > should be firing: namely, vmx_free_vlapic_mapping() and > unshare_xenoprof_page_with_guest(). Teaching put_page_alloc_ref() that > dropping PGC_allocated when PGC_xen_heap is set is safe would fix all > three at once. > > Possibly more importantly, suppose that the first time > gnttab_unpopulate_status_frames() comes around, gt->status[1] is still > mapped by the guest. Then gt->status[0] will have its refcount reduced > to 0 (but not freed), but gt->status[1] will be restored to its previous > state. If the guest unmaps gt->status[1] and > gnttab_unpopulate_status_frames() is called again, then the > get_page(gt->status[0]) will fail (since its refcount is 0), causing the > guest to be crashed instead. > > Not terrible for such a wonkily-behaving guest; but I think I'd rather > go with the "special-case xenheap pages" option. Ok, I see you've committed a patch to that effect while I was away. Paul > > -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
On 8/12/19 11:38 AM, Paul Durrant wrote: >> -----Original Message----- >> From: George Dunlap <george.dunlap@citrix.com> >> Not terrible for such a wonkily-behaving guest; but I think I'd rather >> go with the "special-case xenheap pages" option. > > Ok, I see you've committed a patch to that effect while I was away. Yes; Andy wanted to get tests passing, and asked me to submit a change along the lines of what I was proposing. -George _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
© 2016 - 2024 Red Hat, Inc.