From: Julien Grall <jgrall@amazon.com>
At the moment, assign_pages() on the page to be inuse (PGC_state_inuse)
and the state value to be 0.
However, the code may race with the page offlining code (see
offline_page()). Depending on the ordering, the page may be in offlining
state (PGC_state_offlining) before it is assigned to a domain.
On debug build, this may result to hit the assert or just clobber the
state. On non-debug build, the state will get clobbered.
Incidentally the flag PGC_broken will get clobbered as well.
Grab the heap_lock to prevent a race with offline_page() and keep the
state and broken flag around.
Signed-off-by: Julien Grall <jgrall@amazon.com>
---
Changes in v2:
- Superseed <20200204133357.32101-1-julien@xen.org>
- Fix the race with offline_page()
---
xen/common/page_alloc.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)
diff --git a/xen/common/page_alloc.c b/xen/common/page_alloc.c
index 97902d42c1..a684dbf37c 100644
--- a/xen/common/page_alloc.c
+++ b/xen/common/page_alloc.c
@@ -2283,15 +2283,27 @@ int assign_pages(
get_knownalive_domain(d);
}
+ spin_lock(&heap_lock);
for ( i = 0; i < (1 << order); i++ )
{
+ /*
+ * We should only be here if the page is inuse or offlining.
+ * The latter happen if we race with mark_page_offline() as we
+ * don't hold the heap_lock.
+ */
+ ASSERT(page_state_is(&pg[i], inuse) ||
+ page_state_is(&pg[i], offlining));
+ ASSERT(!(pg[i].count_info & ~(PGC_state | PGC_broken)));
ASSERT(page_get_owner(&pg[i]) == NULL);
- ASSERT(!pg[i].count_info);
page_set_owner(&pg[i], d);
smp_wmb(); /* Domain pointer must be visible before updating refcnt. */
- pg[i].count_info = PGC_allocated | 1;
+
+ pg[i].count_info &= PGC_state | PGC_broken;
+ pg[i].count_info |= PGC_allocated | 1;
+
page_list_add_tail(&pg[i], &d->page_list);
}
+ spin_unlock(&heap_lock);
out:
spin_unlock(&d->page_alloc_lock);
--
2.17.1
_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xenproject.org
https://lists.xenproject.org/mailman/listinfo/xen-devel
On 06.02.2020 11:38, Julien Grall wrote: > However, the code may race with the page offlining code (see > offline_page()). Depending on the ordering, the page may be in offlining > state (PGC_state_offlining) before it is assigned to a domain. > > On debug build, this may result to hit the assert or just clobber the > state. On non-debug build, the state will get clobbered. > > Incidentally the flag PGC_broken will get clobbered as well. As mentioned when I first pointed out this issue, it is wider than just assign_pages() afaict, which is specifically why I said I wouldn't expect you to want to deal with it alongside the "implicit inuse" aspect. Fixing just one instance of it without also addressing the others isn't going to help. IOW you could leave the code the way it was in v1 in this regard, and then we (you, me, or yet someone else) take care of the race aspect globally for the tree. Jan _______________________________________________ Xen-devel mailing list Xen-devel@lists.xenproject.org https://lists.xenproject.org/mailman/listinfo/xen-devel
© 2016 - 2024 Red Hat, Inc.