xen/arch/x86/mm.c | 70 +++++++++++++++++++++++++++++++------------------------ 1 file changed, 39 insertions(+), 31 deletions(-)
This is a continuation of the cleanup and commenting in:
9186e96b199e ("x86/pv: Clean up _get_page_type()")
8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()")
With the re-arranged and newly commented logic, it's far more clear that the
second half of _get_page_type() only has work to do for page validation.
Introduce an early exit for PGT_validated. This makes the fastpath marginally
faster, and simplifies the subsequent logic as it no longer needs to exclude
the fully validated case.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
CC: George Dunlap <george.dunlap@eu.citrix.com>
Not that it's relevant, but bloat-o-meter says:
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300)
Function old new delta
_get_page_type 6618 6318 -300
which is more impressive than I was expecting.
---
xen/arch/x86/mm.c | 70 +++++++++++++++++++++++++++++++------------------------
1 file changed, 39 insertions(+), 31 deletions(-)
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index ac74ae389c99..57751d2ed70f 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -3002,6 +3002,17 @@ static int _get_page_type(struct page_info *page, unsigned long type,
* fully validated (PGT_[type] | PGT_validated | >0).
*/
+ /* If the page is fully validated, we're done. */
+ if ( likely(nx & PGT_validated) )
+ return 0;
+
+ /*
+ * The page is in the "validate locked" state. We have exclusive access,
+ * and any concurrent callers are waiting in the cmpxchg() loop above.
+ *
+ * Exclusive access ends when PGT_validated or PGT_partial get set.
+ */
+
if ( unlikely((x & PGT_count_mask) == 0) )
{
struct domain *d = page_get_owner(page);
@@ -3071,43 +3082,40 @@ static int _get_page_type(struct page_info *page, unsigned long type,
}
}
- if ( unlikely(!(nx & PGT_validated)) )
+ /*
+ * Flush the cache if there were previously non-coherent mappings of
+ * this page, and we're trying to use it as anything other than a
+ * writeable page. This forces the page to be coherent before we
+ * validate its contents for safety.
+ */
+ if ( (nx & PGT_non_coherent) && type != PGT_writable_page )
{
- /*
- * Flush the cache if there were previously non-coherent mappings of
- * this page, and we're trying to use it as anything other than a
- * writeable page. This forces the page to be coherent before we
- * validate its contents for safety.
- */
- if ( (nx & PGT_non_coherent) && type != PGT_writable_page )
- {
- void *addr = __map_domain_page(page);
+ void *addr = __map_domain_page(page);
- cache_flush(addr, PAGE_SIZE);
- unmap_domain_page(addr);
+ cache_flush(addr, PAGE_SIZE);
+ unmap_domain_page(addr);
- page->u.inuse.type_info &= ~PGT_non_coherent;
- }
+ page->u.inuse.type_info &= ~PGT_non_coherent;
+ }
- /*
- * No special validation needed for writable or shared pages. Page
- * tables and GDT/LDT need to have their contents audited.
- *
- * per validate_page(), non-atomic updates are fine here.
- */
- if ( type == PGT_writable_page || type == PGT_shared_page )
- page->u.inuse.type_info |= PGT_validated;
- else
+ /*
+ * No special validation needed for writable or shared pages. Page
+ * tables and GDT/LDT need to have their contents audited.
+ *
+ * per validate_page(), non-atomic updates are fine here.
+ */
+ if ( type == PGT_writable_page || type == PGT_shared_page )
+ page->u.inuse.type_info |= PGT_validated;
+ else
+ {
+ if ( !(x & PGT_partial) )
{
- if ( !(x & PGT_partial) )
- {
- page->nr_validated_ptes = 0;
- page->partial_flags = 0;
- page->linear_pt_count = 0;
- }
-
- rc = validate_page(page, type, preemptible);
+ page->nr_validated_ptes = 0;
+ page->partial_flags = 0;
+ page->linear_pt_count = 0;
}
+
+ rc = validate_page(page, type, preemptible);
}
out:
--
2.11.0
On 17.06.2022 12:47, Andrew Cooper wrote: > This is a continuation of the cleanup and commenting in: > 9186e96b199e ("x86/pv: Clean up _get_page_type()") > 8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()") > > With the re-arranged and newly commented logic, it's far more clear that the > second half of _get_page_type() only has work to do for page validation. To be honest "far more clear" reads misleading to me: Part of the re- arrangement was to move later the early setting of PGT_validated for PGT_writable pages, without which the stated fact wasn't entirely true prior to the re-arrangement. How about s/far more/now/ ? > Introduce an early exit for PGT_validated. This makes the fastpath marginally > faster, and simplifies the subsequent logic as it no longer needs to exclude > the fully validated case. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> Preferably with the wording above adjusted: Reviewed-by: Jan Beulich <jbeulich@suse.com> > Not that it's relevant, but bloat-o-meter says: > > add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300) > Function old new delta > _get_page_type 6618 6318 -300 > > which is more impressive than I was expecting. And I have to admit I'm having trouble seeing why that would be. Jan
On 22/06/2022 10:58, Jan Beulich wrote: > On 17.06.2022 12:47, Andrew Cooper wrote: >> This is a continuation of the cleanup and commenting in: >> 9186e96b199e ("x86/pv: Clean up _get_page_type()") >> 8cc5036bc385 ("x86/pv: Fix ABAC cmpxchg() race in _get_page_type()") >> >> With the re-arranged and newly commented logic, it's far more clear that the >> second half of _get_page_type() only has work to do for page validation. > To be honest "far more clear" reads misleading to me: Part of the re- > arrangement was to move later the early setting of PGT_validated for > PGT_writable pages, without which the stated fact wasn't entirely true > prior to the re-arrangement. How about s/far more/now/ ? > >> Introduce an early exit for PGT_validated. This makes the fastpath marginally >> faster, and simplifies the subsequent logic as it no longer needs to exclude >> the fully validated case. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > Preferably with the wording above adjusted: > Reviewed-by: Jan Beulich <jbeulich@suse.com> Ok. >> Not that it's relevant, but bloat-o-meter says: >> >> add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-300 (-300) >> Function old new delta >> _get_page_type 6618 6318 -300 >> >> which is more impressive than I was expecting. > And I have to admit I'm having trouble seeing why that would be. I was surprised too, but it's deterministic with GCC 11. I initially disbelieved it and did full clean rebuilds. validate_page() is fully inlined, and there were a reasonable number of jmp.d32 -> jmp.d8 changes, but I very much doubt there are 75 of them. I can only assume there's a reasonable chunk of logic which succumbs to DCE. ~Andrew
© 2016 - 2024 Red Hat, Inc.