[PATCH] revert "x86/mm: re-implement get_page_light() using an atomic increment"

Jan Beulich posted 1 patch 2 weeks, 6 days ago
Failed in applying to current master (apply log)
[PATCH] revert "x86/mm: re-implement get_page_light() using an atomic increment"
Posted by Jan Beulich 2 weeks, 6 days ago
revert "x86/mm: re-implement get_page_light() using an atomic increment"

This reverts commit c40bc0576dcc5acd4d7e22ef628eb4642f568533.

That change aimed at eliminating an open-coded lock-like construct,
which really isn't all that similar to, in particular, get_page(). The
function always succeeds. Any remaining concern would want taking care
of by placing block_lock_speculation() at the end of the function.
Since the function is called only during page (de)validation, any
possible performance concerns over such extra serialization could
likely be addressed by pre-validating (e.g. via pinning) page tables.

The fundamental issue with the change being reverted is that it detects
bad state only after already having caused possible corruption. While
the system is going to be halted in such an event, there is a time
window during which the resulting incorrect state could be leveraged by
a clever (in particular: fast enough) attacker.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -2582,10 +2582,16 @@ bool get_page(struct page_info *page, const struct domain *domain)
  */
 static void get_page_light(struct page_info *page)
 {
-    unsigned long old_pgc = arch_fetch_and_add(&page->count_info, 1);
+    unsigned long x, nx, y = page->count_info;
 
-    BUG_ON(!(old_pgc & PGC_count_mask)); /* Not allocated? */
-    BUG_ON(!((old_pgc + 1) & PGC_count_mask)); /* Overflow? */
+    do {
+        x  = y;
+        nx = x + 1;
+        BUG_ON(!(x & PGC_count_mask)); /* Not allocated? */
+        BUG_ON(!(nx & PGC_count_mask)); /* Overflow? */
+        y = cmpxchg(&page->count_info, x, nx);
+    }
+    while ( unlikely(y != x) );
 }
 
 static int validate_page(struct page_info *page, unsigned long type,
Re: [PATCH] revert "x86/mm: re-implement get_page_light() using an atomic increment"
Posted by Roger Pau Monné 2 weeks, 6 days ago
On Mon, Apr 29, 2024 at 03:01:00PM +0200, Jan Beulich wrote:
> revert "x86/mm: re-implement get_page_light() using an atomic increment"
> 
> This reverts commit c40bc0576dcc5acd4d7e22ef628eb4642f568533.
> 
> That change aimed at eliminating an open-coded lock-like construct,
> which really isn't all that similar to, in particular, get_page(). The
> function always succeeds. Any remaining concern would want taking care
> of by placing block_lock_speculation() at the end of the function.
> Since the function is called only during page (de)validation, any
> possible performance concerns over such extra serialization could
> likely be addressed by pre-validating (e.g. via pinning) page tables.
> 
> The fundamental issue with the change being reverted is that it detects
> bad state only after already having caused possible corruption. While
> the system is going to be halted in such an event, there is a time
> window during which the resulting incorrect state could be leveraged by
> a clever (in particular: fast enough) attacker.
> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Roger Pau Monné <roger.pau@citrix.com>

Thanks.

Re: [PATCH] revert "x86/mm: re-implement get_page_light() using an atomic increment"
Posted by Andrew Cooper 2 weeks, 6 days ago
On 29/04/2024 2:01 pm, Jan Beulich wrote:
> revert "x86/mm: re-implement get_page_light() using an atomic increment"
>
> This reverts commit c40bc0576dcc5acd4d7e22ef628eb4642f568533.
>
> That change aimed at eliminating an open-coded lock-like construct,
> which really isn't all that similar to, in particular, get_page(). The
> function always succeeds. Any remaining concern would want taking care
> of by placing block_lock_speculation() at the end of the function.
> Since the function is called only during page (de)validation, any
> possible performance concerns over such extra serialization could
> likely be addressed by pre-validating (e.g. via pinning) page tables.
>
> The fundamental issue with the change being reverted is that it detects
> bad state only after already having caused possible corruption. While
> the system is going to be halted in such an event, there is a time
> window during which the resulting incorrect state could be leveraged by
> a clever (in particular: fast enough) attacker.
>
> Signed-off-by: Jan Beulich <jbeulich@suse.com>

Acked-by: Andrew Cooper <andrew.cooper3@citrix.com>