[RFC PATCH v2] arm/ptw: Handle atomic updates of page table entries in MMIO during PTW.

Posted by Jonathan Cameron 8 months, 3 weeks ago
I'm far from confident this handling here is correct. Hence
RFC.  In particular not sure on what locks I should hold for this
to be even moderately safe.

The function already appears to be inconsistent in what it returns
as the CONFIG_ATOMIC64 block returns the endian converted 'eventual'
value of the cmpxchg whereas the TCG_OVERSIZED_GUEST case returns
the previous value.

Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
---
v2: Thanks Peter for reviewing.
 - Handle the address space as in arm_ldq_ptw() - I should have looked
   at the code immediately above :(
   The result ends up a little more convoluted than I'd like. Could factor
   this block of code out perhaps. I'm also not sure on the fault type
   that is appropriate here.
 - Switch to 'need_lock' as per Philippe's feedback on the x86 fixes.
   likely() doesn't seem appropriate here though.
 
target/arm/ptw.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 62 insertions(+), 2 deletions(-)

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 5eb3577bcd..ba1a27ca2b 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -711,8 +711,68 @@ static uint64_t arm_casq_ptw(CPUARMState *env, uint64_t old_val,
     void *host = ptw->out_host;
 
     if (unlikely(!host)) {
-        fi->type = ARMFault_UnsuppAtomicUpdate;
-        return 0;
+        /* Page table in MMIO Memory Region */
+        CPUState *cs = env_cpu(env);
+        MemTxAttrs attrs = {
+            .space = ptw->out_space,
+            .secure = arm_space_is_secure(ptw->out_space),
+        };
+        AddressSpace *as = arm_addressspace(cs, attrs);
+        MemTxResult result = MEMTX_OK;
+        bool need_lock = !bql_locked();
+
+        if (need_lock) {
+            bql_lock();
+        }
+        if (ptw->out_be) {
+            cur_val = address_space_ldq_be(as, ptw->out_phys, attrs, &result);
+            if (unlikely(result != MEMTX_OK)) {
+                fi->type = ARMFault_SyncExternalOnWalk;
+                fi->ea = arm_extabort_type(result);
+                if (need_lock) {
+                    bql_unlock();
+                }
+                return old_val;
+            }
+            if (cur_val == old_val) {
+                address_space_stq_be(as, ptw->out_phys, new_val, attrs, &result);
+                if (unlikely(result != MEMTX_OK)) {
+                    fi->type = ARMFault_SyncExternalOnWalk;
+                    fi->ea = arm_extabort_type(result);
+                    if (need_lock) {
+                        bql_unlock();
+                    }
+                    return old_val;
+                }
+                cur_val = new_val;
+            }
+        } else {
+            cur_val = address_space_ldq_le(as, ptw->out_phys, attrs, &result);
+            if (unlikely(result != MEMTX_OK)) {
+                fi->type = ARMFault_SyncExternalOnWalk;
+                fi->ea = arm_extabort_type(result);
+                if (need_lock) {
+                    bql_unlock();
+                }
+                return old_val;
+            }
+            if (cur_val == old_val) {
+                address_space_stq_le(as, ptw->out_phys, new_val, attrs, &result);
+                if (unlikely(result != MEMTX_OK)) {
+                    fi->type = ARMFault_SyncExternalOnWalk;
+                    fi->ea = arm_extabort_type(result);
+                    if (need_lock) {
+                        bql_unlock();
+                    }
+                    return old_val;
+                }
+                cur_val = new_val;
+            }
+        }
+        if (need_lock) {
+            bql_unlock();
+        }
+        return cur_val;
     }
 
     /*
-- 
2.39.2
Re: [RFC PATCH v2] arm/ptw: Handle atomic updates of page table entries in MMIO during PTW.
Posted by Richard Henderson 8 months, 3 weeks ago
On 2/19/24 06:12, Jonathan Cameron wrote:
> I'm far from confident this handling here is correct. Hence
> RFC.  In particular not sure on what locks I should hold for this
> to be even moderately safe.
> 
> The function already appears to be inconsistent in what it returns
> as the CONFIG_ATOMIC64 block returns the endian converted 'eventual'
> value of the cmpxchg whereas the TCG_OVERSIZED_GUEST case returns
> the previous value.
> 
> Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> ---
> v2: Thanks Peter for reviewing.
>   - Handle the address space as in arm_ldq_ptw() - I should have looked
>     at the code immediately above :(
>     The result ends up a little more convoluted than I'd like. Could factor
>     this block of code out perhaps. I'm also not sure on the fault type
>     that is appropriate here.
>   - Switch to 'need_lock' as per Philippe's feedback on the x86 fixes.
>     likely() doesn't seem appropriate here though.
>   
> target/arm/ptw.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++--
>   1 file changed, 62 insertions(+), 2 deletions(-)
> 
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> index 5eb3577bcd..ba1a27ca2b 100644
> --- a/target/arm/ptw.c
> +++ b/target/arm/ptw.c
> @@ -711,8 +711,68 @@ static uint64_t arm_casq_ptw(CPUARMState *env, uint64_t old_val,
>       void *host = ptw->out_host;
>   
>       if (unlikely(!host)) {
> -        fi->type = ARMFault_UnsuppAtomicUpdate;
> -        return 0;
> +        /* Page table in MMIO Memory Region */
> +        CPUState *cs = env_cpu(env);
> +        MemTxAttrs attrs = {
> +            .space = ptw->out_space,
> +            .secure = arm_space_is_secure(ptw->out_space),
> +        };
> +        AddressSpace *as = arm_addressspace(cs, attrs);
> +        MemTxResult result = MEMTX_OK;
> +        bool need_lock = !bql_locked();
> +
> +        if (need_lock) {
> +            bql_lock();
> +        }
> +        if (ptw->out_be) {
> +            cur_val = address_space_ldq_be(as, ptw->out_phys, attrs, &result);
> +            if (unlikely(result != MEMTX_OK)) {
> +                fi->type = ARMFault_SyncExternalOnWalk;
> +                fi->ea = arm_extabort_type(result);
> +                if (need_lock) {
> +                    bql_unlock();
> +                }
> +                return old_val;
> +            }

Use BQL_LOCK_GUARD() and avoid all of the repeated unlocks at each return point.

You can merge all of the error paths, e.g.

     cur_val = (ptw->out_be
                ? address_space_ldq_be(as, ptw->out_phys, attrs, &result)
                : address_space_ldq_le(as, ptw->out_phys, attrs, &result));
     if (result == MEMTX_OK && cur_val == old_val) {
         if (ptw->out_be) {
             address_space_stq_be(as, ptw->out_phys, new_val, attrs, &result);
         } else {
             address_space_stq_le(as, ptw->out_phys, new_val, attrs, &result);
         }
     }
     if (unlikely(result != MEMTX_OK)) {
         fi->type = ...
         return old_val;
     }
     return cur_val;
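
Roughly, the whole MMIO branch then collapses to something like the sketch
below (untested; it reuses the declarations from your patch, and assumes
BQL_LOCK_GUARD() only takes the lock when we do not already hold it and
drops it again at the end of the block):

    if (unlikely(!host)) {
        /* Page table entry is in an MMIO region: emulate the CAS. */
        CPUState *cs = env_cpu(env);
        MemTxAttrs attrs = {
            .space = ptw->out_space,
            .secure = arm_space_is_secure(ptw->out_space),
        };
        AddressSpace *as = arm_addressspace(cs, attrs);
        MemTxResult result = MEMTX_OK;

        /* Serialize against other vCPUs touching the same MMIO region. */
        BQL_LOCK_GUARD();

        cur_val = (ptw->out_be
                   ? address_space_ldq_be(as, ptw->out_phys, attrs, &result)
                   : address_space_ldq_le(as, ptw->out_phys, attrs, &result));
        if (result == MEMTX_OK && cur_val == old_val) {
            if (ptw->out_be) {
                address_space_stq_be(as, ptw->out_phys, new_val, attrs, &result);
            } else {
                address_space_stq_le(as, ptw->out_phys, new_val, attrs, &result);
            }
        }
        if (unlikely(result != MEMTX_OK)) {
            fi->type = ARMFault_SyncExternalOnWalk;
            fi->ea = arm_extabort_type(result);
            return old_val;
        }
        /* As with the cmpxchg path: return what was in memory beforehand. */
        return cur_val;
    }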



r~
Re: [RFC PATCH v2] arm/ptw: Handle atomic updates of page table entries in MMIO during PTW.
Posted by Peter Maydell 8 months, 3 weeks ago
On Thu, 22 Feb 2024 at 21:21, Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 2/19/24 06:12, Jonathan Cameron wrote:
> > I'm far from confident this handling here is correct. Hence
> > RFC.  In particular not sure on what locks I should hold for this
> > to be even moderately safe.
> >
> > The function already appears to be inconsistent in what it returns
> > as the CONFIG_ATOMIC64 block returns the endian converted 'eventual'
> > value of the cmpxchg whereas the TCG_OVERSIZED_GUEST case returns
> > the previous value.

I think this is a bug in the TCG_OVERSIZED_GUEST handling,
which we've never noticed because you only see that case
on a 32-bit host.

> > Signed-off-by: Jonathan Cameron <Jonathan.Cameron@huawei.com>
> > ---
> > v2: Thanks Peter for reviewing.
> >   - Handle the address space as in arm_ldq_ptw() - I should have looked
> >     at the code immediately above :(
> >     The result ends up a little more convoluted than I'd like. Could factor
> >     this block of code out perhaps. I'm also not sure on the fault type
> >     that is appropriate here.
> >   - Switch to 'need_lock' as per Philippe's feedback on the x86 fixes.
> >     likely() doesn't seem appropriate here though.
> >
> > target/arm/ptw.c | 64 ++++++++++++++++++++++++++++++++++++++++++++++--
> >   1 file changed, 62 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> > index 5eb3577bcd..ba1a27ca2b 100644
> > --- a/target/arm/ptw.c
> > +++ b/target/arm/ptw.c
> > @@ -711,8 +711,68 @@ static uint64_t arm_casq_ptw(CPUARMState *env, uint64_t old_val,
> >       void *host = ptw->out_host;
> >
> >       if (unlikely(!host)) {
> > -        fi->type = ARMFault_UnsuppAtomicUpdate;
> > -        return 0;
> > +        /* Page table in MMIO Memory Region */
> > +        CPUState *cs = env_cpu(env);
> > +        MemTxAttrs attrs = {
> > +            .space = ptw->out_space,
> > +            .secure = arm_space_is_secure(ptw->out_space),
> > +        };
> > +        AddressSpace *as = arm_addressspace(cs, attrs);
> > +        MemTxResult result = MEMTX_OK;
> > +        bool need_lock = !bql_locked();
> > +
> > +        if (need_lock) {
> > +            bql_lock();
> > +        }
> > +        if (ptw->out_be) {
> > +            cur_val = address_space_ldq_be(as, ptw->out_phys, attrs, &result);
> > +            if (unlikely(result != MEMTX_OK)) {
> > +                fi->type = ARMFault_SyncExternalOnWalk;
> > +                fi->ea = arm_extabort_type(result);
> > +                if (need_lock) {
> > +                    bql_unlock();
> > +                }
> > +                return old_val;
> > +            }
>
> Use BQL_LOCK_GUARD() and avoid all of the repeated unlocks at each return point.

Is the BQL definitely sufficient to ensure atomicity here?
I guess that given that we know it's not backed by host
memory, any other vCPU trying to access the MMIO region
at the same time will have to take the BQL first, so it
seems like it ought to be good enough.

-- PMM
Re: [RFC PATCH v2] arm/ptw: Handle atomic updates of page table entries in MMIO during PTW.
Posted by Peter Maydell 8 months, 3 weeks ago
On Fri, 23 Feb 2024 at 10:02, Peter Maydell <peter.maydell@linaro.org> wrote:
>
> On Thu, 22 Feb 2024 at 21:21, Richard Henderson
> <richard.henderson@linaro.org> wrote:
> >
> > On 2/19/24 06:12, Jonathan Cameron wrote:
> > > I'm far from confident this handling here is correct. Hence
> > > RFC.  In particular not sure on what locks I should hold for this
> > > to be even moderately safe.
> > >
> > > The function already appears to be inconsistent in what it returns
> > > as the CONFIG_ATOMIC64 block returns the endian converted 'eventual'
> > > value of the cmpxchg whereas the TCG_OVERSIZED_GUEST case returns
> > > the previous value.
>
> I think this is a bug in the TCG_OVERSIZED_GUEST handling,
> which we've never noticed because you only see that case
> on a 32-bit host.

Having looked through the code and talked to Richard on IRC,
I think that the TCG_OVERSIZED_GUEST handling is correct.
The return value of qatomic_cmpxchg__nocheck() is the value
that was in memory before the cmpxchg. So the TCG_OVERSIZED_GUEST
code's semantics are the same as the normal path. (The comment on
qatomic_cmpxchg__nocheck() that describes this as the
"eventual value" is rather confusing so we should improve that.)

thanks
-- PMM