[PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl

Pierrick Bouvier posted 2 patches 1 month ago
There is a newer version of this series
[PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl
Posted by Pierrick Bouvier 1 month ago
When instrumenting memory accesses for plugins, we force memory accesses
to use the slow path for mmu. [1]
This creates a situation where we end up calling ptw_setl_slow.

Since this function gets called during a cpu_exec, start_exclusive then
hangs. This exclusive section was introduced initially for security
reasons [2].

I suspect this code path was never triggered, because ptw_setl_slow
would always be called transitively from cpu_exec, resulting in a hang.

[1] https://gitlab.com/qemu-project/qemu/-/commit/6d03226b42247b68ab2f0b3663e0f624335a4055
[2] https://gitlab.com/qemu-project/qemu/-/issues/279

Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
---
 target/i386/tcg/sysemu/excp_helper.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index 8fb05b1f531..f30102b5362 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -108,6 +108,9 @@ static bool ptw_setl_slow(const PTETranslate *in, uint32_t old, uint32_t new)
 {
     uint32_t cmp;
 
+    /* We are in cpu_exec, and start_exclusive can't be called directly.*/
+    g_assert(current_cpu && current_cpu->running);
+    cpu_exec_end(current_cpu);
     /* Does x86 really perform a rmw cycle on mmio for ptw? */
     start_exclusive();
     cmp = cpu_ldl_mmuidx_ra(in->env, in->gaddr, in->ptw_idx, 0);
@@ -115,6 +118,7 @@ static bool ptw_setl_slow(const PTETranslate *in, uint32_t old, uint32_t new)
         cpu_stl_mmuidx_ra(in->env, in->gaddr, new, in->ptw_idx, 0);
     }
     end_exclusive();
+    cpu_exec_start(current_cpu);
     return cmp == old;
 }
 
-- 
2.39.5
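
The hang described in the commit message can be illustrated with a toy model. This is only a sketch, under the assumption that an exclusive section has to wait for every vCPU whose running flag is set, including the requester itself. The names ToyCPU, toy_cpu_exec_start, toy_cpu_exec_end and toy_exclusive_waiters are hypothetical stand-ins and not QEMU functions, and no real locking or blocking is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; only the flag relevant here. */
typedef struct ToyCPU {
    bool running;   /* set between exec start and exec end */
} ToyCPU;

/* Assumed analogue of entering the execution loop. */
static void toy_cpu_exec_start(ToyCPU *cpu)
{
    cpu->running = true;
}

/* Assumed analogue of leaving the execution loop. */
static void toy_cpu_exec_end(ToyCPU *cpu)
{
    cpu->running = false;
}

/*
 * Count the vCPUs an exclusive requester would have to wait for.
 * In this model the exclusive section can only begin once the
 * count is zero.
 */
static size_t toy_exclusive_waiters(const ToyCPU *cpus, size_t n)
{
    size_t waiters = 0;

    assert(cpus != NULL);
    for (size_t i = 0; i < n; i++) {
        if (cpus[i].running) {
            waiters++;
        }
    }
    return waiters;
}
```

In this model, a vCPU that requests the exclusive section while its own running flag is still set is counted among the waiters, so the count can never reach zero. Clearing the flag first, as the patch does with cpu_exec_end, removes the requester from the count.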
Re: [PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl
Posted by Richard Henderson 1 month ago
On 10/23/24 23:20, Pierrick Bouvier wrote:
> When instrumenting memory accesses for plugins, we force memory accesses
> to use the slow path for mmu. [1]
> This creates a situation where we end up calling ptw_setl_slow.
> 
> Since this function gets called during a cpu_exec, start_exclusive then
> hangs. This exclusive section was introduced initially for security
> reasons [2].
> 
> I suspect this code path was never triggered, because ptw_setl_slow
> would always be called transitively from cpu_exec, resulting in a hang.
> 
> [1] https://gitlab.com/qemu-project/qemu/-/commit/6d03226b42247b68ab2f0b3663e0f624335a4055
> [2] https://gitlab.com/qemu-project/qemu/-/issues/279
> 
> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>

Oh, wow.  I believe this will be fixed by

https://lore.kernel.org/qemu-devel/20241023033432.1353830-19-richard.henderson@linaro.org/

which is in a pending PR.


r~


Re: [PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl
Posted by Pierrick Bouvier 4 weeks, 1 day ago

On 10/24/24 09:25, Richard Henderson wrote:
> On 10/23/24 23:20, Pierrick Bouvier wrote:
>> When instrumenting memory accesses for plugins, we force memory accesses
>> to use the slow path for mmu. [1]
>> This creates a situation where we end up calling ptw_setl_slow.
>>
>> Since this function gets called during a cpu_exec, start_exclusive then
>> hangs. This exclusive section was introduced initially for security
>> reasons [2].
>>
>> I suspect this code path was never triggered, because ptw_setl_slow
>> would always be called transitively from cpu_exec, resulting in a hang.
>>
>> [1] https://gitlab.com/qemu-project/qemu/-/commit/6d03226b42247b68ab2f0b3663e0f624335a4055
>> [2] https://gitlab.com/qemu-project/qemu/-/issues/279
>>
>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> 
> Oh, wow.  I believe this will be fixed by
> 
> https://lore.kernel.org/qemu-devel/20241023033432.1353830-19-richard.henderson@linaro.org/
> 
> which is in a pending PR.
> 

I confirm this fixes the issue, and it's now merged upstream.

Re: [PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl
Posted by Pierrick Bouvier 1 month ago
On 10/24/24 09:25, Richard Henderson wrote:
> On 10/23/24 23:20, Pierrick Bouvier wrote:
>> When instrumenting memory accesses for plugins, we force memory accesses
>> to use the slow path for mmu. [1]
>> This creates a situation where we end up calling ptw_setl_slow.
>>
>> Since this function gets called during a cpu_exec, start_exclusive then
>> hangs. This exclusive section was introduced initially for security
>> reasons [2].
>>
>> I suspect this code path was never triggered, because ptw_setl_slow
>> would always be called transitively from cpu_exec, resulting in a hang.
>>
>> [1] https://gitlab.com/qemu-project/qemu/-/commit/6d03226b42247b68ab2f0b3663e0f624335a4055
>> [2] https://gitlab.com/qemu-project/qemu/-/issues/279
>>
>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
> 
> Oh, wow.  I believe this will be fixed by
> 
> https://lore.kernel.org/qemu-devel/20241023033432.1353830-19-richard.henderson@linaro.org/
> 
> which is in a pending PR.
> 

It might fix the issue by no longer triggering the situation we observed.
However, we still have a hidden dead code path where start_exclusive is
called from cpu_exec, unrelated to the plugins.

Re: [PATCH 1/2] target/i386: fix hang when using slow path for ptw_setl
Posted by Richard Henderson 4 weeks, 1 day ago
On 10/24/24 18:14, Pierrick Bouvier wrote:
> On 10/24/24 09:25, Richard Henderson wrote:
>> On 10/23/24 23:20, Pierrick Bouvier wrote:
>>> When instrumenting memory accesses for plugins, we force memory accesses
>>> to use the slow path for mmu. [1]
>>> This creates a situation where we end up calling ptw_setl_slow.
>>>
>>> Since this function gets called during a cpu_exec, start_exclusive then
>>> hangs. This exclusive section was introduced initially for security
>>> reasons [2].
>>>
>>> I suspect this code path was never triggered, because ptw_setl_slow
>>> would always be called transitively from cpu_exec, resulting in a hang.
>>>
>>> [1] https://gitlab.com/qemu-project/qemu/-/commit/6d03226b42247b68ab2f0b3663e0f624335a4055
>>> [2] https://gitlab.com/qemu-project/qemu/-/issues/279
>>>
>>> Fixes: https://gitlab.com/qemu-project/qemu/-/issues/2566
>>> Signed-off-by: Pierrick Bouvier <pierrick.bouvier@linaro.org>
>>
>> Oh, wow.  I believe this will be fixed by
>>
>> https://lore.kernel.org/qemu-devel/20241023033432.1353830-19-richard.henderson@linaro.org/
>>
>> which is in a pending PR.
>>
> 
> It might fix the issue by no longer triggering the situation we observed.
> However, we still have a hidden dead code path where start_exclusive is called from
> cpu_exec, unrelated to the plugins.

You're right, this would affect MMIO, were the OS so careless as to place page tables in MMIO.

>>> +    /* We are in cpu_exec, and start_exclusive can't be called directly.*/
>>> +    g_assert(current_cpu && current_cpu->running);
>>> +    cpu_exec_end(current_cpu);

Better to use env_cpu(in->env).


r~
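
The review suggestion above, deriving the CPU from the translation context instead of relying on a thread-local global, might look roughly like the following. This is a self-contained sketch, not the QEMU code: SketchCPU, SketchEnv, SketchTranslate, sketch_env_cpu and sketch_ptw_setl_slow are hypothetical names, the page table entry is modeled as a plain pointer, and the exclusive section is reduced to a comment.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the CPU, its env, and the walk context. */
typedef struct SketchCPU {
    bool running;
} SketchCPU;

typedef struct SketchEnv {
    SketchCPU *cpu;     /* back-pointer; an assumption of this sketch */
} SketchEnv;

typedef struct SketchTranslate {
    SketchEnv *env;
    uint32_t *pte;      /* models the guest page table entry */
} SketchTranslate;

/* Assumed analogue of an env-to-CPU accessor. */
static SketchCPU *sketch_env_cpu(SketchEnv *env)
{
    return env->cpu;
}

/* Compare-and-swap on the PTE, leaving the running state around it. */
static bool sketch_ptw_setl_slow(const SketchTranslate *in,
                                 uint32_t old, uint32_t new_val)
{
    /* Take the CPU from the context that was passed in. */
    SketchCPU *cpu = sketch_env_cpu(in->env);
    uint32_t cmp;

    assert(cpu->running);
    cpu->running = false;       /* models leaving the exec loop */

    /* The exclusive section would begin here. */
    cmp = *in->pte;
    if (cmp == old) {
        *in->pte = new_val;
    }
    /* The exclusive section would end here. */

    cpu->running = true;        /* models re-entering the exec loop */
    return cmp == old;
}
```

Taking the CPU from in->env keeps the function tied to the CPU whose page table walk is in progress, rather than to whatever the global happens to point at.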