[v1] Fixes for TCG hangs

[Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Alex Bennée 8 years, 8 months ago

While the next TB would detect the exit flag has been set there is no
point if we can exit sooner. We also check cpu->interrupt_request as
some front-ends can set it rather than using the cpu_interrupt() API
call and would normally be expecting the IRQ to get picked up on the
previously fairly regular exits from the run loop.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
---
 tcg-runtime.c | 47 +++++++++++++++++++++++++++--------------------
 1 file changed, 27 insertions(+), 20 deletions(-)

diff --git a/tcg-runtime.c b/tcg-runtime.c
index f4bfa9cea6..a025a6f194 100644
--- a/tcg-runtime.c
+++ b/tcg-runtime.c
@@ -147,28 +147,35 @@ uint64_t HELPER(ctpop_i64)(uint64_t arg)
 void *HELPER(lookup_tb_ptr)(CPUArchState *env, target_ulong addr)
 {
     CPUState *cpu = ENV_GET_CPU(env);
-    unsigned int addr_hash = tb_jmp_cache_hash_func(addr);
     void *code_ptr = NULL;
-    TranslationBlock *tb;
-
-    tb = atomic_rcu_read(&cpu->tb_jmp_cache[addr_hash]);
-    if (likely(tb)) {
-        target_ulong cs_base, pc;
-        uint32_t flags;
-
-        cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
-
-        if (likely(tb->pc == addr && tb->cs_base == cs_base &&
-                   tb->flags == flags)) {
-            code_ptr = tb->tc_ptr;
-        } else {
-            /* If we didn't find it in the jmp_cache we still might
-             * find it in the global tb_htable
-             */
-            tb = tb_htable_lookup(cpu, addr, cs_base, flags);
-            if (likely(tb)) {
-                atomic_set(&cpu->tb_jmp_cache[addr_hash], tb);
+
+    /* If there is an interrupt pending request or the TCG exit flag
+     * has been set we might as well stop here and return to the main
+     * loop.
+     */
+    if (!cpu->icount_decr.u16.high && !cpu->interrupt_request) {
+        unsigned int addr_hash = tb_jmp_cache_hash_func(addr);
+        TranslationBlock *tb;
+
+        tb = atomic_rcu_read(&cpu->tb_jmp_cache[addr_hash]);
+        if (likely(tb)) {
+            target_ulong cs_base, pc;
+            uint32_t flags;
+
+            cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
+
+            if (likely(tb->pc == addr && tb->cs_base == cs_base &&
+                       tb->flags == flags)) {
                 code_ptr = tb->tc_ptr;
+            } else {
+                /* If we didn't find it in the jmp_cache we still might
+                 * find it in the global tb_htable
+                 */
+                tb = tb_htable_lookup(cpu, addr, cs_base, flags);
+                if (likely(tb)) {
+                    atomic_set(&cpu->tb_jmp_cache[addr_hash], tb);
+                    code_ptr = tb->tc_ptr;
+                }
             }
         }
     }
-- 
2.13.0

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Richard Henderson 8 years, 8 months ago

On 06/14/2017 07:02 AM, Alex Bennée wrote:
> While the next TB would detect the exit flag has been set there is no
> point if we can exit sooner. We also check cpu->interrupt_request as
> some front-ends can set it rather than using the cpu_interrupt() API
> call and would normally be expecting the IRQ to get picked up on the
> previously fairly regular exits from the run loop.
> 
> Signed-off-by: Alex Bennée<alex.bennee@linaro.org>
> ---
>   tcg-runtime.c | 47 +++++++++++++++++++++++++++--------------------
>   1 file changed, 27 insertions(+), 20 deletions(-)

I can confirm this fixes the Alpha hang as well.


r~

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Paolo Bonzini 8 years, 8 months ago

On 14/06/2017 17:45, Richard Henderson wrote:
> While the next TB would detect the exit flag has been set there is no
> point if we can exit sooner. We also check cpu->interrupt_request as
> some front-ends can set it rather than using the cpu_interrupt() API
> call and would normally be expecting the IRQ to get picked up on the
> previously fairly regular exits from the run loop. 

This is not what happens actually; it's not about front-ends setting 
cpu->interrupt_request, it's about front-ends doing exit_tb when they 
wanted to re-evaluate cpu_handle_interrupt.

cpu_exit is used when device code causes a rising edge in 
cpu->interrupt_request.  What we have here is that the MSR write causes 
cc->cpu_exec_interrupt's return value to change from false to true.

I think this is a band-aid, and would rather fix the front-ends as in 
Emilio's patch.  For Alpha my guess would be:

diff --git a/target/alpha/translate.c b/target/alpha/translate.c
index 7c45ae360c..6e2ee3f958 100644
--- a/target/alpha/translate.c
+++ b/target/alpha/translate.c
@@ -1198,7 +1198,9 @@ static ExitStatus gen_call_pal(DisasContext *ctx, int palcode)
             tcg_gen_andi_i64(tmp, ctx->ir[IR_A0], PS_INT_MASK);
              tcg_gen_st8_i64(tmp, cpu_env, offsetof(CPUAlphaState, ps));
             tcg_temp_free(tmp);
-            break;
+
+            /* Reevaluate interrupts */
+            return EXIT_PC_STALE;

         case 0x36:
             /* RDPS */

It's okay as a last resort I guess, but there's still a lot of
time before 2.9.

Paolo

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Richard Henderson 8 years, 8 months ago

On 06/14/2017 09:08 AM, Paolo Bonzini wrote:
> I think this is a band-aid, and would rather fix the front-ends as in
> Emilio's patch.  For Alpha my guess would be:
> 
> diff --git a/target/alpha/translate.c b/target/alpha/translate.c
> index 7c45ae360c..6e2ee3f958 100644
> --- a/target/alpha/translate.c
> +++ b/target/alpha/translate.c
> @@ -1198,7 +1198,9 @@ static ExitStatus gen_call_pal(DisasContext *ctx, int palcode)
>               tcg_gen_andi_i64(tmp, ctx->ir[IR_A0], PS_INT_MASK);
>                tcg_gen_st8_i64(tmp, cpu_env, offsetof(CPUAlphaState, ps));
>               tcg_temp_free(tmp);
> -            break;
> +
> +            /* Reevaluate interrupts */
> +            return EXIT_PC_STALE;
>   
>           case 0x36:
>               /* RDPS */

Thanks!

You're right that adjusting SWPIPL along these lines does fix the problem for 
Alpha.  Given that Alpha would typically hang in arch_idle, I'd been focusing 
primarily on WTINT.


r~

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Alex Bennée 8 years, 8 months ago

Paolo Bonzini <pbonzini@redhat.com> writes:

> On 14/06/2017 17:45, Richard Henderson wrote:
>> While the next TB would detect the exit flag has been set there is no
>> point if we can exit sooner. We also check cpu->interrupt_request as
>> some front-ends can set it rather than using the cpu_interrupt() API
>> call and would normally be expecting the IRQ to get picked up on the
>> previously fairly regular exits from the run loop.
>
> This is not what happens actually; it's not about front-ends setting
> cpu->interrupt_request, it's about front-ends doing exit_tb when they
> wanted to re-evaluate cpu_handle_interrupt.
>
> cpu_exit is used when device code causes a rising edge in
> cpu->interrupt_request.  What we have here is that the MSR write causes
> cc->cpu_exec_interrupt's return value to change from false to true.
>
> I think this is a band-aid, and would rather fix the front-ends as in
> Emilio's patch.

It seems a shame to cause all msr accesses to trigger and exit when we
only care about the unmasking case. How about:


Author: Alex Bennée <alex.bennee@linaro.org>
Date:   Wed Jun 14 18:46:01 2017 +0100

    target/arm/op_helper: ensure we exit the run-loop

    When IRQs are un-masked we need to ensure the run-loop is exited so we
    can evaluate arm_cpu_do_interrupt.

    Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
index 2a85666579..7e67bb3db2 100644
--- a/target/arm/op_helper.c
+++ b/target/arm/op_helper.c
@@ -835,6 +835,9 @@ void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
         break;
     case 0x1f: /* DAIFClear */
         env->daif &= ~((imm << 6) & PSTATE_DAIF);
+        /* This may result in pending IRQs being unmasked so ensure we
+           exit the loop */
+        cpu_exit(ENV_GET_CPU(env));
         break;
     default:
         g_assert_not_reached();

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Richard Henderson 8 years, 8 months ago

On 06/14/2017 10:49 AM, Alex Bennée wrote:
>> I think this is a band-aid, and would rather fix the front-ends as in
>> Emilio's patch.
> 
> It seems a shame to cause all msr accesses to trigger and exit when we
> only care about the unmasking case. How about:
> 
> 
> Author: Alex Bennée <alex.bennee@linaro.org>
> Date:   Wed Jun 14 18:46:01 2017 +0100
> 
>      target/arm/op_helper: ensure we exit the run-loop
> 
>      When IRQs are un-masked we need to ensure the run-loop is exited so we
>      can evaluate arm_cpu_do_interrupt.
> 
>      Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
> 
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index 2a85666579..7e67bb3db2 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -835,6 +835,9 @@ void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
>           break;
>       case 0x1f: /* DAIFClear */
>           env->daif &= ~((imm << 6) & PSTATE_DAIF);
> +        /* This may result in pending IRQs being unmasked so ensure we
> +           exit the loop */
> +        cpu_exit(ENV_GET_CPU(env));

That works for me.  And I guess that takes care of any potential problems with 
A32 as well?


r~

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Peter Maydell 8 years, 8 months ago

On 14 June 2017 at 18:49, Alex Bennée <alex.bennee@linaro.org> wrote:
> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
> index 2a85666579..7e67bb3db2 100644
> --- a/target/arm/op_helper.c
> +++ b/target/arm/op_helper.c
> @@ -835,6 +835,9 @@ void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
>          break;
>      case 0x1f: /* DAIFClear */
>          env->daif &= ~((imm << 6) & PSTATE_DAIF);
> +        /* This may result in pending IRQs being unmasked so ensure we
> +           exit the loop */
> +        cpu_exit(ENV_GET_CPU(env));
>          break;
>      default:
>          g_assert_not_reached();

The 'op' field we're switching on here is just a constant
from the instruction encoding, so I'd rather see us
identify that in translate-a64.c and end the TB or
whatever when we need to, rather than doing the
longjump-out-of-here that cpu_exit() does at runtime.

thanks
-- PMM

Re: [Qemu-devel] [PATCH v1 3/3] tcg-runtime: short-circuit lookup_tb_ptr on IRQs

Posted by Richard Henderson 8 years, 8 months ago

On 06/14/2017 12:11 PM, Peter Maydell wrote:
> On 14 June 2017 at 18:49, Alex Bennée <alex.bennee@linaro.org> wrote:
>> diff --git a/target/arm/op_helper.c b/target/arm/op_helper.c
>> index 2a85666579..7e67bb3db2 100644
>> --- a/target/arm/op_helper.c
>> +++ b/target/arm/op_helper.c
>> @@ -835,6 +835,9 @@ void HELPER(msr_i_pstate)(CPUARMState *env, uint32_t op, uint32_t imm)
>>           break;
>>       case 0x1f: /* DAIFClear */
>>           env->daif &= ~((imm << 6) & PSTATE_DAIF);
>> +        /* This may result in pending IRQs being unmasked so ensure we
>> +           exit the loop */
>> +        cpu_exit(ENV_GET_CPU(env));
>>           break;
>>       default:
>>           g_assert_not_reached();
> 
> The 'op' field we're switching on here is just a constant
> from the instruction encoding, so I'd rather see us
> identify that in translate-a64.c and end the TB or
> whatever when we need to, rather than doing the
> longjump-out-of-here that cpu_exit() does at runtime.

cpu_exit isn't the longjmp; this is just a set of exit_request and icount_decr.

That said, you're right that we can do this more directly from the translator.


r~