[PATCH] tcg/ppc: Fix race in goto_tb implementation

Jordan Niethe posted 1 patch 9 months, 4 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230717012327.20149-1-jniethe5@gmail.com
Maintainers: Richard Henderson <richard.henderson@linaro.org>
There is a newer version of this series
tcg/ppc/tcg-target.c.inc | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
[PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Jordan Niethe 9 months, 4 weeks ago
Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
goto_tb to ensure only a single instruction was patched to prevent
incorrect behaviour if a thread was in the middle of multiple
instructions when they were replaced. However this introduced a race
between loading the jmp target into TCG_REG_TB and patching and
executing the direct branch.

The relevent part of the goto_tb implementation:

    ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
  patch_location:
    mtctr TCG_REG_TB
    bctr

tb_target_set_jmp_target() will replace 'patch_location' with a direct
branch if the target is in range. The direct branch now relies on
TCG_REG_TB being set up correctly by the ld. Prior to this commit
multiple instructions were patched in for the direct branch case; these
instructions would initalise TCG_REG_TB to the same value as the branch
target.

Imagine the following sequence:

1) Thread A is executing the goto_tb sequence and loads the jmp
   target into TCG_REG_TB.

2) Thread B updates the jmp target address and calls
   tb_target_set_jmp_target(). This patches a new direct branch into the
   goto_tb sequence.

3) Thread A executes the newly patched direct branch. The value in
   TCG_REG_TB still contains the old jmp target.

TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
eventually crash after performing memory accesses generated from a
faulty value in TCG_REG_TB.

This presents as segfaults or illegal instruction exceptions.

Do not revert commit 20b6643324 as it did fix a different race
condition. Instead remove the direct branch optimization and always use
indirect branches.

The direct branch optimization can be re-added later with a race free
sequence.

Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726

Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")

Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
---
 tcg/ppc/tcg-target.c.inc | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 8d6899cf40..a7323f479b 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -2533,11 +2533,10 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
         ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
         tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
     
-        /* Direct branch will be patched by tb_target_set_jmp_target. */
+        /* TODO: Use direct branches when possible. */
         set_jmp_insn_offset(s, which);
         tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
 
-        /* When branch is out of range, fall through to indirect. */
         tcg_out32(s, BCCTR | BO_ALWAYS);
 
         /* For the unlinked case, need to reset TCG_REG_TB.  */
@@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
     intptr_t diff = addr - jmp_rx;
     tcg_insn_unit insn;
 
+    if (USE_REG_TB)
+        return;
+
     if (in_range_b(diff)) {
         insn = B | (diff & 0x3fffffc);
-    } else if (USE_REG_TB) {
-        insn = MTSPR | RS(TCG_REG_TB) | CTR;
     } else {
         insn = NOP;
     }
-- 
2.39.3
Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Jordan Niethe 9 months, 4 weeks ago
On Mon, Jul 17, 2023 at 11:24 AM Jordan Niethe <jniethe5@gmail.com> wrote:
[snip]
>
> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
> Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>

Sorry, should also have:
Signed-off-by: Benjamin Gray <bgray@linux.ibm.com>

[snip]
Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Michael Tokarev 9 months, 4 weeks ago
17.07.2023 04:23, Jordan Niethe wrote:
> Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
> goto_tb to ensure only a single instruction was patched to prevent
> incorrect behaviour if a thread was in the middle of multiple
> instructions when they were replaced. However this introduced a race
> between loading the jmp target into TCG_REG_TB and patching and
> executing the direct branch.
> 
> The relevent part of the goto_tb implementation:
> 
>      ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
>    patch_location:
>      mtctr TCG_REG_TB
>      bctr
> 
> tb_target_set_jmp_target() will replace 'patch_location' with a direct
> branch if the target is in range. The direct branch now relies on
> TCG_REG_TB being set up correctly by the ld. Prior to this commit
> multiple instructions were patched in for the direct branch case; these
> instructions would initalise TCG_REG_TB to the same value as the branch
> target.
> 
> Imagine the following sequence:
> 
> 1) Thread A is executing the goto_tb sequence and loads the jmp
>     target into TCG_REG_TB.
> 
> 2) Thread B updates the jmp target address and calls
>     tb_target_set_jmp_target(). This patches a new direct branch into the
>     goto_tb sequence.
> 
> 3) Thread A executes the newly patched direct branch. The value in
>     TCG_REG_TB still contains the old jmp target.
> 
> TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
> eventually crash after performing memory accesses generated from a
> faulty value in TCG_REG_TB.
> 
> This presents as segfaults or illegal instruction exceptions.
> 
> Do not revert commit 20b6643324 as it did fix a different race
> condition. Instead remove the direct branch optimization and always use
> indirect branches.
> 
> The direct branch optimization can be re-added later with a race free
> sequence.
> 
> Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726

I confirm this fixes the issue hit by debian as well, - 30 runs in
a row already with this patch and counting, while before it failed
almost reliable on first try, sometimes on 2nd.

Tested-by: Michael Tokarev <mjt@tls.msk.ru>

(this is max I can do, as I don't know tcg at all :) )

Thank you very much for the fix!

/mjt

> @@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
>       intptr_t diff = addr - jmp_rx;
>       tcg_insn_unit insn;
>   
> +    if (USE_REG_TB)
> +        return;
> +
>       if (in_range_b(diff)) {
>           insn = B | (diff & 0x3fffffc);
Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Richard Henderson 9 months, 4 weeks ago
On 7/17/23 02:23, Jordan Niethe wrote:
> Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
> goto_tb to ensure only a single instruction was patched to prevent
> incorrect behaviour if a thread was in the middle of multiple
> instructions when they were replaced. However this introduced a race
> between loading the jmp target into TCG_REG_TB and patching and
> executing the direct branch.
> 
> The relevent part of the goto_tb implementation:
> 
>      ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
>    patch_location:
>      mtctr TCG_REG_TB
>      bctr
> 
> tb_target_set_jmp_target() will replace 'patch_location' with a direct
> branch if the target is in range. The direct branch now relies on
> TCG_REG_TB being set up correctly by the ld. Prior to this commit
> multiple instructions were patched in for the direct branch case; these
> instructions would initalise TCG_REG_TB to the same value as the branch
> target.
> 
> Imagine the following sequence:
> 
> 1) Thread A is executing the goto_tb sequence and loads the jmp
>     target into TCG_REG_TB.
> 
> 2) Thread B updates the jmp target address and calls
>     tb_target_set_jmp_target(). This patches a new direct branch into the
>     goto_tb sequence.
> 
> 3) Thread A executes the newly patched direct branch. The value in
>     TCG_REG_TB still contains the old jmp target.
> 
> TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
> eventually crash after performing memory accesses generated from a
> faulty value in TCG_REG_TB.
> 
> This presents as segfaults or illegal instruction exceptions.
> 
> Do not revert commit 20b6643324 as it did fix a different race
> condition. Instead remove the direct branch optimization and always use
> indirect branches.
> 
> The direct branch optimization can be re-added later with a race free
> sequence.
> 
> Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726
> 
> Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")
> 
> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
> Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
> ---
>   tcg/ppc/tcg-target.c.inc | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 8d6899cf40..a7323f479b 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -2533,11 +2533,10 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
>           ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
>           tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
>       
> -        /* Direct branch will be patched by tb_target_set_jmp_target. */
> +        /* TODO: Use direct branches when possible. */
>           set_jmp_insn_offset(s, which);
>           tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
>   
> -        /* When branch is out of range, fall through to indirect. */
>           tcg_out32(s, BCCTR | BO_ALWAYS);
>   
>           /* For the unlinked case, need to reset TCG_REG_TB.  */
> @@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
>       intptr_t diff = addr - jmp_rx;
>       tcg_insn_unit insn;
>   
> +    if (USE_REG_TB)
> +        return;
> +

Braces.  Otherwise,

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>

This is an excellent reason to add support for power10, and its pc-relative references. 
We can disable REG_TB for that case.


r~
Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Cédric Le Goater 9 months, 4 weeks ago
On 7/17/23 09:35, Richard Henderson wrote:
> On 7/17/23 02:23, Jordan Niethe wrote:
>> Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
>> goto_tb to ensure only a single instruction was patched to prevent
>> incorrect behaviour if a thread was in the middle of multiple
>> instructions when they were replaced. However this introduced a race
>> between loading the jmp target into TCG_REG_TB and patching and
>> executing the direct branch.
>>
>> The relevent part of the goto_tb implementation:
>>
>>      ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
>>    patch_location:
>>      mtctr TCG_REG_TB
>>      bctr
>>
>> tb_target_set_jmp_target() will replace 'patch_location' with a direct
>> branch if the target is in range. The direct branch now relies on
>> TCG_REG_TB being set up correctly by the ld. Prior to this commit
>> multiple instructions were patched in for the direct branch case; these
>> instructions would initalise TCG_REG_TB to the same value as the branch
>> target.
>>
>> Imagine the following sequence:
>>
>> 1) Thread A is executing the goto_tb sequence and loads the jmp
>>     target into TCG_REG_TB.
>>
>> 2) Thread B updates the jmp target address and calls
>>     tb_target_set_jmp_target(). This patches a new direct branch into the
>>     goto_tb sequence.
>>
>> 3) Thread A executes the newly patched direct branch. The value in
>>     TCG_REG_TB still contains the old jmp target.
>>
>> TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
>> eventually crash after performing memory accesses generated from a
>> faulty value in TCG_REG_TB.
>>
>> This presents as segfaults or illegal instruction exceptions.
>>
>> Do not revert commit 20b6643324 as it did fix a different race
>> condition. Instead remove the direct branch optimization and always use
>> indirect branches.
>>
>> The direct branch optimization can be re-added later with a race free
>> sequence.
>>
>> Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726

Resolves: ...

Please send a v2 with updated tags. Run scripts/checkpatch.pl also.

Thanks,

C.


>>
>> Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")
>>
>> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
>> Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
>> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
>> ---
>>   tcg/ppc/tcg-target.c.inc | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
>> index 8d6899cf40..a7323f479b 100644
>> --- a/tcg/ppc/tcg-target.c.inc
>> +++ b/tcg/ppc/tcg-target.c.inc
>> @@ -2533,11 +2533,10 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
>>           ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
>>           tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
>> -        /* Direct branch will be patched by tb_target_set_jmp_target. */
>> +        /* TODO: Use direct branches when possible. */
>>           set_jmp_insn_offset(s, which);
>>           tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
>> -        /* When branch is out of range, fall through to indirect. */
>>           tcg_out32(s, BCCTR | BO_ALWAYS);
>>           /* For the unlinked case, need to reset TCG_REG_TB.  */
>> @@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
>>       intptr_t diff = addr - jmp_rx;
>>       tcg_insn_unit insn;
>> +    if (USE_REG_TB)
>> +        return;
>> +
> 
> Braces.  Otherwise,
> 
> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
> 
> This is an excellent reason to add support for power10, and its pc-relative references. We can disable REG_TB for that case.
> 
> 
> r~


Re: [PATCH] tcg/ppc: Fix race in goto_tb implementation
Posted by Anushree Mathur 9 months, 4 weeks ago
On 7/17/23 06:53, Jordan Niethe wrote:
> Commit 20b6643324 ("tcg/ppc: Reorg goto_tb implementation") modified
> goto_tb to ensure only a single instruction was patched to prevent
> incorrect behaviour if a thread was in the middle of multiple
> instructions when they were replaced. However this introduced a race
> between loading the jmp target into TCG_REG_TB and patching and
> executing the direct branch.
>
> The relevent part of the goto_tb implementation:
>
>      ld TCG_REG_TB, TARGET_ADDR_LOCATION(TCG_REG_TB)
>    patch_location:
>      mtctr TCG_REG_TB
>      bctr
>
> tb_target_set_jmp_target() will replace 'patch_location' with a direct
> branch if the target is in range. The direct branch now relies on
> TCG_REG_TB being set up correctly by the ld. Prior to this commit
> multiple instructions were patched in for the direct branch case; these
> instructions would initalise TCG_REG_TB to the same value as the branch
> target.
>
> Imagine the following sequence:
>
> 1) Thread A is executing the goto_tb sequence and loads the jmp
>     target into TCG_REG_TB.
>
> 2) Thread B updates the jmp target address and calls
>     tb_target_set_jmp_target(). This patches a new direct branch into the
>     goto_tb sequence.
>
> 3) Thread A executes the newly patched direct branch. The value in
>     TCG_REG_TB still contains the old jmp target.
>
> TCG_REG_TB MUST contain the translation block's tc.ptr. Execution will
> eventually crash after performing memory accesses generated from a
> faulty value in TCG_REG_TB.
>
> This presents as segfaults or illegal instruction exceptions.
>
> Do not revert commit 20b6643324 as it did fix a different race
> condition. Instead remove the direct branch optimization and always use
> indirect branches.
>
> The direct branch optimization can be re-added later with a race free
> sequence.
>
> Gitlab issue: https://gitlab.com/qemu-project/qemu/-/issues/1726
>
> Fixes: 20b6643324 ("tcg/ppc: Reorg goto_tb implementation")
>
> Reported-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>
> Co-developed-by: Benjamin Gray <bgray@linux.ibm.com>
> Signed-off-by: Jordan Niethe <jniethe5@gmail.com>
I tested with -smp 2/4/8/16 and it works fine with all.

Tested-by: Anushree Mathur <anushree.mathur@linux.vnet.ibm.com>

Thanks,

Anushree-Mathur

> ---
>   tcg/ppc/tcg-target.c.inc | 8 ++++----
>   1 file changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 8d6899cf40..a7323f479b 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -2533,11 +2533,10 @@ static void tcg_out_goto_tb(TCGContext *s, int which)
>           ptrdiff_t offset = tcg_tbrel_diff(s, (void *)ptr);
>           tcg_out_mem_long(s, LD, LDX, TCG_REG_TB, TCG_REG_TB, offset);
>       
> -        /* Direct branch will be patched by tb_target_set_jmp_target. */
> +        /* TODO: Use direct branches when possible. */
>           set_jmp_insn_offset(s, which);
>           tcg_out32(s, MTSPR | RS(TCG_REG_TB) | CTR);
>
> -        /* When branch is out of range, fall through to indirect. */
>           tcg_out32(s, BCCTR | BO_ALWAYS);
>
>           /* For the unlinked case, need to reset TCG_REG_TB.  */
> @@ -2565,10 +2564,11 @@ void tb_target_set_jmp_target(const TranslationBlock *tb, int n,
>       intptr_t diff = addr - jmp_rx;
>       tcg_insn_unit insn;
>
> +    if (USE_REG_TB)
> +        return;
> +
>       if (in_range_b(diff)) {
>           insn = B | (diff & 0x3fffffc);
> -    } else if (USE_REG_TB) {
> -        insn = MTSPR | RS(TCG_REG_TB) | CTR;
>       } else {
>           insn = NOP;
>       }