[Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps

Pranith Kumar posted 1 patch 6 years, 8 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20170712221414.3867-1-bobby.prani@gmail.com
Test FreeBSD passed
Test checkpatch passed
Test docker passed
Test s390x passed
tcg/aarch64/tcg-target.inc.c | 14 +++++++++-----
1 file changed, 9 insertions(+), 5 deletions(-)
[Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
Posted by Pranith Kumar 6 years, 8 months ago
Use ADR instruction for shorter jumps.

I was going through rth's email and realized that I should have done
this the first time.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
---
 tcg/aarch64/tcg-target.inc.c | 14 +++++++++-----
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
index 04bc369a92..5121ebc1a1 100644
--- a/tcg/aarch64/tcg-target.inc.c
+++ b/tcg/aarch64/tcg-target.inc.c
@@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
         i1 = I3206_B | ((offset >> 2) & 0x3ffffff);
         i2 = NOP;
     } else {
-        offset = (addr >> 12) - (jmp_addr >> 12);
+        if (offset == sextract64(offset, 0, 21)) {
+            i1 = I3406_ADR;
+            i2 = NOP;
+        } else {
+            offset = (addr >> 12) - (jmp_addr >> 12);
 
-        /* patch ADRP */
-        i1 = I3406_ADRP | (offset & 3) << 29 | (offset & 0x1ffffc) << (5 - 2) | rd;
-        /* patch ADDI */
-        i2 = I3401_ADDI | rt << 31 | (addr & 0xfff) << 10 | rd << 5 | rd;
+            i1 = I3406_ADRP;
+            i2 = I3401_ADDI | rt << 31 | (addr & 0xfff) << 10 | rd << 5 | rd;
+        }
+        i1 |= (offset & 3) << 29 | (offset & 0x1ffffc) << (5 - 2) | rd;
     }
     pair = (uint64_t)i2 << 32 | i1;
     atomic_set((uint64_t *)jmp_addr, pair);
-- 
2.13.0


Re: [Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
Posted by Richard Henderson 6 years, 8 months ago
On 07/12/2017 12:14 PM, Pranith Kumar wrote:
> Use ADR instruction for shorter jumps.
> 
> I was going through rth's email and realized that I should have done
> this the first time.
> 
> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
> ---
>   tcg/aarch64/tcg-target.inc.c | 14 +++++++++-----
>   1 file changed, 9 insertions(+), 5 deletions(-)
> 
> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
> index 04bc369a92..5121ebc1a1 100644
> --- a/tcg/aarch64/tcg-target.inc.c
> +++ b/tcg/aarch64/tcg-target.inc.c
> @@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr, uintptr_t addr)
>           i1 = I3206_B | ((offset >> 2) & 0x3ffffff);
>           i2 = NOP;
>       } else {
> -        offset = (addr >> 12) - (jmp_addr >> 12);
> +        if (offset == sextract64(offset, 0, 21)) {
> +            i1 = I3406_ADR;
> +            i2 = NOP;
> +        } else {

This is dead code because it is covered by the direct jump above.
B has a 26-bit range, whereas ADR has a 21-bit range.


r~

Re: [Qemu-devel] [PATCH] tcg/aarch64: Use ADR for shorter jumps
Posted by Pranith Kumar 6 years, 8 months ago
On Wed, Jul 12, 2017 at 7:08 PM, Richard Henderson <rth@twiddle.net> wrote:
> On 07/12/2017 12:14 PM, Pranith Kumar wrote:
>>
>> Use ADR instruction for shorter jumps.
>>
>> I was going through rth's email and realized that I should have done
>> this the first time.
>>
>> Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
>> ---
>>   tcg/aarch64/tcg-target.inc.c | 14 +++++++++-----
>>   1 file changed, 9 insertions(+), 5 deletions(-)
>>
>> diff --git a/tcg/aarch64/tcg-target.inc.c b/tcg/aarch64/tcg-target.inc.c
>> index 04bc369a92..5121ebc1a1 100644
>> --- a/tcg/aarch64/tcg-target.inc.c
>> +++ b/tcg/aarch64/tcg-target.inc.c
>> @@ -886,12 +886,16 @@ void aarch64_tb_set_jmp_target(uintptr_t jmp_addr,
>> uintptr_t addr)
>>           i1 = I3206_B | ((offset >> 2) & 0x3ffffff);
>>           i2 = NOP;
>>       } else {
>> -        offset = (addr >> 12) - (jmp_addr >> 12);
>> +        if (offset == sextract64(offset, 0, 21)) {
>> +            i1 = I3406_ADR;
>> +            i2 = NOP;
>> +        } else {
>
>
> This is dead code because it is covered by the direct jump above.
> B has a 26-bit range, whereas ADR has a 21-bit range.

Hmm, yep. And I guess it doesn't make sense to use ADR for short jumps
because this will be a 2 instruction jump vs 1 instruction for direct
branch.

Thanks,
-- 
Pranith