[PATCH v3 14/14] RFC tcg/ppc: Disable TCG_REG_TB for Power9/Power10

Richard Henderson posted 14 patches 2 years, 5 months ago
Maintainers: Richard Henderson <richard.henderson@linaro.org>, WANG Xuerui <git@xen0n.name>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Aurelien Jarno <aurelien@aurel32.net>, Huacai Chen <chenhuacai@kernel.org>, Jiaxun Yang <jiaxun.yang@flygoat.com>, Aleksandar Rikalo <aleksandar.rikalo@syrmia.com>, Palmer Dabbelt <palmer@dabbelt.com>, Alistair Francis <Alistair.Francis@wdc.com>, Stefan Weil <sw@weilnetz.de>
There is a newer version of this series
Posted by Richard Henderson 2 years, 5 months ago
This may or may not improve performance.
It appears to result in slightly larger code,
but perhaps not enough to matter.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
---
 tcg/ppc/tcg-target.c.inc | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
index 20aaa90af2..c1e0efb498 100644
--- a/tcg/ppc/tcg-target.c.inc
+++ b/tcg/ppc/tcg-target.c.inc
@@ -83,7 +83,7 @@
 #define TCG_VEC_TMP2    TCG_REG_V1
 
 #define TCG_REG_TB     TCG_REG_R31
-#define USE_REG_TB     (TCG_TARGET_REG_BITS == 64)
+#define USE_REG_TB     (TCG_TARGET_REG_BITS == 64 && !have_isa_3_00)
 
 /* Shorthand for size of a pointer.  Avoid promotion to unsigned.  */
 #define SZP  ((int)sizeof(void *))
-- 
2.34.1
Re: [PATCH v3 14/14] RFC tcg/ppc: Disable TCG_REG_TB for Power9/Power10
Posted by Jordan Niethe 2 years, 4 months ago
On Wed, Aug 16, 2023 at 5:57 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> This may or may not improve performance.
> It appears to result in slightly larger code,
> but perhaps not enough to matter.

I have collected some Power9 macro-benchmark data for an SMP compile workload:

Setup
-----

- Power9 powernv host
- MTTCG guest, -smp 8

Method
------

- Warm-up compile of skiboot (https://github.com/open-power/skiboot)
- Average the time taken over 5 trials of compiling skiboot with -j `nproc`

Results
-------


|                Patch                | Mean time (s) | stdev | Decrease (%) |
|-------------------------------------|---------------|-------|--------------|
| tcg: Add tcg_out_tb_start...        |        161.77 |  2.39 |            - |
| tcg/ppc: Enable direct branching... |        145.81 |  1.71 |          9.9 |
| tcg/ppc: Use ADDPCIS...             |        146.44 |  1.28 |          9.5 |
| RFC tcg/ppc: Disable TCG_REG_TB...  |        145.95 |  1.07 |          9.7 |
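
The Decrease (%) column can be rechecked from the mean times, assuming it is measured against the first row of the table. A minimal C sketch; percent_decrease is a hypothetical helper written for this illustration, not part of QEMU:

```c
#include <assert.h>

/*
 * Percentage decrease of `mean` relative to `baseline`, both in seconds.
 * Hypothetical helper for illustration only.
 */
static double percent_decrease(double baseline, double mean)
{
    assert(baseline > 0.0);   /* a zero baseline would divide by zero */
    return (baseline - mean) / baseline * 100.0;
}
```

For example, percent_decrease(161.77, 145.81) is roughly 9.87, which matches the 9.9 reported for the direct branching patch after rounding.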


- Enabling direct branching is a clear performance gain; beyond that, the
results are less conclusive.
- Using addpcis for direct branching seems slightly better than the bl +4
sequence for ISA v3.0.
- PC-relative addressing seems slightly better than TOC-relative addressing.
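
For reference, the condition changed by the RFC patch can be sketched standalone. This only illustrates how the two macro definitions evaluate. It assumes have_isa_3_00 can be modeled as a plain bool, and TCG_TARGET_REG_BITS, the _OLD/_NEW macro names, and use_reg_tb_new are stand-ins defined here, not the real QEMU definitions:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for illustration; in QEMU these come from the TCG headers. */
#define TCG_TARGET_REG_BITS 64
static bool have_isa_3_00;   /* assumed: true on ISA v3.0 hosts (Power9/Power10) */

/* Condition before the patch: depends only on the register width. */
#define USE_REG_TB_OLD  (TCG_TARGET_REG_BITS == 64)
/* Condition after the patch: additionally false on ISA v3.0 hosts. */
#define USE_REG_TB_NEW  (TCG_TARGET_REG_BITS == 64 && !have_isa_3_00)

/* Hypothetical wrapper: evaluate the new condition for a given ISA level. */
static bool use_reg_tb_new(bool isa_3_00)
{
    have_isa_3_00 = isa_3_00;
    return USE_REG_TB_NEW;
}
```

With these stand-ins, a 64-bit host without ISA v3.0 still uses TCG_REG_TB, while a Power9 or Power10 host does not.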

Any other suggestions for performance comparison?
I still have to try on a Power10.

>
> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
> ---
>  tcg/ppc/tcg-target.c.inc | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tcg/ppc/tcg-target.c.inc b/tcg/ppc/tcg-target.c.inc
> index 20aaa90af2..c1e0efb498 100644
> --- a/tcg/ppc/tcg-target.c.inc
> +++ b/tcg/ppc/tcg-target.c.inc
> @@ -83,7 +83,7 @@
>  #define TCG_VEC_TMP2    TCG_REG_V1
>
>  #define TCG_REG_TB     TCG_REG_R31
> -#define USE_REG_TB     (TCG_TARGET_REG_BITS == 64)
> +#define USE_REG_TB     (TCG_TARGET_REG_BITS == 64 && !have_isa_3_00)
>
>  /* Shorthand for size of a pointer.  Avoid promotion to unsigned.  */
>  #define SZP  ((int)sizeof(void *))
> --
> 2.34.1
>