sh4: mac.w: memory accesses are 16-bit words

[PATCH] sh4: mac.w: memory accesses are 16-bit words

Posted by Zack Buhman 1 year, 10 months ago

Before this change, executing a code sequence such as:

           mova   tblm,r0
           mov    r0,r1
           mova   tbln,r0
           clrs
           clrmac
           mac.w  @r0+,@r1+
           mac.w  @r0+,@r1+

           .align 4
  tblm:    .word  0x1234
           .word  0x5678
  tbln:    .word  0x9abc
           .word  0xdefg

Does not result in correct behavior:

Expected behavior:
  first macw : macl = 0x1234 * 0x9abc + 0x0
               mach = 0x0

  second macw: macl = 0x5678 * 0xdefg + 0xb00a630
               mach = 0x0

Observed behavior (qemu-sh4eb, prior to this commit):

  first macw : macl = 0x5678 * 0xdefg + 0x0
               mach = 0x0

  second macw: (unaligned longword memory access, SIGBUS)

Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory
access, not a 32-bit longword memory access.

Signed-off-by: Zack Buhman <zack@buhman.org>
---
 target/sh4/translate.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/target/sh4/translate.c b/target/sh4/translate.c
index a9b1bc7524..6643c14dde 100644
--- a/target/sh4/translate.c
+++ b/target/sh4/translate.c
@@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx)
             TCGv arg0, arg1;
             arg0 = tcg_temp_new();
             tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
-                                MO_TESL | MO_ALIGN);
+                                MO_TESW | MO_ALIGN);
             arg1 = tcg_temp_new();
             tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
-                                MO_TESL | MO_ALIGN);
+                                MO_TESW | MO_ALIGN);
             gen_helper_macw(tcg_env, arg0, arg1);
             tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2);
             tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);
-- 
2.41.0

Re: [PATCH] sh4: mac.w: memory accesses are 16-bit words

Posted by Philippe Mathieu-Daudé 1 year, 10 months ago

On 2/4/24 11:37, Zack Buhman wrote:
> Before this change, executing a code sequence such as:
> 
>             mova   tblm,r0
>             mov    r0,r1
>             mova   tbln,r0
>             clrs
>             clrmac
>             mac.w  @r0+,@r1+
>             mac.w  @r0+,@r1+
> 
>             .align 4
>    tblm:    .word  0x1234
>             .word  0x5678
>    tbln:    .word  0x9abc
>             .word  0xdefg
> 
> Does not result in correct behavior:
> 
> Expected behavior:
>    first macw : macl = 0x1234 * 0x9abc + 0x0
>                 mach = 0x0
> 
>    second macw: macl = 0x5678 * 0xdefg + 0xb00a630
>                 mach = 0x0
> 
> Observed behavior (qemu-sh4eb, prior to this commit):
> 
>    first macw : macl = 0x5678 * 0xdefg + 0x0
>                 mach = 0x0
> 
>    second macw: (unaligned longword memory access, SIGBUS)
> 
> Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory
> access, not a 32-bit longword memory access.
> 
> Signed-off-by: Zack Buhman <zack@buhman.org>
> ---
>   target/sh4/translate.c | 4 ++--
>   1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> index a9b1bc7524..6643c14dde 100644
> --- a/target/sh4/translate.c
> +++ b/target/sh4/translate.c
> @@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx)
>               TCGv arg0, arg1;
>               arg0 = tcg_temp_new();
>               tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
> -                                MO_TESL | MO_ALIGN);
> +                                MO_TESW | MO_ALIGN);
>               arg1 = tcg_temp_new();
>               tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
> -                                MO_TESL | MO_ALIGN);
> +                                MO_TESW | MO_ALIGN);

Apparently invalid since its introduction in commit fdf9b3e831.

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>

>               gen_helper_macw(tcg_env, arg0, arg1);
>               tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2);
>               tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);

Re: [PATCH] sh4: mac.w: memory accesses are 16-bit words

Posted by Yoshinori Sato 1 year, 10 months ago

On Tue, 02 Apr 2024 22:26:25 +0900,
Philippe Mathieu-Daudé wrote:
> 
> On 2/4/24 11:37, Zack Buhman wrote:
> > Before this change, executing a code sequence such as:
> > 
> >             mova   tblm,r0
> >             mov    r0,r1
> >             mova   tbln,r0
> >             clrs
> >             clrmac
> >             mac.w  @r0+,@r1+
> >             mac.w  @r0+,@r1+
> > 
> >             .align 4
> >    tblm:    .word  0x1234
> >             .word  0x5678
> >    tbln:    .word  0x9abc
> >             .word  0xdefg
> > 
> > Does not result in correct behavior:
> > 
> > Expected behavior:
> >    first macw : macl = 0x1234 * 0x9abc + 0x0
> >                 mach = 0x0
> > 
> >    second macw: macl = 0x5678 * 0xdefg + 0xb00a630
> >                 mach = 0x0
> > 
> > Observed behavior (qemu-sh4eb, prior to this commit):
> > 
> >    first macw : macl = 0x5678 * 0xdefg + 0x0
> >                 mach = 0x0
> > 
> >    second macw: (unaligned longword memory access, SIGBUS)
> > 
> > Various SH-4 ISA manuals also confirm that `mac.w` is a 16-bit word memory
> > access, not a 32-bit longword memory access.
> > 
> > Signed-off-by: Zack Buhman <zack@buhman.org>
> > ---
> >   target/sh4/translate.c | 4 ++--
> >   1 file changed, 2 insertions(+), 2 deletions(-)
> > 
> > diff --git a/target/sh4/translate.c b/target/sh4/translate.c
> > index a9b1bc7524..6643c14dde 100644
> > --- a/target/sh4/translate.c
> > +++ b/target/sh4/translate.c
> > @@ -816,10 +816,10 @@ static void _decode_opc(DisasContext * ctx)
> >               TCGv arg0, arg1;
> >               arg0 = tcg_temp_new();
> >               tcg_gen_qemu_ld_i32(arg0, REG(B7_4), ctx->memidx,
> > -                                MO_TESL | MO_ALIGN);
> > +                                MO_TESW | MO_ALIGN);
> >               arg1 = tcg_temp_new();
> >               tcg_gen_qemu_ld_i32(arg1, REG(B11_8), ctx->memidx,
> > -                                MO_TESL | MO_ALIGN);
> > +                                MO_TESW | MO_ALIGN);
> 
> Apparently invalid since its introduction in commit fdf9b3e831.
> 
> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
> 
> >               gen_helper_macw(tcg_env, arg0, arg1);
> >               tcg_gen_addi_i32(REG(B11_8), REG(B11_8), 2);
> >               tcg_gen_addi_i32(REG(B7_4), REG(B7_4), 2);
> 

SH4 Software manual said.
https://www.renesas.com/us/en/document/mas/sh-4-software-manual
> This instruction performs signed multiplication of the 16-bit operands
> whose addresses are the contents of general registers Rm and Rn,
> adds the 32-bit result to the MAC register contents, and stores the
> result in the MAC register. Operands Rm and Rn are each incremented
> by 2 each time they are read.

Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>

-- 
Yosinori Sato