[PATCH v3] target/ppc: Implement new wait variants

Nicholas Piggin posted 1 patch 1 year, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220720133352.904263-1-npiggin@gmail.com
Maintainers: "Cédric Le Goater" <clg@kaod.org>, Daniel Henrique Barboza <danielhb413@gmail.com>, David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>
target/ppc/internal.h  |  3 ++
target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
2 files changed, 91 insertions(+), 8 deletions(-)
[PATCH v3] target/ppc: Implement new wait variants
Posted by Nicholas Piggin 1 year, 9 months ago
ISA v2.06 adds new variations of wait, specified by the WC field. These
are not all compatible with the prior wait implementation, because they
add additional conditions that cause the processor to resume, which can
cause software to hang or run very slowly.

ISA v3.0 changed the wait opcode and removed the new variants (retaining
the WC field but making non-zero values reserved).

ISA v3.1 added new WC values to the new wait opcode, and added a PL
field.

This implements the new wait encoding and supports WC variants with
no-op implementations, which provides basic correctness as explained in
comments.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
v3:
- Add EXTRACT_HELPERs
- Reserved fields should be ignored, not trap.
- v3.1 defines special case of reserved PL values being treated as
  a no-op when WC=2.
- Change code organization to (hopefully) be easier to follow each
  ISA / variation.
- Tested old wait variant with Linux e6500 boot and verify that
  gen_wait is called and takes the expected path.

Thanks,
Nick

 target/ppc/internal.h  |  3 ++
 target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
 2 files changed, 91 insertions(+), 8 deletions(-)

diff --git a/target/ppc/internal.h b/target/ppc/internal.h
index 2add128cd1..57c0a42a6b 100644
--- a/target/ppc/internal.h
+++ b/target/ppc/internal.h
@@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
 /* darn */
 EXTRACT_HELPER(L, 16, 2);
 #endif
+/* wait */
+EXTRACT_HELPER(WC, 21, 2);
+EXTRACT_HELPER(PL, 16, 2);
 
 /***                            Jump target decoding                       ***/
 /* Immediate address */
diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 1d6daa4608..e0a835ac90 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx)
 /* wait */
 static void gen_wait(DisasContext *ctx)
 {
-    TCGv_i32 t0 = tcg_const_i32(1);
-    tcg_gen_st_i32(t0, cpu_env,
-                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
-    tcg_temp_free_i32(t0);
-    /* Stop translation, as the CPU is supposed to sleep from now */
-    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
+    uint32_t wc;
+
+    if (ctx->insns_flags & PPC_WAIT) {
+        /* v2.03-v2.07 define an older incompatible 'wait' encoding. */
+
+        if (ctx->insns_flags2 & PPC2_PM_ISA206) {
+            /* v2.06 introduced the WC field. WC > 0 may be treated as no-op. */
+            wc = WC(ctx->opcode);
+        } else {
+            wc = 0;
+        }
+
+    } else if (ctx->insns_flags2 & PPC2_ISA300) {
+        /* v3.0 defines a new 'wait' encoding. */
+        wc = WC(ctx->opcode);
+        if (ctx->insns_flags2 & PPC2_ISA310) {
+            uint32_t pl = PL(ctx->opcode);
+
+            /* WC 1,2 may be treated as no-op. WC 3 is reserved. */
+            if (wc == 3) {
+                gen_invalid(ctx);
+                return;
+            }
+
+            /* PL 1-3 are reserved. If WC=2 then the insn is treated as noop. */
+            if (pl > 0 && wc != 2) {
+                gen_invalid(ctx);
+                return;
+            }
+
+        } else { /* ISA300 */
+            /* WC 1-3 are reserved */
+            if (wc > 0) {
+                gen_invalid(ctx);
+                return;
+            }
+        }
+
+    } else {
+        warn_report("wait instruction decoded with wrong ISA flags.");
+        gen_invalid(ctx);
+        return;
+    }
+
+    /*
+     * wait without WC field or with WC=0 waits for an exception / interrupt
+     * to occur.
+     */
+    if (wc == 0) {
+        TCGv_i32 t0 = tcg_const_i32(1);
+        tcg_gen_st_i32(t0, cpu_env,
+                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
+        tcg_temp_free_i32(t0);
+        /* Stop translation, as the CPU is supposed to sleep from now */
+        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
+    }
+
+    /*
+     * Other wait types must not just wait until an exception occurs because
+     * ignoring their other wake-up conditions could cause a hang.
+     *
+     * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented as
+     * no-ops.
+     *
+     * wc=1 and wc=3 explicitly allow the instruction to be treated as a no-op.
+     *
+     * wc=2 waits for an implementation-specific condition, such could be
+     * always true, so it can be implemented as a no-op.
+     *
+     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
+     *
+     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
+     * Reservation-loss may have implementation-specific conditions, so it
+     * can be implemented as a no-op.
+     *
+     * wc=2 waits for an exception or an amount of time to pass. This
+     * amount is implementation-specific so it can be implemented as a
+     * no-op.
+     *
+     * ISA v3.1 allows for execution to resume "in the rare case of
+     * an implementation-dependent event", so in any case software must
+     * not depend on the architected resumption condition to become
+     * true, so no-op implementations should be architecturally correct
+     * (if suboptimal).
+     */
 }
 
 #if defined(TARGET_PPC64)
@@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
 GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
 #endif
 GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
-GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
-GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
+/* ISA v3.0 changed the extended opcode from 62 to 30 */
+GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
+GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
 GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
 GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
-- 
2.35.1
Re: [PATCH v3] target/ppc: Implement new wait variants
Posted by Daniel Henrique Barboza 1 year, 9 months ago

On 7/20/22 10:33, Nicholas Piggin wrote:
> ISA v2.06 adds new variations of wait, specified by the WC field. These
> are not all compatible with the prior wait implementation, because they
> add additional conditions that cause the processor to resume, which can
> cause software to hang or run very slowly.

So I suppose this is not a new feature, but a bug fix to remediate these hangs
cause by the incompatibility of the WC field  with other ISA versions. Is
that right?

I'm explicitly asking for it because if it's a bug fix it's ok to pick it
during the freeze. Especially here, given that what you're doing is mostly
adding no-ops for conditions that we're not covering.

> 
> ISA v3.0 changed the wait opcode and removed the new variants (retaining
> the WC field but making non-zero values reserved).
> 
> ISA v3.1 added new WC values to the new wait opcode, and added a PL
> field.
> 
> This implements the new wait encoding and supports WC variants with
> no-op implementations, which provides basic correctness as explained in
> comments.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> v3:
> - Add EXTRACT_HELPERs
> - Reserved fields should be ignored, not trap.
> - v3.1 defines special case of reserved PL values being treated as
>    a no-op when WC=2.
> - Change code organization to (hopefully) be easier to follow each
>    ISA / variation.
> - Tested old wait variant with Linux e6500 boot and verify that
>    gen_wait is called and takes the expected path.
> 
> Thanks,
> Nick
> 
>   target/ppc/internal.h  |  3 ++
>   target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
>   2 files changed, 91 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index 2add128cd1..57c0a42a6b 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
>   /* darn */
>   EXTRACT_HELPER(L, 16, 2);
>   #endif
> +/* wait */
> +EXTRACT_HELPER(WC, 21, 2);
> +EXTRACT_HELPER(PL, 16, 2);
>   
>   /***                            Jump target decoding                       ***/
>   /* Immediate address */
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 1d6daa4608..e0a835ac90 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx)
>   /* wait */
>   static void gen_wait(DisasContext *ctx)
>   {
> -    TCGv_i32 t0 = tcg_const_i32(1);
> -    tcg_gen_st_i32(t0, cpu_env,
> -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> -    tcg_temp_free_i32(t0);
> -    /* Stop translation, as the CPU is supposed to sleep from now */
> -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    uint32_t wc;
> +
> +    if (ctx->insns_flags & PPC_WAIT) {
> +        /* v2.03-v2.07 define an older incompatible 'wait' encoding. */
> +
> +        if (ctx->insns_flags2 & PPC2_PM_ISA206) {
> +            /* v2.06 introduced the WC field. WC > 0 may be treated as no-op. */
> +            wc = WC(ctx->opcode);
> +        } else {
> +            wc = 0;
> +        }
> +
> +    } else if (ctx->insns_flags2 & PPC2_ISA300) {
> +        /* v3.0 defines a new 'wait' encoding. */
> +        wc = WC(ctx->opcode);


The ISA seems to indicate that WC=3 is always reserved in both ISA300 and
ISA310. I believe you can check for WC=3 and gen_invalid() return right
off the bat at this point.


Thanks,


Daniel



> +        if (ctx->insns_flags2 & PPC2_ISA310) {
> +            uint32_t pl = PL(ctx->opcode);
> +
> +            /* WC 1,2 may be treated as no-op. WC 3 is reserved. */
> +            if (wc == 3) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +
> +            /* PL 1-3 are reserved. If WC=2 then the insn is treated as noop. */
> +            if (pl > 0 && wc != 2) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +
> +        } else { /* ISA300 */
> +            /* WC 1-3 are reserved */
> +            if (wc > 0) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +        }
> +
> +    } else {
> +        warn_report("wait instruction decoded with wrong ISA flags.");
> +        gen_invalid(ctx);
> +        return;
> +    }
> +
> +    /*
> +     * wait without WC field or with WC=0 waits for an exception / interrupt
> +     * to occur.
> +     */
> +    if (wc == 0) {
> +        TCGv_i32 t0 = tcg_const_i32(1);
> +        tcg_gen_st_i32(t0, cpu_env,
> +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> +        tcg_temp_free_i32(t0);
> +        /* Stop translation, as the CPU is supposed to sleep from now */
> +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    }
> +
> +    /*
> +     * Other wait types must not just wait until an exception occurs because
> +     * ignoring their other wake-up conditions could cause a hang.
> +     *
> +     * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented as
> +     * no-ops.
> +     *
> +     * wc=1 and wc=3 explicitly allow the instruction to be treated as a no-op.
> +     *
> +     * wc=2 waits for an implementation-specific condition, such could be
> +     * always true, so it can be implemented as a no-op.
> +     *
> +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
> +     *
> +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
> +     * Reservation-loss may have implementation-specific conditions, so it
> +     * can be implemented as a no-op.
> +     *
> +     * wc=2 waits for an exception or an amount of time to pass. This
> +     * amount is implementation-specific so it can be implemented as a
> +     * no-op.
> +     *
> +     * ISA v3.1 allows for execution to resume "in the rare case of
> +     * an implementation-dependent event", so in any case software must
> +     * not depend on the architected resumption condition to become
> +     * true, so no-op implementations should be architecturally correct
> +     * (if suboptimal).
> +     */
>   }
>   
>   #if defined(TARGET_PPC64)
> @@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
>   GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
>   #endif
>   GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
> -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
> +/* ISA v3.0 changed the extended opcode from 62 to 30 */
> +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
>   GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
Re: [PATCH v3] target/ppc: Implement new wait variants
Posted by Joel Stanley 1 year, 9 months ago
On Wed, 27 Jul 2022 at 13:49, Daniel Henrique Barboza
<danielhb413@gmail.com> wrote:
>
>
>
> On 7/20/22 10:33, Nicholas Piggin wrote:
> > ISA v2.06 adds new variations of wait, specified by the WC field. These
> > are not all compatible with the prior wait implementation, because they
> > add additional conditions that cause the processor to resume, which can
> > cause software to hang or run very slowly.
>
> So I suppose this is not a new feature, but a bug fix to remediate these hangs
> cause by the incompatibility of the WC field  with other ISA versions. Is
> that right?

That's the case. Nick has some kernel patches that make Linux use the
new opcode:

 https://lore.kernel.org/all/20220720132132.903462-1-npiggin@gmail.com/

With these applied the kernel hangs during boot if more than one CPU
is present. I was able to reproduce with ppc64le_defconfig and this
command line:

 qemu-system-ppc64 -M pseries,x-vof=on -cpu POWER10 -smp 2 -nographic
-kernel zImage.pseries -no-reboot

Qemu will exit (as there's no filesystem) if the test "passes", or
hang during boot if it hits the bug.

There's a kernel here with the patches applied in case someone else
wants to test:

https://ozlabs.org/~joel/zImage.pseries-v5.19-rc8-wait-v3

Tested-by: Joel Stanley <joel@jms.id.au>

Because of the hang it would be best if we merged the patch as a fix
sooner rather than later.

Cheers,

Joel

> I'm explicitly asking for it because if it's a bug fix it's ok to pick it
> during the freeze. Especially here, given that what you're doing is mostly
> adding no-ops for conditions that we're not covering.
>
> >
> > ISA v3.0 changed the wait opcode and removed the new variants (retaining
> > the WC field but making non-zero values reserved).
> >
> > ISA v3.1 added new WC values to the new wait opcode, and added a PL
> > field.
> >
> > This implements the new wait encoding and supports WC variants with
> > no-op implementations, which provides basic correctness as explained in
> > comments.
> >
> > Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> > ---
> > v3:
> > - Add EXTRACT_HELPERs
> > - Reserved fields should be ignored, not trap.
> > - v3.1 defines special case of reserved PL values being treated as
> >    a no-op when WC=2.
> > - Change code organization to (hopefully) be easier to follow each
> >    ISA / variation.
> > - Tested old wait variant with Linux e6500 boot and verify that
> >    gen_wait is called and takes the expected path.
> >
> > Thanks,
> > Nick
> >
> >   target/ppc/internal.h  |  3 ++
> >   target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
> >   2 files changed, 91 insertions(+), 8 deletions(-)
> >
> > diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> > index 2add128cd1..57c0a42a6b 100644
> > --- a/target/ppc/internal.h
> > +++ b/target/ppc/internal.h
> > @@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
> >   /* darn */
> >   EXTRACT_HELPER(L, 16, 2);
> >   #endif
> > +/* wait */
> > +EXTRACT_HELPER(WC, 21, 2);
> > +EXTRACT_HELPER(PL, 16, 2);
> >
> >   /***                            Jump target decoding                       ***/
> >   /* Immediate address */
> > diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> > index 1d6daa4608..e0a835ac90 100644
> > --- a/target/ppc/translate.c
> > +++ b/target/ppc/translate.c
> > @@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx)
> >   /* wait */
> >   static void gen_wait(DisasContext *ctx)
> >   {
> > -    TCGv_i32 t0 = tcg_const_i32(1);
> > -    tcg_gen_st_i32(t0, cpu_env,
> > -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> > -    tcg_temp_free_i32(t0);
> > -    /* Stop translation, as the CPU is supposed to sleep from now */
> > -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> > +    uint32_t wc;
> > +
> > +    if (ctx->insns_flags & PPC_WAIT) {
> > +        /* v2.03-v2.07 define an older incompatible 'wait' encoding. */
> > +
> > +        if (ctx->insns_flags2 & PPC2_PM_ISA206) {
> > +            /* v2.06 introduced the WC field. WC > 0 may be treated as no-op. */
> > +            wc = WC(ctx->opcode);
> > +        } else {
> > +            wc = 0;
> > +        }
> > +
> > +    } else if (ctx->insns_flags2 & PPC2_ISA300) {
> > +        /* v3.0 defines a new 'wait' encoding. */
> > +        wc = WC(ctx->opcode);
>
>
> The ISA seems to indicate that WC=3 is always reserved in both ISA300 and
> ISA310. I believe you can check for WC=3 and gen_invalid() return right
> off the bat at this point.
>
>
> Thanks,
>
>
> Daniel
>
>
>
> > +        if (ctx->insns_flags2 & PPC2_ISA310) {
> > +            uint32_t pl = PL(ctx->opcode);
> > +
> > +            /* WC 1,2 may be treated as no-op. WC 3 is reserved. */
> > +            if (wc == 3) {
> > +                gen_invalid(ctx);
> > +                return;
> > +            }
> > +
> > +            /* PL 1-3 are reserved. If WC=2 then the insn is treated as noop. */
> > +            if (pl > 0 && wc != 2) {
> > +                gen_invalid(ctx);
> > +                return;
> > +            }
> > +
> > +        } else { /* ISA300 */
> > +            /* WC 1-3 are reserved */
> > +            if (wc > 0) {
> > +                gen_invalid(ctx);
> > +                return;
> > +            }
> > +        }
> > +
> > +    } else {
> > +        warn_report("wait instruction decoded with wrong ISA flags.");
> > +        gen_invalid(ctx);
> > +        return;
> > +    }
> > +
> > +    /*
> > +     * wait without WC field or with WC=0 waits for an exception / interrupt
> > +     * to occur.
> > +     */
> > +    if (wc == 0) {
> > +        TCGv_i32 t0 = tcg_const_i32(1);
> > +        tcg_gen_st_i32(t0, cpu_env,
> > +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> > +        tcg_temp_free_i32(t0);
> > +        /* Stop translation, as the CPU is supposed to sleep from now */
> > +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> > +    }
> > +
> > +    /*
> > +     * Other wait types must not just wait until an exception occurs because
> > +     * ignoring their other wake-up conditions could cause a hang.
> > +     *
> > +     * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented as
> > +     * no-ops.
> > +     *
> > +     * wc=1 and wc=3 explicitly allow the instruction to be treated as a no-op.
> > +     *
> > +     * wc=2 waits for an implementation-specific condition, such could be
> > +     * always true, so it can be implemented as a no-op.
> > +     *
> > +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
> > +     *
> > +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
> > +     * Reservation-loss may have implementation-specific conditions, so it
> > +     * can be implemented as a no-op.
> > +     *
> > +     * wc=2 waits for an exception or an amount of time to pass. This
> > +     * amount is implementation-specific so it can be implemented as a
> > +     * no-op.
> > +     *
> > +     * ISA v3.1 allows for execution to resume "in the rare case of
> > +     * an implementation-dependent event", so in any case software must
> > +     * not depend on the architected resumption condition to become
> > +     * true, so no-op implementations should be architecturally correct
> > +     * (if suboptimal).
> > +     */
> >   }
> >
> >   #if defined(TARGET_PPC64)
> > @@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
> >   GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
> >   #endif
> >   GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
> > -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
> > -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
> > +/* ISA v3.0 changed the extended opcode from 62 to 30 */
> > +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
> > +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
> >   GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
> >   GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
> >   GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
>
Re: [PATCH v3] target/ppc: Implement new wait variants
Posted by Daniel Henrique Barboza 1 year, 9 months ago

On 7/28/22 02:29, Joel Stanley wrote:
> On Wed, 27 Jul 2022 at 13:49, Daniel Henrique Barboza
> <danielhb413@gmail.com> wrote:
>>
>>
>>
>> On 7/20/22 10:33, Nicholas Piggin wrote:
>>> ISA v2.06 adds new variations of wait, specified by the WC field. These
>>> are not all compatible with the prior wait implementation, because they
>>> add additional conditions that cause the processor to resume, which can
>>> cause software to hang or run very slowly.
>>
>> So I suppose this is not a new feature, but a bug fix to remediate these hangs
>> cause by the incompatibility of the WC field  with other ISA versions. Is
>> that right?
> 
> That's the case. Nick has some kernel patches that make Linux use the
> new opcode:
> 
>   https://lore.kernel.org/all/20220720132132.903462-1-npiggin@gmail.com/
> 
> With these applied the kernel hangs during boot if more than one CPU
> is present. I was able to reproduce with ppc64le_defconfig and this
> command line:
> 
>   qemu-system-ppc64 -M pseries,x-vof=on -cpu POWER10 -smp 2 -nographic
> -kernel zImage.pseries -no-reboot
> 
> Qemu will exit (as there's no filesystem) if the test "passes", or
> hang during boot if it hits the bug.

Thanks for the explanation. I'll handle it as a bug fix then.


Also, Nick, down below:


> 
> There's a kernel here with the patches applied in case someone else
> wants to test:
> 
> https://ozlabs.org/~joel/zImage.pseries-v5.19-rc8-wait-v3
> 
> Tested-by: Joel Stanley <joel@jms.id.au>
> 
> Because of the hang it would be best if we merged the patch as a fix
> sooner rather than later.
> 
> Cheers,
> 
> Joel
> 
>> I'm explicitly asking for it because if it's a bug fix it's ok to pick it
>> during the freeze. Especially here, given that what you're doing is mostly
>> adding no-ops for conditions that we're not covering.
>>
>>>
>>> ISA v3.0 changed the wait opcode and removed the new variants (retaining
>>> the WC field but making non-zero values reserved).
>>>
>>> ISA v3.1 added new WC values to the new wait opcode, and added a PL
>>> field.
>>>
>>> This implements the new wait encoding and supports WC variants with
>>> no-op implementations, which provides basic correctness as explained in
>>> comments.
>>>
>>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>>> ---
>>> v3:
>>> - Add EXTRACT_HELPERs
>>> - Reserved fields should be ignored, not trap.
>>> - v3.1 defines special case of reserved PL values being treated as
>>>     a no-op when WC=2.
>>> - Change code organization to (hopefully) be easier to follow each
>>>     ISA / variation.
>>> - Tested old wait variant with Linux e6500 boot and verify that
>>>     gen_wait is called and takes the expected path.
>>>
>>> Thanks,
>>> Nick
>>>
>>>    target/ppc/internal.h  |  3 ++
>>>    target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
>>>    2 files changed, 91 insertions(+), 8 deletions(-)
>>>
>>> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
>>> index 2add128cd1..57c0a42a6b 100644
>>> --- a/target/ppc/internal.h
>>> +++ b/target/ppc/internal.h
>>> @@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
>>>    /* darn */
>>>    EXTRACT_HELPER(L, 16, 2);
>>>    #endif
>>> +/* wait */
>>> +EXTRACT_HELPER(WC, 21, 2);
>>> +EXTRACT_HELPER(PL, 16, 2);
>>>
>>>    /***                            Jump target decoding                       ***/
>>>    /* Immediate address */
>>> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
>>> index 1d6daa4608..e0a835ac90 100644
>>> --- a/target/ppc/translate.c
>>> +++ b/target/ppc/translate.c
>>> @@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx)
>>>    /* wait */
>>>    static void gen_wait(DisasContext *ctx)
>>>    {
>>> -    TCGv_i32 t0 = tcg_const_i32(1);
>>> -    tcg_gen_st_i32(t0, cpu_env,
>>> -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
>>> -    tcg_temp_free_i32(t0);
>>> -    /* Stop translation, as the CPU is supposed to sleep from now */
>>> -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
>>> +    uint32_t wc;
>>> +
>>> +    if (ctx->insns_flags & PPC_WAIT) {
>>> +        /* v2.03-v2.07 define an older incompatible 'wait' encoding. */
>>> +
>>> +        if (ctx->insns_flags2 & PPC2_PM_ISA206) {
>>> +            /* v2.06 introduced the WC field. WC > 0 may be treated as no-op. */
>>> +            wc = WC(ctx->opcode);
>>> +        } else {
>>> +            wc = 0;
>>> +        }
>>> +
>>> +    } else if (ctx->insns_flags2 & PPC2_ISA300) {
>>> +        /* v3.0 defines a new 'wait' encoding. */
>>> +        wc = WC(ctx->opcode);
>>
>>
>> The ISA seems to indicate that WC=3 is always reserved in both ISA300 and
>> ISA310. I believe you can check for WC=3 and gen_invalid() return right
>> off the bat at this point.


I had a change of heart about this comment. It's better to have each ISA version
handle their conditions in separate, even if they overlap, to make it easier to
extend the function later on if required.


This means that the patch LGTM, so


Reviewed-by: Daniel Henrique Barboza <danielhb413@gmail.com>



>>
>>
>> Thanks,
>>
>>
>> Daniel
>>
>>
>>
>>> +        if (ctx->insns_flags2 & PPC2_ISA310) {
>>> +            uint32_t pl = PL(ctx->opcode);
>>> +
>>> +            /* WC 1,2 may be treated as no-op. WC 3 is reserved. */
>>> +            if (wc == 3) {
>>> +                gen_invalid(ctx);
>>> +                return;
>>> +            }
>>> +
>>> +            /* PL 1-3 are reserved. If WC=2 then the insn is treated as noop. */
>>> +            if (pl > 0 && wc != 2) {
>>> +                gen_invalid(ctx);
>>> +                return;
>>> +            }
>>> +
>>> +        } else { /* ISA300 */
>>> +            /* WC 1-3 are reserved */
>>> +            if (wc > 0) {
>>> +                gen_invalid(ctx);
>>> +                return;
>>> +            }
>>> +        }
>>> +
>>> +    } else {
>>> +        warn_report("wait instruction decoded with wrong ISA flags.");
>>> +        gen_invalid(ctx);
>>> +        return;
>>> +    }
>>> +
>>> +    /*
>>> +     * wait without WC field or with WC=0 waits for an exception / interrupt
>>> +     * to occur.
>>> +     */
>>> +    if (wc == 0) {
>>> +        TCGv_i32 t0 = tcg_const_i32(1);
>>> +        tcg_gen_st_i32(t0, cpu_env,
>>> +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
>>> +        tcg_temp_free_i32(t0);
>>> +        /* Stop translation, as the CPU is supposed to sleep from now */
>>> +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
>>> +    }
>>> +
>>> +    /*
>>> +     * Other wait types must not just wait until an exception occurs because
>>> +     * ignoring their other wake-up conditions could cause a hang.
>>> +     *
>>> +     * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented as
>>> +     * no-ops.
>>> +     *
>>> +     * wc=1 and wc=3 explicitly allow the instruction to be treated as a no-op.
>>> +     *
>>> +     * wc=2 waits for an implementation-specific condition, such could be
>>> +     * always true, so it can be implemented as a no-op.
>>> +     *
>>> +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
>>> +     *
>>> +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
>>> +     * Reservation-loss may have implementation-specific conditions, so it
>>> +     * can be implemented as a no-op.
>>> +     *
>>> +     * wc=2 waits for an exception or an amount of time to pass. This
>>> +     * amount is implementation-specific so it can be implemented as a
>>> +     * no-op.
>>> +     *
>>> +     * ISA v3.1 allows for execution to resume "in the rare case of
>>> +     * an implementation-dependent event", so in any case software must
>>> +     * not depend on the architected resumption condition to become
>>> +     * true, so no-op implementations should be architecturally correct
>>> +     * (if suboptimal).
>>> +     */
>>>    }
>>>
>>>    #if defined(TARGET_PPC64)
>>> @@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
>>>    GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
>>>    #endif
>>>    GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
>>> -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
>>> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
>>> +/* ISA v3.0 changed the extended opcode from 62 to 30 */
>>> +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
>>> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
>>>    GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>>>    GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>>>    GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
>>
Re: [PATCH v3] target/ppc: Implement new wait variants
Posted by Víctor Colombo 1 year, 9 months ago
CCing Daniel and Richard as they might be interested in taking a look

On 20/07/2022 10:33, Nicholas Piggin wrote:
> ISA v2.06 adds new variations of wait, specified by the WC field. These
> are not all compatible with the prior wait implementation, because they
> add additional conditions that cause the processor to resume, which can
> cause software to hang or run very slowly.
> 
> ISA v3.0 changed the wait opcode and removed the new variants (retaining
> the WC field but making non-zero values reserved).
> 
> ISA v3.1 added new WC values to the new wait opcode, and added a PL
> field.
> 
> This implements the new wait encoding and supports WC variants with
> no-op implementations, which provides basic correctness as explained in
> comments.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
> v3:
> - Add EXTRACT_HELPERs
> - Reserved fields should be ignored, not trap.
> - v3.1 defines special case of reserved PL values being treated as
>    a no-op when WC=2.
> - Change code organization to (hopefully) be easier to follow each
>    ISA / variation.

Looks better to me. Thanks!

> - Tested old wait variant with Linux e6500 boot and verify that
>    gen_wait is called and takes the expected path.
> 
> Thanks,
> Nick
> 
>   target/ppc/internal.h  |  3 ++
>   target/ppc/translate.c | 96 ++++++++++++++++++++++++++++++++++++++----
>   2 files changed, 91 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/internal.h b/target/ppc/internal.h
> index 2add128cd1..57c0a42a6b 100644
> --- a/target/ppc/internal.h
> +++ b/target/ppc/internal.h
> @@ -168,6 +168,9 @@ EXTRACT_HELPER_SPLIT_3(DX, 10, 6, 6, 5, 16, 1, 1, 0, 0)
>   /* darn */
>   EXTRACT_HELPER(L, 16, 2);
>   #endif
> +/* wait */
> +EXTRACT_HELPER(WC, 21, 2);
> +EXTRACT_HELPER(PL, 16, 2);
> 
>   /***                            Jump target decoding                       ***/
>   /* Immediate address */
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 1d6daa4608..e0a835ac90 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4066,12 +4066,91 @@ static void gen_sync(DisasContext *ctx)
>   /* wait */
>   static void gen_wait(DisasContext *ctx)
>   {
> -    TCGv_i32 t0 = tcg_const_i32(1);
> -    tcg_gen_st_i32(t0, cpu_env,
> -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> -    tcg_temp_free_i32(t0);
> -    /* Stop translation, as the CPU is supposed to sleep from now */
> -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    uint32_t wc;
> +
> +    if (ctx->insns_flags & PPC_WAIT) {
> +        /* v2.03-v2.07 define an older incompatible 'wait' encoding. */
> +
> +        if (ctx->insns_flags2 & PPC2_PM_ISA206) {
> +            /* v2.06 introduced the WC field. WC > 0 may be treated as no-op. */
> +            wc = WC(ctx->opcode);
> +        } else {
> +            wc = 0;
> +        }
> +
> +    } else if (ctx->insns_flags2 & PPC2_ISA300) {
> +        /* v3.0 defines a new 'wait' encoding. */
> +        wc = WC(ctx->opcode);
> +        if (ctx->insns_flags2 & PPC2_ISA310) {
> +            uint32_t pl = PL(ctx->opcode);
> +
> +            /* WC 1,2 may be treated as no-op. WC 3 is reserved. */
> +            if (wc == 3) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +
> +            /* PL 1-3 are reserved. If WC=2 then the insn is treated as noop. */
> +            if (pl > 0 && wc != 2) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +
> +        } else { /* ISA300 */
> +            /* WC 1-3 are reserved */
> +            if (wc > 0) {
> +                gen_invalid(ctx);
> +                return;
> +            }
> +        }
> +
> +    } else {

I think this situation never occurs, as the definitions of wait with
GEN_HANDLER should be guaranteeing PPC_WAIT or PPC2_ISA300, and not
even call this handler otherwise.
I think it's ok to keep it anyway, if you want.

> +        warn_report("wait instruction decoded with wrong ISA flags.");
> +        gen_invalid(ctx);
> +        return;
> +    }
> +
> +    /*
> +     * wait without WC field or with WC=0 waits for an exception / interrupt
> +     * to occur.
> +     */
> +    if (wc == 0) {
> +        TCGv_i32 t0 = tcg_const_i32(1);
> +        tcg_gen_st_i32(t0, cpu_env,
> +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> +        tcg_temp_free_i32(t0);
> +        /* Stop translation, as the CPU is supposed to sleep from now */
> +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    }
> +
> +    /*
> +     * Other wait types must not just wait until an exception occurs because
> +     * ignoring their other wake-up conditions could cause a hang.
> +     *
> +     * For v2.06 and 2.07, wc=1,2,3 are architected but may be implemented as
> +     * no-ops.
> +     *
> +     * wc=1 and wc=3 explicitly allow the instruction to be treated as a no-op.
> +     *
> +     * wc=2 waits for an implementation-specific condition, such could be
> +     * always true, so it can be implemented as a no-op.
> +     *
> +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
> +     *
> +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
> +     * Reservation-loss may have implementation-specific conditions, so it
> +     * can be implemented as a no-op.
> +     *
> +     * wc=2 waits for an exception or an amount of time to pass. This
> +     * amount is implementation-specific so it can be implemented as a
> +     * no-op.
> +     *
> +     * ISA v3.1 allows for execution to resume "in the rare case of
> +     * an implementation-dependent event", so in any case software must
> +     * not depend on the architected resumption condition to become
> +     * true, so no-op implementations should be architecturally correct
> +     * (if suboptimal).
> +     */
>   }
> 
>   #if defined(TARGET_PPC64)
> @@ -6852,8 +6931,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
>   GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
>   #endif
>   GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
> -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
> +/* ISA v3.0 changed the extended opcode from 62 to 30 */
> +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
>   GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
> --
> 2.35.1
> 

Looks correct with what the ISA says. I reviewed mostly the flow
expected for each ISA, and this v2 looks ok now.
I didn't dive deep on the 'waiting' behavior itself, but assuming the
code is the same as was before, and the new considerations regarding
noop seems to be correct when compared with what the ISA says, LGTM

Reviewed-by: Víctor Colombo <victor.colombo@eldorado.org.br>

-- 
Víctor Cora Colombo
Instituto de Pesquisas ELDORADO
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>