[PATCH] target/ppc: Implement new wait variants

Nicholas Piggin posted 1 patch 1 year, 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20220719113843.429600-1-npiggin@gmail.com
Maintainers: "Cédric Le Goater" <clg@kaod.org>, Daniel Henrique Barboza <danielhb413@gmail.com>, David Gibson <david@gibson.dropbear.id.au>, Greg Kurz <groug@kaod.org>
There is a newer version of this series
target/ppc/translate.c | 84 ++++++++++++++++++++++++++++++++++++++----
1 file changed, 76 insertions(+), 8 deletions(-)
[PATCH] target/ppc: Implement new wait variants
Posted by Nicholas Piggin 1 year, 9 months ago
ISA v2.06 adds new variations of wait, specified by the WC field. These
are not compatible with the wait 0 implementation, because they add
additional conditions that cause the processor to resume, which can
cause software to hang or run very slowly.

ISA v3.0 changed the wait opcode.

ISA v3.1 added new WC values to the new wait opcode, and added a PL
field.

This implements the new wait encoding and supports WC variants with
no-op implementations, which is provides basic correctness as explained.

Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
---
 target/ppc/translate.c | 84 ++++++++++++++++++++++++++++++++++++++----
 1 file changed, 76 insertions(+), 8 deletions(-)

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 1d6daa4608..ce4aa84f1d 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -4066,12 +4066,79 @@ static void gen_sync(DisasContext *ctx)
 /* wait */
 static void gen_wait(DisasContext *ctx)
 {
-    TCGv_i32 t0 = tcg_const_i32(1);
-    tcg_gen_st_i32(t0, cpu_env,
-                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
-    tcg_temp_free_i32(t0);
-    /* Stop translation, as the CPU is supposed to sleep from now */
-    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
+    uint32_t wc = (ctx->opcode >> 21) & 3;
+    uint32_t pl = (ctx->opcode >> 16) & 3;
+
+    /* v3.0 and later use the ISA flag for wait rather than a PM flag */
+    if (!(ctx->insns_flags2 & PPC2_PM_ISA206) &&
+        !(ctx->insns_flags2 & PPC2_ISA300)) {
+        /* wc field was introduced in ISA v2.06 */
+        if (wc) {
+            gen_invalid(ctx);
+            return;
+        }
+    }
+
+    if (!(ctx->insns_flags2 & PPC2_ISA310)) {
+        /* pl field was introduced in ISA v3.1 */
+        if (pl) {
+            gen_invalid(ctx);
+            return;
+        }
+
+        if (ctx->insns_flags2 & PPC2_ISA300) {
+            /* wc > 0 is reserved in v3.0 */
+            if (wc > 0) {
+                gen_invalid(ctx);
+                return;
+            }
+        }
+    }
+
+    /* wc=3 is reserved and pl=1-3 are reserved in v3.1. */
+    if (wc == 3 || pl > 0) {
+        gen_invalid(ctx);
+        return;
+    }
+
+    /* wait 0 waits for an exception to occur. */
+    if (wc == 0) {
+        TCGv_i32 t0 = tcg_const_i32(1);
+        tcg_gen_st_i32(t0, cpu_env,
+                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
+        tcg_temp_free_i32(t0);
+        /* Stop translation, as the CPU is supposed to sleep from now */
+        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
+    }
+
+    /*
+     * Other wait types must not just wait until an exception occurs because
+     * ignoring their other wake-up conditions could cause a hang.
+     *
+     * For v2.06 and 2.07, wc=1,2 are architected but may be implemented as
+     * no-ops.
+     *
+     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
+     * Reservation-loss may have implementation-specific conditions, so it
+     * can be implemented as a no-op.
+     *
+     * wc=2 waits for an implementation-specific condition which could be
+     * always true, so it can be implemented as a no-op.
+     *
+     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
+     *
+     * wc=1 similarly to v2.06 and v2.07.
+     *
+     * wc=2 waits for an exception or an amount of time to pass. This
+     * amount is implementation-specific so it can be implemented as a
+     * no-op.
+     *
+     * ISA v3.1 does allow for execution to resume "in the rare case of
+     * an implementation-dependent event", so in any case software must
+     * not depend on the architected resumption condition to become
+     * true, so no-op implementations should be architecturally correct
+     * (if suboptimal).
+     */
 }
 
 #if defined(TARGET_PPC64)
@@ -6852,8 +6919,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
 GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
 #endif
 GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
-GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
-GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
+/* ISA v3.0 changed the extended opcode from 62 to 30 */
+GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
+GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
 GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
 GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
 GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
-- 
2.35.1
Re: [PATCH] target/ppc: Implement new wait variants
Posted by Víctor Colombo 1 year, 9 months ago
Hello Nicholas,

On 19/07/2022 08:38, Nicholas Piggin wrote:
> ISA v2.06 adds new variations of wait, specified by the WC field. These
> are not compatible with the wait 0 implementation, because they add
> additional conditions that cause the processor to resume, which can
> cause software to hang or run very slowly.
> 
> ISA v3.0 changed the wait opcode.
> 
> ISA v3.1 added new WC values to the new wait opcode, and added a PL
> field.
> 
> This implements the new wait encoding and supports WC variants with
> no-op implementations, which is provides basic correctness as explained.
> 
> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
> ---
>   target/ppc/translate.c | 84 ++++++++++++++++++++++++++++++++++++++----
>   1 file changed, 76 insertions(+), 8 deletions(-)
> 
> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
> index 1d6daa4608..ce4aa84f1d 100644
> --- a/target/ppc/translate.c
> +++ b/target/ppc/translate.c
> @@ -4066,12 +4066,79 @@ static void gen_sync(DisasContext *ctx)
>   /* wait */
>   static void gen_wait(DisasContext *ctx)
>   {
> -    TCGv_i32 t0 = tcg_const_i32(1);
> -    tcg_gen_st_i32(t0, cpu_env,
> -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> -    tcg_temp_free_i32(t0);
> -    /* Stop translation, as the CPU is supposed to sleep from now */
> -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    uint32_t wc = (ctx->opcode >> 21) & 3;
> +    uint32_t pl = (ctx->opcode >> 16) & 3;

I think the best here would be to move this instruction to decodetree.
However, this can be a bit of extra work and out of the scope you though
for this patch. What do you think about adding a EXTRACT_HELPER to
target/ppc/internal.h?

> +
> +    /* v3.0 and later use the ISA flag for wait rather than a PM flag */
> +    if (!(ctx->insns_flags2 & PPC2_PM_ISA206) &&
> +        !(ctx->insns_flags2 & PPC2_ISA300)) {
> +        /* wc field was introduced in ISA v2.06 */
> +        if (wc) {
> +            gen_invalid(ctx);
> +            return;
> +        }
> +    }
> +
> +    if (!(ctx->insns_flags2 & PPC2_ISA310)) {
> +        /* pl field was introduced in ISA v3.1 */
> +        if (pl) {
> +            gen_invalid(ctx);
> +            return;
> +        }

IIUC the ISA says that "Reserved fields in instructions are ignored by
the processor". So this check is incorrect, I guess, as we should allow
the instruction to continue.

> +
> +        if (ctx->insns_flags2 & PPC2_ISA300) {
> +            /* wc > 0 is reserved in v3.0 */
> +            if (wc > 0) {

This however is correct

> +                gen_invalid(ctx);
> +                return;
> +            }
> +        }
> +    }
> +
> +    /* wc=3 is reserved and pl=1-3 are reserved in v3.1. */
> +    if (wc == 3 || pl > 0) {

This can cause a situation where the field is reserve in a previous ISA
and should be ignored. I think the best option is to put these checks
inside a conditional for each different ISA. Otherwise it's getting a
bit hard to follow what should happen in each situation.

> +        gen_invalid(ctx);
> +        return;
> +    }
> +
> +    /* wait 0 waits for an exception to occur. */
> +    if (wc == 0) {
> +        TCGv_i32 t0 = tcg_const_i32(1);
> +        tcg_gen_st_i32(t0, cpu_env,
> +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
> +        tcg_temp_free_i32(t0);
> +        /* Stop translation, as the CPU is supposed to sleep from now */
> +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
> +    }
> +
> +    /*
> +     * Other wait types must not just wait until an exception occurs because
> +     * ignoring their other wake-up conditions could cause a hang.
> +     *
> +     * For v2.06 and 2.07, wc=1,2 are architected but may be implemented as
> +     * no-ops.
> +     *
> +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
> +     * Reservation-loss may have implementation-specific conditions, so it
> +     * can be implemented as a no-op.
> +     *
> +     * wc=2 waits for an implementation-specific condition which could be
> +     * always true, so it can be implemented as a no-op.
> +     *
> +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
> +     *
> +     * wc=1 similarly to v2.06 and v2.07.
> +     *
> +     * wc=2 waits for an exception or an amount of time to pass. This
> +     * amount is implementation-specific so it can be implemented as a
> +     * no-op.
> +     *
> +     * ISA v3.1 does allow for execution to resume "in the rare case of
> +     * an implementation-dependent event", so in any case software must
> +     * not depend on the architected resumption condition to become
> +     * true, so no-op implementations should be architecturally correct
> +     * (if suboptimal).
> +     */
>   }
> 
>   #if defined(TARGET_PPC64)
> @@ -6852,8 +6919,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
>   GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
>   #endif
>   GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
> -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
> +/* ISA v3.0 changed the extended opcode from 62 to 30 */
> +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),

Does this continue to work for the previous ISAs? I'm having a hard time
testing this instruction for previous cpus, even without this patch

>   GEN_HANDLER(b, 0x12, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bc, 0x10, 0xFF, 0xFF, 0x00000000, PPC_FLOW),
>   GEN_HANDLER(bcctr, 0x13, 0x10, 0x10, 0x00000000, PPC_FLOW),
> --
> 2.35.1
> 
> 

Best regards,

-- 
Víctor Cora Colombo
Instituto de Pesquisas ELDORADO
Aviso Legal - Disclaimer <https://www.eldorado.org.br/disclaimer.html>
Re: [PATCH] target/ppc: Implement new wait variants
Posted by Nicholas Piggin 1 year, 9 months ago
Excerpts from Víctor Colombo's message of July 20, 2022 12:20 am:
> Hello Nicholas,
> 
> On 19/07/2022 08:38, Nicholas Piggin wrote:
>> ISA v2.06 adds new variations of wait, specified by the WC field. These
>> are not compatible with the wait 0 implementation, because they add
>> additional conditions that cause the processor to resume, which can
>> cause software to hang or run very slowly.
>> 
>> ISA v3.0 changed the wait opcode.
>> 
>> ISA v3.1 added new WC values to the new wait opcode, and added a PL
>> field.
>> 
>> This implements the new wait encoding and supports WC variants with
>> no-op implementations, which is provides basic correctness as explained.
>> 
>> Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
>> ---
>>   target/ppc/translate.c | 84 ++++++++++++++++++++++++++++++++++++++----
>>   1 file changed, 76 insertions(+), 8 deletions(-)
>> 
>> diff --git a/target/ppc/translate.c b/target/ppc/translate.c
>> index 1d6daa4608..ce4aa84f1d 100644
>> --- a/target/ppc/translate.c
>> +++ b/target/ppc/translate.c
>> @@ -4066,12 +4066,79 @@ static void gen_sync(DisasContext *ctx)
>>   /* wait */
>>   static void gen_wait(DisasContext *ctx)
>>   {
>> -    TCGv_i32 t0 = tcg_const_i32(1);
>> -    tcg_gen_st_i32(t0, cpu_env,
>> -                   -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
>> -    tcg_temp_free_i32(t0);
>> -    /* Stop translation, as the CPU is supposed to sleep from now */
>> -    gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
>> +    uint32_t wc = (ctx->opcode >> 21) & 3;
>> +    uint32_t pl = (ctx->opcode >> 16) & 3;
> 
> I think the best here would be to move this instruction to decodetree.
> However, this can be a bit of extra work and out of the scope you though
> for this patch.

Yeah you're probably right. I haven't looked into decodetree yet sorry,
if we could get this in first would be convenient.

> What do you think about adding a EXTRACT_HELPER to
> target/ppc/internal.h?

Just to avoid open coded extraction here? Probably a good idea, I'll try
it.

>> +
>> +    /* v3.0 and later use the ISA flag for wait rather than a PM flag */
>> +    if (!(ctx->insns_flags2 & PPC2_PM_ISA206) &&
>> +        !(ctx->insns_flags2 & PPC2_ISA300)) {
>> +        /* wc field was introduced in ISA v2.06 */
>> +        if (wc) {
>> +            gen_invalid(ctx);
>> +            return;
>> +        }
>> +    }
>> +
>> +    if (!(ctx->insns_flags2 & PPC2_ISA310)) {
>> +        /* pl field was introduced in ISA v3.1 */
>> +        if (pl) {
>> +            gen_invalid(ctx);
>> +            return;
>> +        }
> 
> IIUC the ISA says that "Reserved fields in instructions are ignored by
> the processor". So this check is incorrect, I guess, as we should allow
> the instruction to continue.

Hmm, I think you're right.

>> +
>> +        if (ctx->insns_flags2 & PPC2_ISA300) {
>> +            /* wc > 0 is reserved in v3.0 */
>> +            if (wc > 0) {
> 
> This however is correct
> 
>> +                gen_invalid(ctx);
>> +                return;
>> +            }
>> +        }
>> +    }
>> +
>> +    /* wc=3 is reserved and pl=1-3 are reserved in v3.1. */
>> +    if (wc == 3 || pl > 0) {
> 
> This can cause a situation where the field is reserve in a previous ISA
> and should be ignored. I think the best option is to put these checks
> inside a conditional for each different ISA. Otherwise it's getting a
> bit hard to follow what should happen in each situation.

Good idea.

> 
>> +        gen_invalid(ctx);
>> +        return;
>> +    }
>> +
>> +    /* wait 0 waits for an exception to occur. */
>> +    if (wc == 0) {
>> +        TCGv_i32 t0 = tcg_const_i32(1);
>> +        tcg_gen_st_i32(t0, cpu_env,
>> +                       -offsetof(PowerPCCPU, env) + offsetof(CPUState, halted));
>> +        tcg_temp_free_i32(t0);
>> +        /* Stop translation, as the CPU is supposed to sleep from now */
>> +        gen_exception_nip(ctx, EXCP_HLT, ctx->base.pc_next);
>> +    }
>> +
>> +    /*
>> +     * Other wait types must not just wait until an exception occurs because
>> +     * ignoring their other wake-up conditions could cause a hang.
>> +     *
>> +     * For v2.06 and 2.07, wc=1,2 are architected but may be implemented as
>> +     * no-ops.
>> +     *
>> +     * wc=1 (waitrsv) waits for an exception or a reservation to be lost.
>> +     * Reservation-loss may have implementation-specific conditions, so it
>> +     * can be implemented as a no-op.
>> +     *
>> +     * wc=2 waits for an implementation-specific condition which could be
>> +     * always true, so it can be implemented as a no-op.
>> +     *
>> +     * For v3.1, wc=1,2 are architected but may be implemented as no-ops.
>> +     *
>> +     * wc=1 similarly to v2.06 and v2.07.
>> +     *
>> +     * wc=2 waits for an exception or an amount of time to pass. This
>> +     * amount is implementation-specific so it can be implemented as a
>> +     * no-op.
>> +     *
>> +     * ISA v3.1 does allow for execution to resume "in the rare case of
>> +     * an implementation-dependent event", so in any case software must
>> +     * not depend on the architected resumption condition to become
>> +     * true, so no-op implementations should be architecturally correct
>> +     * (if suboptimal).
>> +     */
>>   }
>> 
>>   #if defined(TARGET_PPC64)
>> @@ -6852,8 +6919,9 @@ GEN_HANDLER2(stdcx_, "stdcx.", 0x1F, 0x16, 0x06, 0x00000000, PPC_64B),
>>   GEN_HANDLER_E(stqcx_, 0x1F, 0x16, 0x05, 0, PPC_NONE, PPC2_LSQ_ISA207),
>>   #endif
>>   GEN_HANDLER(sync, 0x1F, 0x16, 0x12, 0x039FF801, PPC_MEM_SYNC),
>> -GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x03FFF801, PPC_WAIT),
>> -GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039FF801, PPC_NONE, PPC2_ISA300),
>> +/* ISA v3.0 changed the extended opcode from 62 to 30 */
>> +GEN_HANDLER(wait, 0x1F, 0x1E, 0x01, 0x039FF801, PPC_WAIT),
>> +GEN_HANDLER_E(wait, 0x1F, 0x1E, 0x00, 0x039CF801, PPC_NONE, PPC2_ISA300),
> 
> Does this continue to work for the previous ISAs? I'm having a hard time
> testing this instruction for previous cpus, even without this patch

I don't think I tested that actually. I will.

Thanks for the review, I'll make updates and post a new vesion.

Thanks,
Nick