[PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension

Matheus Tavares Bernardino posted 13 patches 1 week, 4 days ago
Maintainers: Brian Cain <brian.cain@oss.qualcomm.com>, "Alex Bennée" <alex.bennee@linaro.org>, "Philippe Mathieu-Daudé" <philmd@linaro.org>
There is a newer version of this series
[PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Matheus Tavares Bernardino 1 week, 4 days ago
This flag will be used to control the HVX IEEE float instructions, which
are only available on some Hexagon cores. When unavailable, the
instruction is essentially treated as a no-op.

Signed-off-by: Matheus Tavares Bernardino <matheus.bernardino@oss.qualcomm.com>
---
 target/hexagon/cpu.h             |  1 +
 target/hexagon/translate.h       |  1 +
 target/hexagon/attribs_def.h.inc |  3 +++
 target/hexagon/cpu.c             |  1 +
 target/hexagon/decode.c          | 22 ++++++++++++++++++++++
 target/hexagon/translate.c       |  1 +
 6 files changed, 29 insertions(+)

diff --git a/target/hexagon/cpu.h b/target/hexagon/cpu.h
index 85afd59277..77822a48b6 100644
--- a/target/hexagon/cpu.h
+++ b/target/hexagon/cpu.h
@@ -127,6 +127,7 @@ struct ArchCPU {
     bool lldb_compat;
     target_ulong lldb_stack_adjust;
     bool short_circuit;
+    bool ieee_fp_extension;
 };
 
 #include "cpu_bits.h"
diff --git a/target/hexagon/translate.h b/target/hexagon/translate.h
index b37cb49238..516aab7038 100644
--- a/target/hexagon/translate.h
+++ b/target/hexagon/translate.h
@@ -70,6 +70,7 @@ typedef struct DisasContext {
     target_ulong branch_dest;
     bool is_tight_loop;
     bool short_circuit;
+    bool ieee_fp_extension;
     bool read_after_write;
     bool has_hvx_overlap;
     TCGv new_value[TOTAL_PER_THREAD_REGS];
diff --git a/target/hexagon/attribs_def.h.inc b/target/hexagon/attribs_def.h.inc
index 9e3a05f882..c85cd5d17c 100644
--- a/target/hexagon/attribs_def.h.inc
+++ b/target/hexagon/attribs_def.h.inc
@@ -173,5 +173,8 @@ DEF_ATTRIB(NOTE_SHIFT_RESOURCE, "Uses the HVX shift resource.", "", "")
 DEF_ATTRIB(RESTRICT_NOSLOT1_STORE, "Packet must not have slot 1 store", "", "")
 DEF_ATTRIB(RESTRICT_LATEPRED, "Predicate can not be used as a .new.", "", "")
 
+/* HVX IEEE FP extension attributes */
+DEF_ATTRIB(HVX_IEEE_FP, "HVX IEEE FP extension instruction", "", "")
+
 /* Keep this as the last attribute: */
 DEF_ATTRIB(ZZ_LASTATTRIB, "Last attribute in the file", "", "")
diff --git a/target/hexagon/cpu.c b/target/hexagon/cpu.c
index ffd14bb467..8b72a5d3c8 100644
--- a/target/hexagon/cpu.c
+++ b/target/hexagon/cpu.c
@@ -54,6 +54,7 @@ static const Property hexagon_cpu_properties[] = {
     DEFINE_PROP_UNSIGNED("lldb-stack-adjust", HexagonCPU, lldb_stack_adjust, 0,
                          qdev_prop_uint32, target_ulong),
     DEFINE_PROP_BOOL("short-circuit", HexagonCPU, short_circuit, true),
+    DEFINE_PROP_BOOL("ieee-fp", HexagonCPU, ieee_fp_extension, true),
 };
 
 const char * const hexagon_regnames[TOTAL_PER_THREAD_REGS] = {
diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
index dbc9c630e8..d832a64a17 100644
--- a/target/hexagon/decode.c
+++ b/target/hexagon/decode.c
@@ -696,6 +696,18 @@ static bool pkt_has_write_conflict(Packet *pkt)
     return !bitmap_empty(conflict, 32);
 }
 
+static void convert_to_nop(Insn *insn)
+{
+    bool is_endloop = insn->is_endloop;
+    memset(insn, 0, sizeof(*insn));
+    insn->opcode = A2_nop;
+    insn->new_read_idx = -1;
+    insn->dest_idx = -1;
+    insn->generate = opcode_genptr[insn->opcode];
+    insn->iclass = 0b111;
+    insn->is_endloop = is_endloop;
+}
+
 /*
  * decode_packet
  * Decodes packet with given words
@@ -746,6 +758,16 @@ int decode_packet(DisasContext *ctx, int max_words, const uint32_t *words,
         /* Ran out of words! */
         return 0;
     }
+
+    /* Disable HVX IEEE instruction if extension is disabled. */
+    if (!ctx->ieee_fp_extension) {
+        for (i = 0; i < num_insns; i++) {
+            if (GET_ATTRIB(pkt->insn[i].opcode, A_HVX_IEEE_FP)) {
+                convert_to_nop(&pkt->insn[i]);
+            }
+        }
+    }
+
     pkt->encod_pkt_size_in_bytes = words_read * 4;
     pkt->pkt_has_hvx = false;
     for (i = 0; i < num_insns; i++) {
diff --git a/target/hexagon/translate.c b/target/hexagon/translate.c
index 8a223f6e13..9f8104f949 100644
--- a/target/hexagon/translate.c
+++ b/target/hexagon/translate.c
@@ -988,6 +988,7 @@ static void hexagon_tr_init_disas_context(DisasContextBase *dcbase,
     ctx->branch_cond = TCG_COND_NEVER;
     ctx->is_tight_loop = FIELD_EX32(hex_flags, TB_FLAGS, IS_TIGHT_LOOP);
     ctx->short_circuit = hex_cpu->short_circuit;
+    ctx->ieee_fp_extension = hex_cpu->ieee_fp_extension;
 }
 
 static void hexagon_tr_tb_start(DisasContextBase *db, CPUState *cpu)
-- 
2.37.2
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Taylor Simpson 1 week, 4 days ago
On Mon, Mar 23, 2026 at 7:15 AM Matheus Tavares Bernardino <
matheus.bernardino@oss.qualcomm.com> wrote:

> This flag will be used to control the HVX IEEE float instructions, which
> are only available on some Hexagon cores. When unavailable, the
> instruction is essentially treated as a no-op.
>
> Signed-off-by: Matheus Tavares Bernardino <
> matheus.bernardino@oss.qualcomm.com>
> ---
>  target/hexagon/cpu.h             |  1 +
>  target/hexagon/translate.h       |  1 +
>  target/hexagon/attribs_def.h.inc |  3 +++
>  target/hexagon/cpu.c             |  1 +
>  target/hexagon/decode.c          | 22 ++++++++++++++++++++++
>  target/hexagon/translate.c       |  1 +
>  6 files changed, 29 insertions(+)
>
>
> diff --git a/target/hexagon/decode.c b/target/hexagon/decode.c
> index dbc9c630e8..d832a64a17 100644
> --- a/target/hexagon/decode.c
> +++ b/target/hexagon/decode.c
> @@ -696,6 +696,18 @@ static bool pkt_has_write_conflict(Packet *pkt)
>      return !bitmap_empty(conflict, 32);
>  }
>
> +static void convert_to_nop(Insn *insn)
> +{
> +    bool is_endloop = insn->is_endloop;
> +    memset(insn, 0, sizeof(*insn));
> +    insn->opcode = A2_nop;
> +    insn->new_read_idx = -1;
> +    insn->dest_idx = -1;
> +    insn->generate = opcode_genptr[insn->opcode];
> +    insn->iclass = 0b111;
> +    insn->is_endloop = is_endloop;
> +}
> +
>  /*
>   * decode_packet
>   * Decodes packet with given words
> @@ -746,6 +758,16 @@ int decode_packet(DisasContext *ctx, int max_words,
> const uint32_t *words,
>          /* Ran out of words! */
>          return 0;
>      }
> +
> +    /* Disable HVX IEEE instruction if extension is disabled. */
> +    if (!ctx->ieee_fp_extension) {
> +        for (i = 0; i < num_insns; i++) {
> +            if (GET_ATTRIB(pkt->insn[i].opcode, A_HVX_IEEE_FP)) {
> +                convert_to_nop(&pkt->insn[i]);
> +            }
> +        }
> +    }
> +
>

Better to leave the instruction alone and turn it into a nop by not
generating any TCG.

That way, the disassembly (-d in_asm) will still show what's actually in
the binary.  You could add the check in gen_tcg_funcs.py.

You could also consider adding some sort of marker in the disassembly to
indicate that the flag is needed for the instruction to do anything.

Thanks,
Taylor
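
Editorial sketch of the suggestion above, under stated assumptions: the
opcode stays untouched, the generator simply emits nothing when the
extension is off, and the disassembler can ask the same predicate whether
to print a marker. Every name here (Ctx, Insn, insn_is_disabled, gen_insn,
disas_insn, the toy opcode table) is hypothetical and only stands in for
QEMU's DisasContext, Insn, GET_ATTRIB and the code produced by
gen_tcg_funcs.py. It is not the code from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified stand-ins for QEMU's opcode, Insn, DisasContext. */
typedef enum { OP_NOP, OP_VADD_SF, OP_COUNT } Opcode;

typedef struct {
    bool ieee_fp_extension; /* mirrors the "ieee-fp" CPU property */
    int ops_emitted;        /* stands in for emitted TCG ops */
} Ctx;

typedef struct {
    Opcode opcode;          /* never rewritten, so disassembly stays faithful */
} Insn;

/* Stand-in for GET_ATTRIB(opcode, A_HVX_IEEE_FP). */
static const bool is_hvx_ieee_fp[OP_COUNT] = { [OP_VADD_SF] = true };

static const char *const opcode_name[OP_COUNT] = {
    [OP_NOP] = "nop",
    [OP_VADD_SF] = "vadd_sf",
};

/* True when the instruction is in the binary but must do nothing. */
bool insn_is_disabled(const Ctx *ctx, const Insn *insn)
{
    assert(insn->opcode < OP_COUNT);
    return is_hvx_ieee_fp[insn->opcode] && !ctx->ieee_fp_extension;
}

/* Generator: emit nothing for a disabled instruction or a real nop. */
void gen_insn(Ctx *ctx, const Insn *insn)
{
    if (insn_is_disabled(ctx, insn) || insn->opcode == OP_NOP) {
        return;
    }
    ctx->ops_emitted++;
}

/* Disassembly reports the real opcode; a caller may append a marker. */
const char *disas_insn(const Insn *insn)
{
    assert(insn->opcode < OP_COUNT);
    return opcode_name[insn->opcode];
}
```

With the flag off, gen_insn() leaves ops_emitted unchanged while
disas_insn() still names the original instruction, which is the property
that rewriting the opcode to A2_nop at decode time loses.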
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Matheus Bernardino 1 week, 3 days ago
On Mon, Mar 23, 2026 at 4:33 PM Taylor Simpson <ltaylorsimpson@gmail.com> wrote:
>
>
>
> Better to leave the instruction alone and turn it into a nop by not generating any TCG.
>
> That way, the disassembly (-d in_asm) will still show what's actually in the binary.  You could add the check in gen_tcg_funcs.py.
>
> You could also consider adding some sort of marker in the disassembly to indicate that the flag is needed for the instruction to do anything.

Ah, good idea. Will do both for the next round, thanks.
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Taylor Simpson 1 week, 3 days ago
On Tue, Mar 24, 2026 at 10:52 AM Matheus Bernardino <
matheus.bernardino@oss.qualcomm.com> wrote:

> On Mon, Mar 23, 2026 at 4:33 PM Taylor Simpson <ltaylorsimpson@gmail.com>
> wrote:
> >
> >
> >
> > Better to leave the instruction alone and turn it into a nop by not
> generating any TCG.
> >
> > That way, the disassembly (-d in_asm) will still show what's actually in
> the binary.  You could add the check in gen_tcg_funcs.py.
> >
> > You could also consider adding some sort of marker in the disassembly to
> indicate that the flag is needed for the instruction to do anything.
>
> Ah, good idea. Will do both for the next round, thanks.
>

Note that we'll need to be careful with packets that use the result vector
in a .new context.  For example
    { V0.sf = vadd(V1.sf,V2.sf)
      vmem(R19+#0x0) = V0.new }
The problem is that the store wants to read the value from future_VRegs.
However, if the vadd is a nop, there is junk in future_VRegs.  So, we'll
either have to get the store to read from the real VRegs or have the vadd
copy the old value of the destination into the future_VRegs value.  The
first option will be more efficient because it will avoid the vector copy.

We should also add a test to fp_hvx_disabled for this case.

HTH,
Taylor
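
Editorial sketch of the first option described above (the .new consumer
falls back to the committed register when the producer was skipped),
assuming a toy machine model. The State layout, the 4-lane Vec, and the
function names are hypothetical and far smaller than real HVX state; only
the VRegs / future_VRegs naming comes from the message. A later message in
this thread reports different hardware behavior, so this shows the hazard
and one proposed fix, not necessarily the final design.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VREGS 32
#define LANES 4 /* toy vector width, not the real HVX size */

typedef struct { float lane[LANES]; } Vec;

typedef struct {
    Vec VRegs[NUM_VREGS];        /* committed register file */
    Vec future_VRegs[NUM_VREGS]; /* values produced inside the packet */
    bool ieee_fp_extension;
} State;

/*
 * Producer: Vd.sf = vadd(Vu.sf, Vv.sf).
 * Returns false, and leaves future_VRegs[vd] untouched (possibly junk),
 * when the extension is disabled.
 */
bool exec_vadd_sf(State *s, int vd, int vu, int vv)
{
    assert(vd >= 0 && vd < NUM_VREGS);
    assert(vu >= 0 && vu < NUM_VREGS);
    assert(vv >= 0 && vv < NUM_VREGS);
    if (!s->ieee_fp_extension) {
        return false;
    }
    for (int i = 0; i < LANES; i++) {
        s->future_VRegs[vd].lane[i] =
            s->VRegs[vu].lane[i] + s->VRegs[vv].lane[i];
    }
    return true;
}

/*
 * Consumer: vmem(...) = Vd.new.
 * Reads future_VRegs only if the producer really wrote it; otherwise it
 * reads the committed VRegs, which avoids an extra vector copy.
 */
void exec_store_new(const State *s, int vd, bool producer_wrote, Vec *mem)
{
    assert(vd >= 0 && vd < NUM_VREGS);
    *mem = producer_wrote ? s->future_VRegs[vd] : s->VRegs[vd];
}
```

The second option from the message would instead have exec_vadd_sf() copy
VRegs[vd] into future_VRegs[vd] before returning, at the cost of one
vector copy per skipped instruction.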
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Brian Cain 1 week, 3 days ago
On Tue, Mar 24, 2026 at 1:48 PM Taylor Simpson <ltaylorsimpson@gmail.com>
wrote:

>
>
> On Tue, Mar 24, 2026 at 10:52 AM Matheus Bernardino <
> matheus.bernardino@oss.qualcomm.com> wrote:
>
>> On Mon, Mar 23, 2026 at 4:33 PM Taylor Simpson <ltaylorsimpson@gmail.com>
>> wrote:
>> >
>> >
>> >
>> > Better to leave the instruction alone and turn it into a nop by not
>> generating any TCG.
>> >
>> > That way, the disassembly (-d in_asm) will still show what's actually
>> in the binary.  You could add the check in gen_tcg_funcs.py.
>> >
>> > You could also consider adding some sort of marker in the disassembly
>> to indicate that the flag is needed for the instruction to do anything.
>>
>> Ah, good idea. Will do both for the next round, thanks.
>>
>
> Note that we'll need to be careful with packets that use the result vector
> in a .new context.  For example
>     { V0.sf = vadd(V1.sf,V2.sf)
>       vmem(R19+#0x0) = V0.new }
> The problem is that the store wants to read the value from future_VRegs.
> However, if the vadd is a nop, there is junk in future_VRegs.  So, we'll
> either have to get the store to read from the real VRegs or have the vadd
> copy the old value of the destination into the future_VRegs value.  The
> first option will be more efficient because it will avoid the vector copy.
>
>
For the sake of ease-of-verification we'll want to do whatever the ISS
does.  It's not very obvious to me what it would do in this packet context
based on the description of the nop-like behavior, but we'll follow the
ISS' lead.  In practical terms the garbage in future_VRegs is probably just
as bad or good as any other value - if you bothered to execute this packet
on the target w/o support for this opcode you probably don't care much
about the result.



> We should also add a test to fp_hvx_disabled for this case.
>
> HTH,
> Taylor
>
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Taylor Simpson 1 week, 3 days ago
On Tue, Mar 24, 2026 at 1:21 PM Brian Cain <brian.cain@oss.qualcomm.com>
wrote:

>
>
> On Tue, Mar 24, 2026 at 1:48 PM Taylor Simpson <ltaylorsimpson@gmail.com>
> wrote:
>
>>
>>
>> On Tue, Mar 24, 2026 at 10:52 AM Matheus Bernardino <
>> matheus.bernardino@oss.qualcomm.com> wrote:
>>
>>> On Mon, Mar 23, 2026 at 4:33 PM Taylor Simpson <ltaylorsimpson@gmail.com>
>>> wrote:
>>> >
>>> >
>>> >
>>> > Better to leave the instruction alone and turn it into a nop by not
>>> generating any TCG.
>>> >
>>> > That way, the disassembly (-d in_asm) will still show what's actually
>>> in the binary.  You could add the check in gen_tcg_funcs.py.
>>> >
>>> > You could also consider adding some sort of marker in the disassembly
>>> to indicate that the flag is needed for the instruction to do anything.
>>>
>>> Ah, good idea. Will do both for the next round, thanks.
>>>
>>
>> Note that we'll need to be careful with packets that use the result
>> vector in a .new context.  For example
>>     { V0.sf = vadd(V1.sf,V2.sf)
>>       vmem(R19+#0x0) = V0.new }
>> The problem is that the store wants to read the value from future_VRegs.
>> However, if the vadd is a nop, there is junk in future_VRegs.  So, we'll
>> either have to get the store to read from the real VRegs or have the vadd
>> copy the old value of the destination into the future_VRegs value.  The
>> first option will be more efficient because it will avoid the vector copy.
>>
>>
> For the sake of ease-of-verification we'll want to do whatever the ISS
> does.  It's not very obvious to me what it would do in this packet context
> based on the description of the nop-like behavior, but we'll follow the
> ISS' lead.  In practical terms the garbage in future_VRegs is probably just
> as bad or good as any other value - if you bothered to execute this packet
> on the target w/o support for this opcode you probably don't care much
> about the result.
>

I'll be interested to know what the ISS and hardware do in this case.

Thanks,
Taylor
Re: [PATCH 03/13] target/hexagon/cpu: add HVX IEEE FP extension
Posted by Matheus Bernardino 1 day, 23 hours ago
On Tue, Mar 24, 2026 at 4:46 PM Taylor Simpson <ltaylorsimpson@gmail.com> wrote:
>
>
>
> On Tue, Mar 24, 2026 at 1:21 PM Brian Cain <brian.cain@oss.qualcomm.com> wrote:
>>
>>
>>
>> On Tue, Mar 24, 2026 at 1:48 PM Taylor Simpson <ltaylorsimpson@gmail.com> wrote:
>>>
>>>
>>>
>>> On Tue, Mar 24, 2026 at 10:52 AM Matheus Bernardino <matheus.bernardino@oss.qualcomm.com> wrote:
>>>>
>>>> On Mon, Mar 23, 2026 at 4:33 PM Taylor Simpson <ltaylorsimpson@gmail.com> wrote:
>>>> >
>>>> > Better to leave the instruction alone and turn it into a nop by not generating any TCG.
>>>> >
>>>> > That way, the disassembly (-d in_asm) will still show what's actually in the binary.  You could add the check in gen_tcg_funcs.py.
>>>> >
>>>> > You could also consider adding some sort of marker in the disassembly to indicate that the flag is needed for the instruction to do anything.
>>>>
>>>> Ah, good idea. Will do both for the next round, thanks.
>>>
>>>
>>> Note that we'll need to be careful with packets that use the result vector in a .new context.  For example
>>>     { V0.sf = vadd(V1.sf,V2.sf)
>>>       vmem(R19+#0x0) = V0.new }
>>> The problem is that the store wants to read the value from future_VRegs.  However, if the vadd is a nop, there is junk in future_VRegs.  So, we'll either have to get the store to read from the real VRegs or have the vadd copy the old value of the destination into the future_VRegs value.  The first option will be more efficient because it will avoid the vector copy.
>>>
>>
>> For the sake of ease-of-verification we'll want to do whatever the ISS does.  It's not very obvious to me what it would do in this packet context based on the description of the nop-like behavior, but we'll follow the ISS' lead.  In practical terms the garbage in future_VRegs is probably just as bad or good as any other value - if you bothered to execute this packet on the target w/o support for this opcode you probably don't care much about the result.
>
>
> I'll be interested to know what the ISS and hardware do in this case.


Interesting, just found out that the hardware will assign zero to the
target register. Will change the code accordingly for v2.
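
Editorial sketch of the behavior reported above, assuming a toy machine
model: when the extension is disabled, the instruction writes zero to its
destination instead of doing nothing, so a .new consumer in the same
packet and the later commit both see a well-defined value. The State
layout, the 4-lane Vec, and the function names are hypothetical; this is
not the v2 code.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_VREGS 32
#define LANES 4 /* toy vector width, not the real HVX size */

typedef struct { float lane[LANES]; } Vec;

typedef struct {
    Vec VRegs[NUM_VREGS];        /* committed register file */
    Vec future_VRegs[NUM_VREGS]; /* values produced inside the packet */
    bool ieee_fp_extension;
} State;

/* Vd.sf = vadd(Vu.sf, Vv.sf); zeroes Vd when the extension is disabled. */
void exec_vadd_sf(State *s, int vd, int vu, int vv)
{
    assert(vd >= 0 && vd < NUM_VREGS);
    assert(vu >= 0 && vu < NUM_VREGS);
    assert(vv >= 0 && vv < NUM_VREGS);
    for (int i = 0; i < LANES; i++) {
        s->future_VRegs[vd].lane[i] = s->ieee_fp_extension
            ? s->VRegs[vu].lane[i] + s->VRegs[vv].lane[i]
            : 0.0f;
    }
}

/* End-of-packet commit of one destination register. */
void commit_vreg(State *s, int vd)
{
    assert(vd >= 0 && vd < NUM_VREGS);
    s->VRegs[vd] = s->future_VRegs[vd];
}
```

Because the destination is always written, future_VRegs never holds junk
and the store in the { vadd; vmem = V0.new } packet needs no special case.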