[PATCH] accel/tcg: Fix guest instruction address in output assembly log

Matt Borgerson posted 1 patch 9 months, 2 weeks ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230718013531.1669100-1-contact@mborgerson.com
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>
accel/tcg/translate-all.c | 22 ++++++++++++++++++++--
1 file changed, 20 insertions(+), 2 deletions(-)
[PATCH] accel/tcg: Fix guest instruction address in output assembly log
Posted by Matt Borgerson 9 months, 2 weeks ago
If CF_PCREL is enabled, generated host assembly logging (command line
option `-d out_asm`) may incorrectly report guest instruction virtual
addresses as page offsets instead of absolute addresses. This patch
corrects the reported guest address.

Signed-off-by: Matt Borgerson <contact@mborgerson.com>
---
 accel/tcg/translate-all.c | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
index a1782db5dd..859db95cf7 100644
--- a/accel/tcg/translate-all.c
+++ b/accel/tcg/translate-all.c
@@ -283,6 +283,24 @@ static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
     return tcg_gen_code(tcg_ctx, tb, pc);
 }
 
+static vaddr get_guest_insn_vaddr(TranslationBlock *tb, vaddr pc, size_t insn)
+{
+    g_assert(insn < tb->icount);
+
+    /* FIXME: This replicates the restore_state_to_opc() logic. */
+    vaddr addr = tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS];
+
+    if (tb_cflags(tb) & CF_PCREL) {
+        addr |= (pc & TARGET_PAGE_MASK);
+    } else {
+#if defined(TARGET_I386)
+        addr -= tb->cs_base;
+#endif
+    }
+
+    return addr;
+}
+
 /* Called with mmap_lock held for user mode emulation.  */
 TranslationBlock *tb_gen_code(CPUState *cpu,
                               vaddr pc, uint64_t cs_base,
@@ -458,7 +476,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
             fprintf(logfile, "OUT: [size=%d]\n", gen_code_size);
             fprintf(logfile,
                     "  -- guest addr 0x%016" PRIx64 " + tb prologue\n",
-                    tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS]);
+                    get_guest_insn_vaddr(tb, pc, insn));
             chunk_start = tcg_ctx->gen_insn_end_off[insn];
             disas(logfile, tb->tc.ptr, chunk_start);
 
@@ -471,7 +489,7 @@ TranslationBlock *tb_gen_code(CPUState *cpu,
                 size_t chunk_end = tcg_ctx->gen_insn_end_off[insn];
                 if (chunk_end > chunk_start) {
                     fprintf(logfile, "  -- guest addr 0x%016" PRIx64 "\n",
-                            tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS]);
+                            get_guest_insn_vaddr(tb, pc, insn));
                     disas(logfile, tb->tc.ptr + chunk_start,
                           chunk_end - chunk_start);
                     chunk_start = chunk_end;
-- 
2.34.1
Re: [PATCH] accel/tcg: Fix guest instruction address in output assembly log
Posted by Richard Henderson 9 months, 2 weeks ago
On 7/18/23 02:35, Matt Borgerson wrote:
> If CF_PCREL is enabled, generated host assembly logging (command line
> option `-d out_asm`) may incorrectly report guest instruction virtual
> addresses as page offsets instead of absolute addresses. This patch
> corrects the reported guest address.
> 
> Signed-off-by: Matt Borgerson <contact@mborgerson.com>
> ---
>   accel/tcg/translate-all.c | 22 ++++++++++++++++++++--
>   1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index a1782db5dd..859db95cf7 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -283,6 +283,24 @@ static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
>       return tcg_gen_code(tcg_ctx, tb, pc);
>   }
>   
> +static vaddr get_guest_insn_vaddr(TranslationBlock *tb, vaddr pc, size_t insn)
> +{
> +    g_assert(insn < tb->icount);
> +
> +    /* FIXME: This replicates the restore_state_to_opc() logic. */
> +    vaddr addr = tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS];
> +
> +    if (tb_cflags(tb) & CF_PCREL) {
> +        addr |= (pc & TARGET_PAGE_MASK);
> +    } else {
> +#if defined(TARGET_I386)
> +        addr -= tb->cs_base;
> +#endif
> +    }

I disagree with this.  The only bug I see is

>                       "  -- guest addr 0x%016" PRIx64 " + tb prologue\n",

"guest addr", which makes you believe this to be a guest virtual address.

I think it is important to log what is actually in the data structures, which is the page 
offset.

Why are you so keen to have the virtual address?  Why is this more reasonable than the 
physical address, or anything else?


r~
Re: [PATCH] accel/tcg: Fix guest instruction address in output assembly log
Posted by Matt Borgerson 9 months, 1 week ago
On Sat, Jul 22, 2023 at 3:11 AM Richard Henderson
<richard.henderson@linaro.org> wrote:
>
> On 7/18/23 02:35, Matt Borgerson wrote:
> > If CF_PCREL is enabled, generated host assembly logging (command line
> > option `-d out_asm`) may incorrectly report guest instruction virtual
> > addresses as page offsets instead of absolute addresses. This patch
> > corrects the reported guest address.
> >
> > Signed-off-by: Matt Borgerson <contact@mborgerson.com>
> > ---
> >   accel/tcg/translate-all.c | 22 ++++++++++++++++++++--
> >   1 file changed, 20 insertions(+), 2 deletions(-)
> >
> > diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> > index a1782db5dd..859db95cf7 100644
> > --- a/accel/tcg/translate-all.c
> > +++ b/accel/tcg/translate-all.c
> > @@ -283,6 +283,24 @@ static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
> >       return tcg_gen_code(tcg_ctx, tb, pc);
> >   }
> >
> > +static vaddr get_guest_insn_vaddr(TranslationBlock *tb, vaddr pc, size_t insn)
> > +{
> > +    g_assert(insn < tb->icount);
> > +
> > +    /* FIXME: This replicates the restore_state_to_opc() logic. */
> > +    vaddr addr = tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS];
> > +
> > +    if (tb_cflags(tb) & CF_PCREL) {
> > +        addr |= (pc & TARGET_PAGE_MASK);
> > +    } else {
> > +#if defined(TARGET_I386)
> > +        addr -= tb->cs_base;
> > +#endif
> > +    }
>
> I disagree with this.  The only bug I see is
>
> >                       "  -- guest addr 0x%016" PRIx64 " + tb prologue\n",
>
> "guest addr", which makes you believe this to be a guest virtual address.
>
> I think it is important to log what is actually in the data structures, which is the page
> offset.
>
> Why are you so keen to have the virtual address?  Why is this more reasonable than the
> physical address, or anything else?
>
>
> r~

Hi Richard--

Thanks for the review.

> I disagree with this.  The only bug I see is
>
> >                       "  -- guest addr 0x%016" PRIx64 " + tb prologue\n",
>
> "guest addr", which makes you believe this to be a guest virtual address.

Yes, and because this was effectively the behavior of these log
messages prior to the introduction of PCREL.

> I think it is important to log what is actually in the data structures, which is the page
> offset.
>
> Why are you so keen to have the virtual address?

Printing virtual addresses with host translation allows me to plainly
see (and grep for) how translation-provoking guest instructions map to
corresponding translations without needing to cross-reference against
additional logging, in_asm or trace:translate_block for example. I
agree logging 'what is actually in the data structures' is important,
but page offset alone has limited utility when debugging translation
issues without additional logging to provide context.

> Why is this more reasonable than the physical address, or anything else?

For the same reason virtual addresses are used when logging input
block assembly, performance metrics, trace events, etc.


An oversight with this patch is that raw start words are still printed
with `OP*` log items--ideally those log items would be updated
accordingly, if that is the decision.

Ultimately I'm also fine with pulling the 'guest addr' message text
and only printing the raw start words, but would prefer not to require
extra cross-referencing. If this is the decision, feel free to drop
the patch.

Thanks,
Matt
Re: [PATCH] accel/tcg: Fix guest instruction address in output assembly log
Posted by Philippe Mathieu-Daudé 9 months, 2 weeks ago
Hi Matt,

On 18/7/23 03:35, Matt Borgerson wrote:
> If CF_PCREL is enabled, generated host assembly logging (command line
> option `-d out_asm`) may incorrectly report guest instruction virtual
> addresses as page offsets instead of absolute addresses. This patch
> corrects the reported guest address.
> 
> Signed-off-by: Matt Borgerson <contact@mborgerson.com>
> ---
>   accel/tcg/translate-all.c | 22 ++++++++++++++++++++--
>   1 file changed, 20 insertions(+), 2 deletions(-)
> 
> diff --git a/accel/tcg/translate-all.c b/accel/tcg/translate-all.c
> index a1782db5dd..859db95cf7 100644
> --- a/accel/tcg/translate-all.c
> +++ b/accel/tcg/translate-all.c
> @@ -283,6 +283,24 @@ static int setjmp_gen_code(CPUArchState *env, TranslationBlock *tb,
>       return tcg_gen_code(tcg_ctx, tb, pc);
>   }
>   
> +static vaddr get_guest_insn_vaddr(TranslationBlock *tb, vaddr pc, size_t insn)
> +{
> +    g_assert(insn < tb->icount);
> +
> +    /* FIXME: This replicates the restore_state_to_opc() logic. */

Hmmm maybe we could have a public helper defined in accel/tcg/cpu-exec.c
so we cat re-use it in accel/tcg/perf.c::perf_report_code().

> +    vaddr addr = tcg_ctx->gen_insn_data[insn * TARGET_INSN_START_WORDS];
> +
> +    if (tb_cflags(tb) & CF_PCREL) {
> +        addr |= (pc & TARGET_PAGE_MASK);
> +    } else {
> +#if defined(TARGET_I386)
> +        addr -= tb->cs_base;
> +#endif
> +    }
> +
> +    return addr;
> +}