ARM: ptw.c:S1_ptw_translate

Sid Manning posted 1 patch 1 year, 3 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/BYAPR02MB550905E891B95879D05846B9BEF59@BYAPR02MB5509.namprd02.prod.outlook.com
Maintainers: Peter Maydell <peter.maydell@linaro.org>
ARM: ptw.c:S1_ptw_translate
Posted by Sid Manning 1 year, 3 months ago
ptw.c:S1_ptw_translate

After migrating to v7.2.0, an issue was found where we were not getting the correct virtual address from a load insn.  Reading the address used in the load insn from the debugger resulted in the execution of the insn getting the correct value but simply stepping over the insn did not.

This is the instruction:
ldr           x0, [x1, #24]

The debug path varies based on the regime and if regime is NOT stage two out_phys is set to addr if the regime is stage 2 then out_phys is set to s2.f.phys_addr.  In the non-debug path out_phys is always set to full->phys_addr.

I got around this by only using full->phys_addr if regime_is_stage2 was true:

diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 3745ac9723..87bc6754a6 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -266,7 +266,12 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
         if (unlikely(flags & TLB_INVALID_MASK)) {
             goto fail;
         }
-        ptw->out_phys = full->phys_addr;
+
+        if (regime_is_stage2(s2_mmu_idx)) {
+            ptw->out_phys = full->phys_addr;
+        } else {
+            ptw->out_phys = addr;
+        }
         ptw->out_rw = full->prot & PAGE_WRITE;
         pte_attrs = full->pte_attrs;
         pte_secure = full->attrs.secure;

This change got me the answer I wanted but I'm not familiar enough with the code to know if this is correct or not.

Re: ARM: ptw.c:S1_ptw_translate
Posted by Richard Henderson 1 year, 3 months ago
On 1/4/23 08:55, Sid Manning wrote:
> ptw.c:S1_ptw_translate
> 
> After migrating to v7.2.0, an issue was found where we were not getting the correct 
> virtual address from a load insn.  Reading the address used in the load insn from the 
> debugger resulted in the execution of the insn getting the correct value but simply 
> stepping over the insn did not.
> 
> This is the instruction:
> 
> ldr           x0, [x1, #24]
> 
> The debug path varies based on the regime and if regime is NOT stage two out_phys is set 
> to addr if the regime is stage 2 then out_phys is set to s2.f.phys_addr.  In the non-debug 
> path out_phys is always set to full->phys_addr.
> 
> I got around this by only using full->phys_addr if regime_is_stage2 was true:
> 
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> 
> index 3745ac9723..87bc6754a6 100644
> 
> --- a/target/arm/ptw.c
> 
> +++ b/target/arm/ptw.c
> 
> @@ -266,7 +266,12 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
> 
>           if (unlikely(flags & TLB_INVALID_MASK)) {
> 
>               goto fail;
> 
>           }
> 
> -        ptw->out_phys = full->phys_addr;
> 
> +
> 
> +        if (regime_is_stage2(s2_mmu_idx)) {
> 
> +            ptw->out_phys = full->phys_addr;
> 
> +        } else {
> 
> +            ptw->out_phys = addr;
> 
> +        }
> 
>           ptw->out_rw = full->prot & PAGE_WRITE;
> 
>           pte_attrs = full->pte_attrs;
> 
>           pte_secure = full->attrs.secure;
> 
> This change got me the answer I wanted but I’m not familiar enough with the code to know 
> if this is correct or not.

This is incorrect.  If you are getting the wrong value here, then something has gone wrong 
elsewhere, as the s2_mmu_idx result was logged.

Do you have a test case you can share?


r~

RE: ARM: ptw.c:S1_ptw_translate
Posted by Sid Manning 1 year, 3 months ago
> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Wednesday, January 4, 2023 11:42 PM
> To: Sid Manning <sidneym@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@linaro.org; Mark Burton <mburton@qti.qualcomm.com>
> Subject: Re: ARM: ptw.c:S1_ptw_translate
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
> 
> On 1/4/23 08:55, Sid Manning wrote:
> > ptw.c:S1_ptw_translate
> >
> > After migrating to v7.2.0, an issue was found where we were not
> > getting the correct virtual address from a load insn.  Reading the
> > address used in the load insn from the debugger resulted in the
> > execution of the insn getting the correct value but simply stepping over the
> insn did not.
> >
> > This is the instruction:
> >
> > ldr           x0, [x1, #24]
> >
> > The debug path varies based on the regime and if regime is NOT stage
> > two out_phys is set to addr if the regime is stage 2 then out_phys is
> > set to s2.f.phys_addr.  In the non-debug path out_phys is always set to full-
> >phys_addr.
> >
> > I got around this by only using full->phys_addr if regime_is_stage2 was
> true:
> >
> > diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> >
> > index 3745ac9723..87bc6754a6 100644
> >
> > --- a/target/arm/ptw.c
> >
> > +++ b/target/arm/ptw.c
> >
> > @@ -266,7 +266,12 @@ static bool S1_ptw_translate(CPUARMState *env,
> > S1Translate *ptw,
> >
> >           if (unlikely(flags & TLB_INVALID_MASK)) {
> >
> >               goto fail;
> >
> >           }
> >
> > -        ptw->out_phys = full->phys_addr;
> >
> > +
> >
> > +        if (regime_is_stage2(s2_mmu_idx)) {
> >
> > +            ptw->out_phys = full->phys_addr;
> >
> > +        } else {
> >
> > +            ptw->out_phys = addr;
> >
> > +        }
> >
> >           ptw->out_rw = full->prot & PAGE_WRITE;
> >
> >           pte_attrs = full->pte_attrs;
> >
> >           pte_secure = full->attrs.secure;
> >
> > This change got me the answer I wanted but I’m not familiar enough
> > with the code to know if this is correct or not.
> 
> This is incorrect.  If you are getting the wrong value here, then something has
> gone wrong elsewhere, as the s2_mmu_idx result was logged.
> 
> Do you have a test case you can share?

This happens while booting QNX so I can't share it.  I don't have the source code either just the object code.  A number of cores are being started and the address happens to be what will eventually become the stack.

I'll see what I can come up with to better characterize is problem.

Thanks,
> 
> 
> r~
RE: ARM: ptw.c:S1_ptw_translate
Posted by Sid Manning 1 year, 3 months ago

> -----Original Message-----
> From: qemu-devel-bounces+sidneym=quicinc.com@nongnu.org <qemu-
> devel-bounces+sidneym=quicinc.com@nongnu.org> On Behalf Of Sid
> Manning
> Sent: Thursday, January 5, 2023 7:08 PM
> To: 'Richard Henderson' <richard.henderson@linaro.org>; qemu-
> devel@nongnu.org
> Cc: philmd@linaro.org; Mark Burton <mburton@qti.qualcomm.com>
> Subject: RE: ARM: ptw.c:S1_ptw_translate
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
> 
> > -----Original Message-----
> > From: Richard Henderson <richard.henderson@linaro.org>
> > Sent: Wednesday, January 4, 2023 11:42 PM
> > To: Sid Manning <sidneym@quicinc.com>; qemu-devel@nongnu.org
> > Cc: philmd@linaro.org; Mark Burton <mburton@qti.qualcomm.com>
> > Subject: Re: ARM: ptw.c:S1_ptw_translate
> >
> > WARNING: This email originated from outside of Qualcomm. Please be
> > wary of any links or attachments, and do not enable macros.
> >
> > On 1/4/23 08:55, Sid Manning wrote:
> > > ptw.c:S1_ptw_translate
> > >
> > > After migrating to v7.2.0, an issue was found where we were not
> > > getting the correct virtual address from a load insn.  Reading the
> > > address used in the load insn from the debugger resulted in the
> > > execution of the insn getting the correct value but simply stepping
> > > over the
> > insn did not.
> > >
> > > This is the instruction:
> > >
> > > ldr           x0, [x1, #24]
> > >
> > > The debug path varies based on the regime and if regime is NOT stage
> > >two out_phys is set to addr if the regime is stage 2 then out_phys is
> > >set to s2.f.phys_addr.  In the non-debug path out_phys is always set
> > >to full- phys_addr.
> > >
> > > I got around this by only using full->phys_addr if regime_is_stage2
> > > was
> > true:
> > >
> > > diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> > >
> > > index 3745ac9723..87bc6754a6 100644
> > >
> > > --- a/target/arm/ptw.c
> > >
> > > +++ b/target/arm/ptw.c
> > >
> > > @@ -266,7 +266,12 @@ static bool S1_ptw_translate(CPUARMState
> *env,
> > > S1Translate *ptw,
> > >
> > >           if (unlikely(flags & TLB_INVALID_MASK)) {
> > >
> > >               goto fail;
> > >
> > >           }
> > >
> > > -        ptw->out_phys = full->phys_addr;
> > >
> > > +
> > >
> > > +        if (regime_is_stage2(s2_mmu_idx)) {
> > >
> > > +            ptw->out_phys = full->phys_addr;
> > >
> > > +        } else {
> > >
> > > +            ptw->out_phys = addr;
> > >
> > > +        }
> > >
> > >           ptw->out_rw = full->prot & PAGE_WRITE;
> > >
> > >           pte_attrs = full->pte_attrs;
> > >
> > >           pte_secure = full->attrs.secure;
> > >
> > > This change got me the answer I wanted but I’m not familiar enough
> > > with the code to know if this is correct or not.
> >
> > This is incorrect.  If you are getting the wrong value here, then
> > something has gone wrong elsewhere, as the s2_mmu_idx result was
> logged.
> >
> > Do you have a test case you can share?
> 
> This happens while booting QNX so I can't share it.  I don't have the source
> code either just the object code.  A number of cores are being started and
> the address happens to be what will eventually become the stack.
> 
> I'll see what I can come up with to better characterize is problem.

I have not been able to isolate the cause of this issue.  I have pulled in recent updates to ptw.c/cputlb.c/tlb_helper.c to see if one of the recent changes would help but they have not.

I'm running the same QNX images between a version of QEMU based on 7.1 and another based on 7.2.  QEMU has been patched to allow it to integrate into a System-C Virtual Platform but this problem seems to be contained in QEMU.

I defined DEBUG_TLB cputlb.c and set a breakpoint in tlb_add_large_page.  

On 7.1 I see consistent PA to VA mappings:

Thread 10 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xeb3360, mmu_idx=0xa, vaddr=0xffffff809975f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1079
tlb_set_page_with_attrs: vaddr=ffffff809975f000 paddr=0x0000000f35f3a000 prot=3 idx=10
Thread 13 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xee26e0, mmu_idx=0xa, vaddr=0xffffff809975f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1079
tlb_set_page_with_attrs: vaddr=ffffff809975f000 paddr=0x0000000f35f3a000 prot=3 idx=10


On 7.2 VA to PA mappings are not consistent:

 Thread 10 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xeb7ac0, mmu_idx=0x2, vaddr=0xffffff809977f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1090
tlb_set_page_full: vaddr=ffffff809977f000 paddr=0x0000000f35f32000 prot=3 idx=2
Thread 14 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xf185e0, mmu_idx=0x2, vaddr=0xffffff809977f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1090
tlb_set_page_full: vaddr=ffffff809977f000 paddr=0x0000000f42a16000 prot=3 idx=2

Using the monitor to view the memory I see that on 7.2 the first entry appears to be accurate.
xp /2x 0x0000000f35f32018
0000000f35f32018: 0x9977eff0 0xffffff80

And the second is not:
xp /2x 0x0000000f42a16018
0000000f42a16018: 0x00000000 0x00000000

7.2 is calling arm_cpu_tlb_fill more often now and I don't know if that is related to the problem I'm seeing or a natural result of the changes made to S1_ptw_translate between the releases.


> 
> Thanks,
> >
> >
> > r~
Re: ARM: ptw.c:S1_ptw_translate
Posted by Richard Henderson 1 year, 3 months ago
Please try the following.  It's essentially the same bug I had for mte.
I've just realized that the testing I did under Linux with virtualization=on
was insufficient -- this path won't be exercised without KVM under TCG.


diff --git a/target/arm/ptw.c b/target/arm/ptw.c
index 57f3615a66..2b125fff44 100644
--- a/target/arm/ptw.c
+++ b/target/arm/ptw.c
@@ -266,7 +266,7 @@ static bool S1_ptw_translate(CPUARMState *env, S1Translate *ptw,
          if (unlikely(flags & TLB_INVALID_MASK)) {
              goto fail;
          }
-        ptw->out_phys = full->phys_addr;
+        ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
          ptw->out_rw = full->prot & PAGE_WRITE;
          pte_attrs = full->pte_attrs;
          pte_secure = full->attrs.secure;



r~
RE: ARM: ptw.c:S1_ptw_translate
Posted by Sid Manning 1 year, 3 months ago

> -----Original Message-----
> From: Richard Henderson <richard.henderson@linaro.org>
> Sent: Thursday, January 26, 2023 3:48 PM
> To: Sid Manning <sidneym@quicinc.com>; qemu-devel@nongnu.org
> Cc: philmd@linaro.org; Mark Burton <mburton@qti.qualcomm.com>; Alex
> Bennée <alex.bennee@linaro.org>
> Subject: Re: ARM: ptw.c:S1_ptw_translate
> 
> WARNING: This email originated from outside of Qualcomm. Please be wary
> of any links or attachments, and do not enable macros.
> 
> Please try the following.  It's essentially the same bug I had for mte.
> I've just realized that the testing I did under Linux with virtualization=on was
> insufficient -- this path won't be exercised without KVM under TCG.
> 
> 
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c index
> 57f3615a66..2b125fff44 100644
> --- a/target/arm/ptw.c
> +++ b/target/arm/ptw.c
> @@ -266,7 +266,7 @@ static bool S1_ptw_translate(CPUARMState *env,
> S1Translate *ptw,
>           if (unlikely(flags & TLB_INVALID_MASK)) {
>               goto fail;
>           }
> -        ptw->out_phys = full->phys_addr;
> +        ptw->out_phys = full->phys_addr | (addr & ~TARGET_PAGE_MASK);
>           ptw->out_rw = full->prot & PAGE_WRITE;
>           pte_attrs = full->pte_attrs;
>           pte_secure = full->attrs.secure;
> 
This change works!  Thank you for taking the time to look into it.

> 
> 
> r~
Re: ARM: ptw.c:S1_ptw_translate
Posted by Richard Henderson 1 year, 3 months ago
On 1/25/23 13:27, Sid Manning wrote:
> On 7.2 VA to PA mappings are not consistent:
> 
>   Thread 10 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xeb7ac0, mmu_idx=0x2, vaddr=0xffffff809977f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1090
> tlb_set_page_full: vaddr=ffffff809977f000 paddr=0x0000000f35f32000 prot=3 idx=2
> Thread 14 "vp" hit Breakpoint 1, tlb_add_large_page (env=0xf185e0, mmu_idx=0x2, vaddr=0xffffff809977f000, size=0x1000) at ../../../../../../src/qemu/accel/tcg/cputlb.c:1090
> tlb_set_page_full: vaddr=ffffff809977f000 paddr=0x0000000f42a16000 prot=3 idx=2
> 
> Using the monitor to view the memory I see that on 7.2 the first entry appears to be accurate.
> xp /2x 0x0000000f35f32018
> 0000000f35f32018: 0x9977eff0 0xffffff80
> 
> And the second is not:
> xp /2x 0x0000000f42a16018
> 0000000f42a16018: 0x00000000 0x00000000
> 
> 7.2 is calling arm_cpu_tlb_fill more often now and I don't know if that is related to the problem I'm seeing or a natural result of the changes made to S1_ptw_translate between the releases.

Well, there are more calls to tlb_fill, since we're now also using tlb_fill for the stage2 
translation, and for the translation tables themselves.  It's possible that there's a bug 
in the stage2 tlb flushing that wouldn't have been visible before (and also not visible 
from the monitor, since that avoids tlb_fill entirely).

While it would still be handier to have a test case, the next best thing may be for me to 
add some tracepoints within ptw.c.  I'll work on that later today or tomorrow.


r~
Re: ARM: ptw.c:S1_ptw_translate
Posted by Philippe Mathieu-Daudé 1 year, 3 months ago
Cc'ing Richard & qemu-arm list.

On 4/1/23 17:55, Sid Manning wrote:
> ptw.c:S1_ptw_translate
> 
> After migrating to v7.2.0, an issue was found where we were not getting 
> the correct virtual address from a load insn.  Reading the address used 
> in the load insn from the debugger resulted in the execution of the insn 
> getting the correct value but simply stepping over the insn did not.
> 
> This is the instruction:
> 
> ldr           x0, [x1, #24]
> 
> The debug path varies based on the regime and if regime is NOT stage two 
> out_phys is set to addr if the regime is stage 2 then out_phys is set to 
> s2.f.phys_addr.  In the non-debug path out_phys is always set to 
> full->phys_addr.
> 
> I got around this by only using full->phys_addr if regime_is_stage2 was 
> true:
> 
> diff --git a/target/arm/ptw.c b/target/arm/ptw.c
> 
> index 3745ac9723..87bc6754a6 100644
> 
> --- a/target/arm/ptw.c
> 
> +++ b/target/arm/ptw.c
> 
> @@ -266,7 +266,12 @@ static bool S1_ptw_translate(CPUARMState *env, 
> S1Translate *ptw,
> 
>           if (unlikely(flags & TLB_INVALID_MASK)) {
> 
>               goto fail;
> 
>           }
> 
> -        ptw->out_phys = full->phys_addr;
> 
> +
> 
> +        if (regime_is_stage2(s2_mmu_idx)) {
> 
> +            ptw->out_phys = full->phys_addr;
> 
> +        } else {
> 
> +            ptw->out_phys = addr;
> 
> +        }
> 
>           ptw->out_rw = full->prot & PAGE_WRITE;
> 
>           pte_attrs = full->pte_attrs;
> 
>           pte_secure = full->attrs.secure;
> 
> This change got me the answer I wanted but I’m not familiar enough with 
> the code to know if this is correct or not.
>