[PATCH 0/3] target/ppc: fix tlb flushing race

Nicholas Piggin posted 3 patches 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20240328053131.2604454-1-npiggin@gmail.com
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>
docs/devel/multi-thread-tcg.rst |  13 +--
include/exec/exec-all.h         |  97 ++++-----------------
accel/tcg/cputlb.c              | 145 ++------------------------------
target/ppc/helper_regs.c        |   2 +-
target/ppc/mmu_helper.c         |   2 +-
5 files changed, 30 insertions(+), 229 deletions(-)
[PATCH 0/3] target/ppc: fix tlb flushing race
Posted by Nicholas Piggin 1 month ago
ppc broadcast tlb flushes should be synchronised with other vCPUs,
like all other architectures that support such operations seem to
be doing.

Fixing ppc removes the last caller of the non-synced TLB flush
variants, we can remove some dead code. I'd like to merge patch 1
for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
someone prefers to remove the dead code asap).

Thanks,
Nick

Nicholas Piggin (3):
  target/ppc: Fix broadcast tlbie synchronisation
  tcg/cputlb: Remove non-synced variants of global TLB flushes
  tcg/cputlb: remove other-cpu capability from TLB flushing

 docs/devel/multi-thread-tcg.rst |  13 +--
 include/exec/exec-all.h         |  97 ++++-----------------
 accel/tcg/cputlb.c              | 145 ++------------------------------
 target/ppc/helper_regs.c        |   2 +-
 target/ppc/mmu_helper.c         |   2 +-
 5 files changed, 30 insertions(+), 229 deletions(-)

-- 
2.42.0
Re: [PATCH 0/3] target/ppc: fix tlb flushing race
Posted by Nicholas Piggin 1 month ago
On Thu Mar 28, 2024 at 3:31 PM AEST, Nicholas Piggin wrote:
> ppc broadcast tlb flushes should be synchronised with other vCPUs,
> like all other architectures that support such operations seem to
> be doing.
>
> Fixing ppc removes the last caller of the non-synced TLB flush
> variants, we can remove some dead code. I'd like to merge patch 1
> for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
> someone prefers to remove the dead code asap).

Hmm, turns out to not be so simple, this in parts reverts
the fix in commit 4ddc104689b. Do other architectures
that use the _synced TLB flush variants have that same problem
with the TLB flush not actually flushing until the TB ends,
I wonder?

AFAIKS it seems like the right fix would be to use _synced, but
force a new TB at the end of the TLB flush instruction so the
flush will take effect on all CPUs before the next instruction?

In any case this is tricky enough and I only hit it with a
test program, so I'll leave it out of 9.0.

Thanks,
Nick
Re: [PATCH 0/3] target/ppc: fix tlb flushing race
Posted by Nicholas Piggin 1 month ago
On Thu Mar 28, 2024 at 6:12 PM AEST, Nicholas Piggin wrote:
> On Thu Mar 28, 2024 at 3:31 PM AEST, Nicholas Piggin wrote:
> > ppc broadcast tlb flushes should be synchronised with other vCPUs,
> > like all other architectures that support such operations seem to
> > be doing.
> >
> > Fixing ppc removes the last caller of the non-synced TLB flush
> > variants, we can remove some dead code. I'd like to merge patch 1
> > for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
> > someone prefers to remove the dead code asap).
>
> Hmm, turns out to not be so simple, this in parts reverts
> the fix in commit 4ddc104689b. Do other architectures
> that use the _synced TLB flush variants have that same problem
> with the TLB flush not actually flushing until the TB ends,
> I wonder?

Huh, I can reproduce that original problem with a little test
case (which I will upstream into kvm-unit-tests).

async_run_on_cpu(this_cpu) seems to flush before the next TB, but
async_safe_run_on_cpu(this_cpu) does not? How does it execute it
without exiting from the TB?

In any case, patch 1 to make it _synced, plus the following,
seems to close both races.

Thanks,
Nick

---

diff --git a/target/ppc/translate.c b/target/ppc/translate.c
index 93ffec787c..c44e0ce687 100644
--- a/target/ppc/translate.c
+++ b/target/ppc/translate.c
@@ -3495,6 +3495,7 @@ static inline void gen_check_tlb_flush(DisasContext *ctx, bool global)
         gen_helper_check_tlb_flush_local(tcg_env);
     }
     gen_set_label(l);
+    ctx->base.is_jmp = DISAS_EXIT_UPDATE;
 }
 #else
 static inline void gen_check_tlb_flush(DisasContext *ctx, bool global) { }
diff --git a/target/ppc/translate/storage-ctrl-impl.c.inc b/target/ppc/translate/storage-ctrl-impl.c.inc
index 74c23a4191..673e754404 100644
--- a/target/ppc/translate/storage-ctrl-impl.c.inc
+++ b/target/ppc/translate/storage-ctrl-impl.c.inc
@@ -224,6 +224,9 @@ static bool do_tlbie(DisasContext *ctx, arg_X_tlbie *a, bool local)
                                  a->prs << TLBIE_F_PRS_SHIFT |
                                  a->r << TLBIE_F_R_SHIFT |
                                  local << TLBIE_F_LOCAL_SHIFT));
+        if (!local) {
+            ctx->base.is_jmp = DISAS_EXIT_UPDATE;
+        }
         return true;
 #endif
Re: [PATCH 0/3] target/ppc: fix tlb flushing race
Posted by Nicholas Piggin 1 month ago
On Thu Mar 28, 2024 at 8:15 PM AEST, Nicholas Piggin wrote:
> On Thu Mar 28, 2024 at 6:12 PM AEST, Nicholas Piggin wrote:
> > On Thu Mar 28, 2024 at 3:31 PM AEST, Nicholas Piggin wrote:
> > > ppc broadcast tlb flushes should be synchronised with other vCPUs,
> > > like all other architectures that support such operations seem to
> > > be doing.
> > >
> > > Fixing ppc removes the last caller of the non-synced TLB flush
> > > variants, we can remove some dead code. I'd like to merge patch 1
> > > for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
> > > someone prefers to remove the dead code asap).
> >
> > Hmm, turns out to not be so simple, this in parts reverts
> > the fix in commit 4ddc104689b. Do other architectures
> > that use the _synced TLB flush variants have that same problem
> > with the TLB flush not actually flushing until the TB ends,
> > I wonder?
>
> Huh, I can reproduce that original problem with a little test
> case (which I will upstream into kvm-unit-tests).
>
> async_run_on_cpu(this_cpu) seems to flush before the next TB, but
> async_safe_run_on_cpu(this_cpu) does not? How does it execute it
> without exiting from the TB?

Duh, it's because the non-_synced tlb flush variants don't use
that for running on this CPU, they just call it directly.

Okay that all makes sense now. I think this series plus the
below are good then. Also it's possible some other archs that
use _all_cpus_synced() (arm, riscv, s390x) _may_ be racy. I
had a quick look at sfence.vma and ipte, and AFAIKS they're
supposed to take immediate effect after they execute.

Thanks,
Nick
Re: [PATCH 0/3] target/ppc: fix tlb flushing race
Posted by Philippe Mathieu-Daudé 1 month ago
On 28/3/24 11:37, Nicholas Piggin wrote:
> On Thu Mar 28, 2024 at 8:15 PM AEST, Nicholas Piggin wrote:
>> On Thu Mar 28, 2024 at 6:12 PM AEST, Nicholas Piggin wrote:
>>> On Thu Mar 28, 2024 at 3:31 PM AEST, Nicholas Piggin wrote:
>>>> ppc broadcast tlb flushes should be synchronised with other vCPUs,
>>>> like all other architectures that support such operations seem to
>>>> be doing.
>>>>
>>>> Fixing ppc removes the last caller of the non-synced TLB flush
>>>> variants, we can remove some dead code. I'd like to merge patch 1
>>>> for 9.0, and hold patches 2 and 3 until 9.1 to avoid churn (unless
>>>> someone prefers to remove the dead code asap).
>>>
>>> Hmm, turns out to not be so simple, this in parts reverts
>>> the fix in commit 4ddc104689b.

Please mention that in the patch.

> Do other architectures
>>> that use the _synced TLB flush variants have that same problem
>>> with the TLB flush not actually flushing until the TB ends,
>>> I wonder?
>>
>> Huh, I can reproduce that original problem with a little test
>> case (which I will upstream into kvm-unit-tests).
>>
>> async_run_on_cpu(this_cpu) seems to flush before the next TB, but
>> async_safe_run_on_cpu(this_cpu) does not? How does it execute it
>> without exiting from the TB?
> 
> Duh, it's because the non-_synced tlb flush variants don't use
> that for running on this CPU, they just call it directly.
> 
> Okay that all makes sense now. I think this series plus the
> below are good then. Also it's possible some other archs that
> use _all_cpus_synced() (arm, riscv, s390x) _may_ be racy. I
> had a quick look at sfence.vma and ipte, and AFAIKS they're
> supposed to take immediate effect after they execute.
> 
> Thanks,
> Nick
>