RISC-V spec explicitly calls out that a local fence.i is not enough for
the code modification to be visble from a remote hart. In fact, it
states:
To make a store to instruction memory visible to all RISC-V harts, the
writing hart also has to execute a data FENCE before requesting that all
remote RISC-V harts execute a FENCE.I.
Thus, add a fence here to order data writes before making the IPI.
Signed-off-by: Andy Chiu <andybnac@gmail.com>
---
arch/riscv/mm/cacheflush.c | 15 ++++++++++++++-
1 file changed, 14 insertions(+), 1 deletion(-)
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index b81672729887..b2e4b81763f8 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -24,7 +24,20 @@ void flush_icache_all(void)
if (num_online_cpus() < 2)
return;
- else if (riscv_use_sbi_for_rfence())
+
+ /*
+ * Make sure all previous writes to the D$ are ordered before making
+ * the IPI. The RISC-V spec states that a hart must execute a data fence
+ * before triggering a remote fence.i in order to make the modification
+ * visable for remote harts.
+ *
+ * IPIs on RISC-V are triggered by MMIO writes to either CLINT or
+ * S-IMSIC, so the fence ensures previous data writes "happen before"
+ * the MMIO.
+ */
+ RISCV_FENCE(w, o);
+
+ if (riscv_use_sbi_for_rfence())
sbi_remote_fence_i(NULL);
else
on_each_cpu(ipi_remote_fence_i, NULL, 1);
--
2.39.3 (Apple Git-145)
Andy Chiu <andybnac@gmail.com> writes: > RISC-V spec explicitly calls out that a local fence.i is not enough for > the code modification to be visble from a remote hart. In fact, it > states: > > To make a store to instruction memory visible to all RISC-V harts, the > writing hart also has to execute a data FENCE before requesting that all > remote RISC-V harts execute a FENCE.I. > > Thus, add a fence here to order data writes before making the IPI. > > Signed-off-by: Andy Chiu <andybnac@gmail.com> > --- > arch/riscv/mm/cacheflush.c | 15 ++++++++++++++- > 1 file changed, 14 insertions(+), 1 deletion(-) > > diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c > index b81672729887..b2e4b81763f8 100644 > --- a/arch/riscv/mm/cacheflush.c > +++ b/arch/riscv/mm/cacheflush.c > @@ -24,7 +24,20 @@ void flush_icache_all(void) > > if (num_online_cpus() < 2) > return; > - else if (riscv_use_sbi_for_rfence()) > + > + /* > + * Make sure all previous writes to the D$ are ordered before making > + * the IPI. The RISC-V spec states that a hart must execute a data fence > + * before triggering a remote fence.i in order to make the modification > + * visable for remote harts. > + * > + * IPIs on RISC-V are triggered by MMIO writes to either CLINT or > + * S-IMSIC, so the fence ensures previous data writes "happen before" > + * the MMIO. > + */ > + RISCV_FENCE(w, o); (I love the submit/review latency here! ;-)) FWIW, the for S-IMSIC the write is already writel(), so we'll have the text patching and IPI ordered. Regardless, there's more than one flavor of IPI on RISC-V! Reviewed-by: Björn Töpel <bjorn@rivosinc.com> > + if (riscv_use_sbi_for_rfence()) > sbi_remote_fence_i(NULL); > else > on_each_cpu(ipi_remote_fence_i, NULL, 1); > -- > 2.39.3 (Apple Git-145)
> FWIW, the for S-IMSIC the write is already writel(), so we'll have the > text patching and IPI ordered. Regardless, there's more than one flavor > of IPI on RISC-V! AFAIU, this writel() is intended to order the insertion (and the initialization) of the CSD object before the MMIO writes; so, the "right fix" seems to turn the "other flavors" into using a writel() or providing a similar ordering guarantee. As a bonus, such change should address/fix all current and future occurrences of the message-passing scenario in question (the patch addressed the occurrence in flush_icache_all(), but there appears to be a similar one in flush_icache_mm()). Or am I misunderstanding your previous comment? Andrea
Andrea Parri <parri.andrea@gmail.com> writes: >> FWIW, the for S-IMSIC the write is already writel(), so we'll have the >> text patching and IPI ordered. Regardless, there's more than one flavor >> of IPI on RISC-V! > > AFAIU, this writel() is intended to order the insertion (and the initialization) > of the CSD object before the MMIO writes; so, the "right fix" seems to turn the > "other flavors" into using a writel() or providing a similar ordering guarantee. Yes, that's probably the right approach, or maybe follow-up! > As a bonus, such change should address/fix all current and future occurrences of > the message-passing scenario in question (the patch addressed the occurrence in > flush_icache_all(), but there appears to be a similar one in flush_icache_mm()). Indeed. I wonder if the SBI remote fence.i needs it? Cheers, Björn
On Tue, Mar 11, 2025 at 03:53:36PM +0100, Björn Töpel wrote: > Andrea Parri <parri.andrea@gmail.com> writes: > > >> FWIW, the for S-IMSIC the write is already writel(), so we'll have the > >> text patching and IPI ordered. Regardless, there's more than one flavor > >> of IPI on RISC-V! > > > > AFAIU, this writel() is intended to order the insertion (and the initialization) > > of the CSD object before the MMIO writes; so, the "right fix" seems to turn the > > "other flavors" into using a writel() or providing a similar ordering guarantee. > > Yes, that's probably the right approach, or maybe follow-up! > > > As a bonus, such change should address/fix all current and future occurrences of > > the message-passing scenario in question (the patch addressed the occurrence in > > flush_icache_all(), but there appears to be a similar one in flush_icache_mm()). > > Indeed. I wonder if the SBI remote fence.i needs it? Ah! So I am not alone: AFAICT, that remains a matter of discussions with your SEE team/developers. :-| Andrea
Andrea Parri <parri.andrea@gmail.com> 於 2025年3月12日 週三 上午2:11寫道: > > On Tue, Mar 11, 2025 at 03:53:36PM +0100, Björn Töpel wrote: > > Andrea Parri <parri.andrea@gmail.com> writes: > > > > >> FWIW, the for S-IMSIC the write is already writel(), so we'll have the > > >> text patching and IPI ordered. Regardless, there's more than one flavor > > >> of IPI on RISC-V! > > > > > > AFAIU, this writel() is intended to order the insertion (and the initialization) > > > of the CSD object before the MMIO writes; so, the "right fix" seems to turn the > > > "other flavors" into using a writel() or providing a similar ordering guarantee. I found that Apple's aic irqchip uses writel_relaxed for sending IPIs. I am not sure if it is a practice using relaxed mmio in the driver to deal with IPIs. I am more convinced that driver should use the relaxed version if there is no data/io dependency for the driver itself. But it is true that a fence in the driver makes programming easier. > > > > Yes, that's probably the right approach, or maybe follow-up! > > > > > As a bonus, such change should address/fix all current and future occurrences of > > > the message-passing scenario in question (the patch addressed the occurrence in > > > flush_icache_all(), but there appears to be a similar one in flush_icache_mm()). > > > > Indeed. I wonder if the SBI remote fence.i needs it? > > Ah! So I am not alone: AFAICT, that remains a matter of discussions with your SEE > team/developers. :-| As far as OpenSBI is concerned, there is a wmb(), which translated to fence ow, ow, in the generic code path. Regardless, there may be more than one flavor of SBIs, should we also consider that? Thanks, Andy
> I found that Apple's aic irqchip uses writel_relaxed for sending IPIs.
> I am not sure if it is a practice using relaxed mmio in the driver to
> deal with IPIs. I am more convinced that driver should use the relaxed
> version if there is no data/io dependency for the driver itself. But
> it is true that a fence in the driver makes programming easier.
I emphatize with this viewpoint.
Perhaps a first counterargument/remark is that lifting those fences (e.g.,
irq-gic-v3) out of the various drivers into core/more generic code would
mean having some generic primitives to "order" the stores vs send_ipis;
however, we (kernel developers) don't have such an API today. Perhaps
unsurprisingly, considered that (as already recalled in this thread) even
on a same architecture send_ipis can mean things/operations as different
as "do an MMIO write", "write a system register", "execute an environment
call instruction" and what more; as a consequence, such matters tend to
become quite tricky even within a given/single driver (e.g., 80e4e1f472889
("irqchip/gic-v3: Use dsb(ishst) to order writes with ICC_SGI1R_EL1
accesses"), more so at "Linux level".
> As far as OpenSBI is concerned, there is a wmb(), which translated to
> fence ow, ow, in the generic code path. Regardless, there may be more
> than one flavor of SBIs, should we also consider that?
For the sake of argument, how would you proceed to do that?
Let me put it this way. If the answer to your question is "no, we should
not", then you have just showed that the fence w, o added by the patch is
redundant if riscv_use_sbi_for_rfence(). If the answer is "yes", then I
think the patch could use some words to describe why the newly added fence
suffices to order the explicit writes before the ecall at stake _for each_
of the relevant implementations. IIRC, the ISA wordings for ecall (and
fences) do not appear to provide much help to that end. Hence, to iterate,
I really don't see other ways other than digging into the implementations
and asking the developers or the experts of interest to achieve that.
After all, what I'm saying seems to align with what you've done for (some
version of) OpenSBI.
Andrea
[1] https://lore.kernel.org/all/6432e7e97b828d887da8794c150161c4@kernel.org/T/#mc90f2a2eb423ce1ba579fc4f566ad49a16825041
© 2016 - 2025 Red Hat, Inc.