[PATCH RESEND] irqchip: riscv: Order normal writes and IPI writes

Xu Lu posted 1 patch 2 days, 22 hours ago
Posted by Xu Lu 2 days, 22 hours ago
RISC-V distinguishes between normal memory accesses and device I/O and
uses FENCE instruction to order them as viewed by other RISC-V harts and
external devices or coprocessors. The FENCE instruction can order any
combination of device input(I), device output(O), memory reads(R) and
memory writes(W). For example, 'fence w, o' can be used to ensure all
memory writes from instructions preceding the FENCE instruction appear
earlier in the global memory order than device output writes from
instructions after the FENCE instruction.

RISC-V issues an IPI by writing a certain value to the IMSIC/ACLINT MMIO
registers, which is regarded as a device output operation. However, the
existing IMSIC/ACLINT drivers issue the IPI via writel_relaxed(), which
does not guarantee ordering between the device output operation and
preceding memory writes. As a result, the hart receiving the IPI may
not have seen the latest data yet.

This commit fixes this by replacing writel_relaxed() with writel() when
issuing IPI, which will use 'fence w, o' to ensure all previous writes
made by current hart are visible to other harts before they receive
the IPI.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
---
 drivers/irqchip/irq-riscv-imsic-early.c      | 2 +-
 drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
index c5c2e6929a2f..275df5005705 100644
--- a/drivers/irqchip/irq-riscv-imsic-early.c
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -27,7 +27,7 @@ static void imsic_ipi_send(unsigned int cpu)
 {
 	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
 
-	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
+	writel(IMSIC_IPI_ID, local->msi_va);
 }
 
 static void imsic_ipi_starting_cpu(void)
diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
index b0e366ade427..8ff6e7a1363b 100644
--- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
+++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
@@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
 
 static void thead_aclint_sswi_ipi_send(unsigned int cpu)
 {
-	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
+	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
 }
 
 static void thead_aclint_sswi_ipi_clear(void)
-- 
2.20.1
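
For readers unfamiliar with the two accessors, here is a minimal
user-space sketch of the difference the patch relies on. It is not the
kernel's implementation: mock_io_bw(), mock_writel_relaxed() and
mock_writel() are hypothetical names, the "register" is an ordinary
variable, and on non-RISC-V builds a C11 release fence is assumed as a
rough stand-in for 'fence w, o'.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the barrier issued before an MMIO write. */
static inline void mock_io_bw(void)
{
#if defined(__riscv)
	/* Order earlier memory writes before later device output. */
	__asm__ __volatile__("fence w, o" : : : "memory");
#else
	/* Assumption: a release fence approximates this on other targets. */
	atomic_thread_fence(memory_order_release);
#endif
}

/* Plain store: no ordering against earlier memory writes. */
static inline void mock_writel_relaxed(uint32_t val, volatile uint32_t *addr)
{
	assert(addr != NULL);
	*addr = val;
}

/* Ordered variant: barrier first, then the same store. */
static inline void mock_writel(uint32_t val, volatile uint32_t *addr)
{
	mock_io_bw();
	mock_writel_relaxed(val, addr);
}
```

Both helpers store the same value; the only difference is the barrier
in front of the store, which is what the one-line change in each driver
adds to the IPI send path.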
Re: [PATCH RESEND] irqchip: riscv: Order normal writes and IPI writes
Posted by Arnd Bergmann 2 days, 16 hours ago
On Mon, Jan 27, 2025, at 10:38, Xu Lu wrote:
> RISC-V distinguishes between normal memory accesses and device I/O and
> uses FENCE instruction to order them as viewed by other RISC-V harts and
> external devices or coprocessors. The FENCE instruction can order any
> combination of device input(I), device output(O), memory reads(R) and
> memory writes(W). For example, 'fence w, o' can be used to ensure all
> memory writes from instructions preceding the FENCE instruction appear
> earlier in the global memory order than device output writes from
> instructions after the FENCE instruction.

There is nothing risc-v specific in here really, it's just a bug
in the driver: writel() means access the mmio register with appropriate
barriers, while writel_relaxed() is a special case that should
only ever be used if a particular function is sensitive to
performance and never needs to be serialized.

> diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c 
> b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> index b0e366ade427..8ff6e7a1363b 100644
> --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
> 
>  static void thead_aclint_sswi_ipi_send(unsigned int cpu)
>  {
> -	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
> +	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
>  }
> 
>  static void thead_aclint_sswi_ipi_clear(void)
> -- 

thead_aclint_sswi_ipi_clear() seems to have the same bug,
it also uses the _relaxed() version for no apparent reason.

     Arnd
Re: [External] Re: [PATCH RESEND] irqchip: riscv: Order normal writes and IPI writes
Posted by Xu Lu 2 days, 15 hours ago
On Tue, Jan 28, 2025 at 12:23 AM Arnd Bergmann <arnd@arndb.de> wrote:
>
> On Mon, Jan 27, 2025, at 10:38, Xu Lu wrote:
> > RISC-V distinguishes between normal memory accesses and device I/O and
> > uses FENCE instruction to order them as viewed by other RISC-V harts and
> > external devices or coprocessors. The FENCE instruction can order any
> > combination of device input(I), device output(O), memory reads(R) and
> > memory writes(W). For example, 'fence w, o' can be used to ensure all
> > memory writes from instructions preceding the FENCE instruction appear
> > earlier in the global memory order than device output writes from
> > instructions after the FENCE instruction.
>
> There is nothing risc-v specific in here really, it's just a bug
> in the driver: writel() means access the mmio register with appropriate
> barriers, while writel_relaxed() is a special case that should
> only ever be used if a particular function is sensitive to
> performance and never needs to be serialized.
>
> > diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > index b0e366ade427..8ff6e7a1363b 100644
> > --- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > +++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
> > @@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
> >
> >  static void thead_aclint_sswi_ipi_send(unsigned int cpu)
> >  {
> > -     writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
> > +     writel(0x1, per_cpu(sswi_cpu_regs, cpu));
> >  }
> >
> >  static void thead_aclint_sswi_ipi_clear(void)
> > --
>
> thead_aclint_sswi_ipi_clear() seems to have the same bug,
> it also uses the _relaxed() version for no apparent reason.

Hi Arnd,

There seems to be no need to modify thead_aclint_sswi_ipi_clear(), as it
only clears the pending IPI on the current hart. No other hart needs to
observe a strict order between the preceding memory writes and this
ACLINT MMIO write. Please correct me if I missed anything.

Thanks,

Xu Lu

>
>      Arnd
Re: [PATCH RESEND] irqchip: riscv: Order normal writes and IPI writes
Posted by Thomas Gleixner 2 days, 22 hours ago
On Mon, Jan 27 2025 at 17:38, Xu Lu wrote:

This is not a RESEND. The change log has been modified, no?

The prefix is incorrect. See

  https://www.kernel.org/doc/html/latest/process/maintainer-tip.html

> RISC-V distinguishes between normal memory accesses and device I/O and

What is a normal memory write? Are there abnormal memory writes too?

> uses FENCE instruction to order them as viewed by other RISC-V harts and
> external devices or coprocessors. The FENCE instruction can order any
> combination of device input(I), device output(O), memory reads(R) and
> memory writes(W). For example, 'fence w, o' can be used to ensure all

Can be? It _is_ used, no?

> memory writes from instructions preceding the FENCE instruction appear
> earlier in the global memory order than device output writes from
> instructions after the FENCE instruction.
>
> RISC-V issues IPI by writing certain value to IMSIC/ACLINT MMIO
> registers, which is regarded as device output operation. However, the
> existing implementation of IMSIC/ACLINT driver issues IPI via
> writel_relaxed(), which does not guarantee the order of device output
> operation and preceding memory writes. Then the hart receiving IPI may
> not have seen the latest data yet.
>
> This commit fixes this by replacing writel_relaxed() with writel()
> when

'This commit' is equally wrong as 'This patch'. See Documentation/process/

> issuing IPI, which will use 'fence w, o' to ensure all previous writes
> made by current hart are visible to other harts before they receive
> the IPI.

I've fixed it up for you.

Thanks,

        tglx
Re: [External] Re: [PATCH RESEND] irqchip: riscv: Order normal writes and IPI writes
Posted by Xu Lu 2 days, 18 hours ago
On Mon, Jan 27, 2025 at 6:33 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Mon, Jan 27 2025 at 17:38, Xu Lu wrote:
>
> This is not a RESEND. The change log has been modified, no?

Yes, the change log has been modified. I will pay attention to that next time.

>
> The prefix is incorrect. See
>
>   https://www.kernel.org/doc/html/latest/process/maintainer-tip.html
>
> > RISC-V distinguishes between normal memory accesses and device I/O and
>
> What is a normal memory write? Are there abnormal memory writes too?

Sorry for the misleading wording. By "normal memory write" I mean an
ordinary memory write, as distinct from an MMIO write.

>
> > uses FENCE instruction to order them as viewed by other RISC-V harts and
> > external devices or coprocessors. The FENCE instruction can order any
> > combination of device input(I), device output(O), memory reads(R) and
> > memory writes(W). For example, 'fence w, o' can be used to ensure all
>
> Can be? It _is_ used, no?

Yes, it _is_ used. 'Can be' is not accurate.

>
> > memory writes from instructions preceding the FENCE instruction appear
> > earlier in the global memory order than device output writes from
> > instructions after the FENCE instruction.
> >
> > RISC-V issues IPI by writing certain value to IMSIC/ACLINT MMIO
> > registers, which is regarded as device output operation. However, the
> > existing implementation of IMSIC/ACLINT driver issues IPI via
> > writel_relaxed(), which does not guarantee the order of device output
> > operation and preceding memory writes. Then the hart receiving IPI may
> > not have seen the latest data yet.
> >
> > This commit fixes this by replacing writel_relaxed() with writel()
> > when
>
> 'This commit' is equally wrong as 'This patch'. See Documentation/process/

Thanks very much. I will check the documents.

Best Regards,

Xu Lu

>
> > issuing IPI, which will use 'fence w, o' to ensure all previous writes
> > made by current hart are visible to other harts before they receive
> > the IPI.
>
> I've fixed it up for you.
>
> Thanks,
>
>         tglx
[tip: irq/urgent] irqchip/riscv: Ensure ordering of memory writes and IPI writes
Posted by tip-bot2 for Xu Lu 2 days, 22 hours ago
The following commit has been merged into the irq/urgent branch of tip:

Commit-ID:     825c78e6a60c309a59d18d5ac5968aa79cef0bd6
Gitweb:        https://git.kernel.org/tip/825c78e6a60c309a59d18d5ac5968aa79cef0bd6
Author:        Xu Lu <luxu.kernel@bytedance.com>
AuthorDate:    Mon, 27 Jan 2025 17:38:46 +08:00
Committer:     Thomas Gleixner <tglx@linutronix.de>
CommitterDate: Mon, 27 Jan 2025 11:07:03 +01:00

irqchip/riscv: Ensure ordering of memory writes and IPI writes

RISC-V distinguishes between memory accesses and device I/O and uses FENCE
instruction to order them as viewed by other RISC-V harts and external
devices or coprocessors. The FENCE instruction can order any combination of
device input(I), device output(O), memory reads(R) and memory
writes(W). For example, 'fence w, o' is used to ensure all memory writes
from instructions preceding the FENCE instruction appear earlier in the
global memory order than device output writes from instructions after the
FENCE instruction.

RISC-V issues IPIs by writing to the IMSIC/ACLINT MMIO registers, which is
regarded as device output operation. However, the existing implementation
of the IMSIC/ACLINT drivers issue the IPI via writel_relaxed(), which does
not guarantee the order of device output operation and preceding memory
writes. As a consequence the hart receiving the IPI might not observe the
IPI related data.

Fix this by replacing writel_relaxed() with writel() when issuing IPIs,
which uses 'fence w, o' to ensure all previous writes made by the current
hart are visible to other harts before they receive the IPI.

Signed-off-by: Xu Lu <luxu.kernel@bytedance.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/all/20250127093846.98625-1-luxu.kernel@bytedance.com
---
 drivers/irqchip/irq-riscv-imsic-early.c      | 2 +-
 drivers/irqchip/irq-thead-c900-aclint-sswi.c | 2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/irqchip/irq-riscv-imsic-early.c b/drivers/irqchip/irq-riscv-imsic-early.c
index c5c2e69..275df50 100644
--- a/drivers/irqchip/irq-riscv-imsic-early.c
+++ b/drivers/irqchip/irq-riscv-imsic-early.c
@@ -27,7 +27,7 @@ static void imsic_ipi_send(unsigned int cpu)
 {
 	struct imsic_local_config *local = per_cpu_ptr(imsic->global.local, cpu);
 
-	writel_relaxed(IMSIC_IPI_ID, local->msi_va);
+	writel(IMSIC_IPI_ID, local->msi_va);
 }
 
 static void imsic_ipi_starting_cpu(void)
diff --git a/drivers/irqchip/irq-thead-c900-aclint-sswi.c b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
index b0e366a..8ff6e7a 100644
--- a/drivers/irqchip/irq-thead-c900-aclint-sswi.c
+++ b/drivers/irqchip/irq-thead-c900-aclint-sswi.c
@@ -31,7 +31,7 @@ static DEFINE_PER_CPU(void __iomem *, sswi_cpu_regs);
 
 static void thead_aclint_sswi_ipi_send(unsigned int cpu)
 {
-	writel_relaxed(0x1, per_cpu(sswi_cpu_regs, cpu));
+	writel(0x1, per_cpu(sswi_cpu_regs, cpu));
 }
 
 static void thead_aclint_sswi_ipi_clear(void)
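
The ordering requirement described in the commit message can be
illustrated with a small user-space analogy. This is only a sketch
under stated assumptions: struct mailbox, mailbox_send() and
mailbox_receive() are hypothetical names, an atomic flag plays the role
of the IPI register, and C11 release/acquire ordering stands in for the
barrier that writel() provides. It does not model real MMIO or
interrupt delivery.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct mailbox {
	uint32_t data;          /* ordinary memory written by the sender */
	atomic_uint doorbell;   /* stands in for the IPI register */
};

/* Sender: write the data, then ring the doorbell with release ordering. */
static void mailbox_send(struct mailbox *mb, uint32_t value)
{
	assert(mb != NULL);
	mb->data = value;
	/* A relaxed store here would permit the data write to appear later. */
	atomic_store_explicit(&mb->doorbell, 1, memory_order_release);
}

/* Receiver: returns true and fills *out only if the doorbell was rung. */
static bool mailbox_receive(struct mailbox *mb, uint32_t *out)
{
	assert(mb != NULL && out != NULL);
	if (atomic_load_explicit(&mb->doorbell, memory_order_acquire) == 0)
		return false;
	*out = mb->data;
	/* In this sketch the clear publishes nothing, so it stays relaxed. */
	atomic_store_explicit(&mb->doorbell, 0, memory_order_relaxed);
	return true;
}
```

If the store in mailbox_send() were relaxed, a receiver running on
another thread could observe the doorbell without the data, which is
the user-space counterpart of a hart taking the IPI before the related
memory writes are visible.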