[PATCH v3 0/8] riscv: Add Zalasr ISA extension support

Xu Lu posted 8 patches 1 week, 5 days ago
[PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Xu Lu 1 week, 5 days ago
This series adds support for the Zalasr ISA extension, which provides
native load-acquire and store-release instructions.

The specification can be found here:
https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc
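
For readers less familiar with the primitives, here is a minimal user-space
sketch in C11 (not kernel code) of the message-passing pattern that
smp_store_release()/smp_load_acquire() exist for. The names payload, ready,
publish and try_consume are hypothetical and do not appear in the series. How
a toolchain lowers these operations is an assumption, not something this
sketch demonstrates: without Zalasr they are typically built from plain
accesses plus fences, and with Zalasr they could map to the new instructions.

```c
#include <stdatomic.h>

/* Hypothetical names, for illustration only. */
static int payload;
static atomic_int ready;

/* Producer: write the data, then publish it with a store-release. */
void publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, 1, memory_order_release);
}

/*
 * Consumer: if the load-acquire observes ready == 1, the earlier
 * write to payload is guaranteed to be visible as well.
 * Returns 1 and fills *out on success, 0 if nothing was published yet.
 */
int try_consume(int *out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return 0;
    *out = payload;
    return 1;
}
```

A caller would invoke publish() on one thread and poll try_consume() on
another; the release/acquire pair is what orders payload against ready.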

This patch series has been tested with LTP on QEMU with Brensan's zalasr
support patch[1].

checkpatch.pl reports some false-positive spacing errors on this series, so
I have CCed the checkpatch.pl maintainers as well.

[1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w@mail.gmail.com/

v3:
 - Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
 so as to ensure FENCE.TSO ordering between operations which precede the
 UNLOCK+LOCK sequence and operations which follow the sequence. Thanks
 to Andrea.
 - Support hwprobe of Zalasr.
 - Allow Zalasr extensions for Guest/VM.

v2:
 - Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to
 Conor.

Xu Lu (8):
  riscv: add ISA extension parsing for Zalasr
  dt-bindings: riscv: Add Zalasr ISA extension description
  riscv: hwprobe: Export Zalasr extension
  riscv: Introduce Zalasr instructions
  riscv: Use Zalasr for smp_load_acquire/smp_store_release
  riscv: Apply acquire/release semantics to arch_xchg/arch_cmpxchg
    operations
  RISC-V: KVM: Allow Zalasr extensions for Guest/VM
  KVM: riscv: selftests: Add Zalasr extensions to get-reg-list test

 Documentation/arch/riscv/hwprobe.rst          |   5 +-
 .../devicetree/bindings/riscv/extensions.yaml |   5 +
 arch/riscv/include/asm/atomic.h               |   6 -
 arch/riscv/include/asm/barrier.h              |  91 ++++++++++--
 arch/riscv/include/asm/cmpxchg.h              | 136 ++++++++----------
 arch/riscv/include/asm/hwcap.h                |   1 +
 arch/riscv/include/asm/insn-def.h             |  79 ++++++++++
 arch/riscv/include/uapi/asm/hwprobe.h         |   1 +
 arch/riscv/include/uapi/asm/kvm.h             |   1 +
 arch/riscv/kernel/cpufeature.c                |   1 +
 arch/riscv/kernel/sys_hwprobe.c               |   1 +
 arch/riscv/kvm/vcpu_onereg.c                  |   2 +
 .../selftests/kvm/riscv/get-reg-list.c        |   4 +
 13 files changed, 242 insertions(+), 91 deletions(-)

-- 
2.20.1
Re: [PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Andrea Parri 1 week, 5 days ago
On Fri, Sep 19, 2025 at 03:37:06PM +0800, Xu Lu wrote:
> This patch adds support for the Zalasr ISA extension, which supplies the
> real load acquire/store release instructions.
> 
> The specification can be found here:
> https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc
> 
> This patch series has been tested with ltp on Qemu with Brensan's zalasr
> support patch[1].
> 
> Some false positive spacing error happens during patch checking. Thus I
> CCed maintainers of checkpatch.pl as well.
> 
> [1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w@mail.gmail.com/
> 
> v3:
>  - Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
>  so as to ensure FENCE.TSO ordering between operations which precede the
>  UNLOCK+LOCK sequence and operations which follow the sequence. Thanks
>  to Andrea.
>  - Support hwprobe of Zalasr.
>  - Allow Zalasr extensions for Guest/VM.
> 
> v2:
>  - Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to
>  Conor.
> 
> Xu Lu (8):
>   riscv: add ISA extension parsing for Zalasr
>   dt-bindings: riscv: Add Zalasr ISA extension description
>   riscv: hwprobe: Export Zalasr extension
>   riscv: Introduce Zalasr instructions
>   riscv: Use Zalasr for smp_load_acquire/smp_store_release
>   riscv: Apply acquire/release semantics to arch_xchg/arch_cmpxchg
>     operations
>   RISC-V: KVM: Allow Zalasr extensions for Guest/VM
>   KVM: riscv: selftests: Add Zalasr extensions to get-reg-list test
> 
>  Documentation/arch/riscv/hwprobe.rst          |   5 +-
>  .../devicetree/bindings/riscv/extensions.yaml |   5 +
>  arch/riscv/include/asm/atomic.h               |   6 -
>  arch/riscv/include/asm/barrier.h              |  91 ++++++++++--
>  arch/riscv/include/asm/cmpxchg.h              | 136 ++++++++----------
>  arch/riscv/include/asm/hwcap.h                |   1 +
>  arch/riscv/include/asm/insn-def.h             |  79 ++++++++++
>  arch/riscv/include/uapi/asm/hwprobe.h         |   1 +
>  arch/riscv/include/uapi/asm/kvm.h             |   1 +
>  arch/riscv/kernel/cpufeature.c                |   1 +
>  arch/riscv/kernel/sys_hwprobe.c               |   1 +
>  arch/riscv/kvm/vcpu_onereg.c                  |   2 +
>  .../selftests/kvm/riscv/get-reg-list.c        |   4 +
>  13 files changed, 242 insertions(+), 91 deletions(-)

I wouldn't have rushed this submission while the discussion on v2 seems
so much alive;  IAC, to add and link to that discussion, this version
(not a review, just looking at this diff stat) is changing the fastpath

  read_unlock()
  read_lock()

from something like

  fence rw,w
  amoadd.w
  amoadd.w
  fence r,rw

to

  fence rw,rw
  amoadd.w
  amoadd.w
  fence rw,rw

no matter Zalasr or !Zalasr.  Similarly for other atomic operations with
release or acquire semantics.  I guess the change was not intentional?
If it was intentional, it should be properly mentioned in the changelog.
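
To make the comparison concrete, the two shapes can be modelled in user-space
C11 roughly as follows. This is only a sketch under assumptions: the names
(readers, read_lock_fenced, and so on) are hypothetical, the real rwlock
fastpath does more than a bare add, and the fence mnemonics in the comments
are the usual RISC-V lowerings of the C11 fences rather than output taken
from this series.

```c
#include <stdatomic.h>

/* Hypothetical reader count standing in for the lock word. */
static atomic_int readers;

/* Old shape: release fence before the AMO on unlock ... */
void read_unlock_fenced(void)
{
    atomic_thread_fence(memory_order_release);   /* roughly: fence rw,w */
    atomic_fetch_sub_explicit(&readers, 1, memory_order_relaxed);
}

/* ... and acquire fence after the AMO on lock. */
void read_lock_fenced(void)
{
    atomic_fetch_add_explicit(&readers, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_acquire);   /* roughly: fence r,rw */
}

/* New shape: full fences on both sides of the same AMOs. */
void read_unlock_full(void)
{
    atomic_thread_fence(memory_order_seq_cst);   /* roughly: fence rw,rw */
    atomic_fetch_sub_explicit(&readers, 1, memory_order_relaxed);
}

void read_lock_full(void)
{
    atomic_fetch_add_explicit(&readers, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);   /* roughly: fence rw,rw */
}

/* Helper for inspecting the count in tests. */
int reader_count(void)
{
    return atomic_load_explicit(&readers, memory_order_relaxed);
}
```

Both variants compute the same count; the only difference is how much
ordering each fence imposes on the surrounding accesses.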

  Andrea
Re: [External] Re: [PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Xu Lu 1 week, 5 days ago
Hi Andrea,

On Fri, Sep 19, 2025 at 6:04 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> On Fri, Sep 19, 2025 at 03:37:06PM +0800, Xu Lu wrote:
> > This patch adds support for the Zalasr ISA extension, which supplies the
> > real load acquire/store release instructions.
> >
> > The specification can be found here:
> > https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc
> >
> > This patch series has been tested with ltp on Qemu with Brensan's zalasr
> > support patch[1].
> >
> > Some false positive spacing error happens during patch checking. Thus I
> > CCed maintainers of checkpatch.pl as well.
> >
> > [1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w@mail.gmail.com/
> >
> > v3:
> >  - Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
> >  so as to ensure FENCE.TSO ordering between operations which precede the
> >  UNLOCK+LOCK sequence and operations which follow the sequence. Thanks
> >  to Andrea.
> >  - Support hwprobe of Zalasr.
> >  - Allow Zalasr extensions for Guest/VM.
> >
> > v2:
> >  - Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to
> >  Conor.
> >
> > Xu Lu (8):
> >   riscv: add ISA extension parsing for Zalasr
> >   dt-bindings: riscv: Add Zalasr ISA extension description
> >   riscv: hwprobe: Export Zalasr extension
> >   riscv: Introduce Zalasr instructions
> >   riscv: Use Zalasr for smp_load_acquire/smp_store_release
> >   riscv: Apply acquire/release semantics to arch_xchg/arch_cmpxchg
> >     operations
> >   RISC-V: KVM: Allow Zalasr extensions for Guest/VM
> >   KVM: riscv: selftests: Add Zalasr extensions to get-reg-list test
> >
> >  Documentation/arch/riscv/hwprobe.rst          |   5 +-
> >  .../devicetree/bindings/riscv/extensions.yaml |   5 +
> >  arch/riscv/include/asm/atomic.h               |   6 -
> >  arch/riscv/include/asm/barrier.h              |  91 ++++++++++--
> >  arch/riscv/include/asm/cmpxchg.h              | 136 ++++++++----------
> >  arch/riscv/include/asm/hwcap.h                |   1 +
> >  arch/riscv/include/asm/insn-def.h             |  79 ++++++++++
> >  arch/riscv/include/uapi/asm/hwprobe.h         |   1 +
> >  arch/riscv/include/uapi/asm/kvm.h             |   1 +
> >  arch/riscv/kernel/cpufeature.c                |   1 +
> >  arch/riscv/kernel/sys_hwprobe.c               |   1 +
> >  arch/riscv/kvm/vcpu_onereg.c                  |   2 +
> >  .../selftests/kvm/riscv/get-reg-list.c        |   4 +
> >  13 files changed, 242 insertions(+), 91 deletions(-)
>
> I wouldn't have rushed this submission while the discussion on v2 seems
> so much alive;  IAC, to add and link to that discussion, this version

Thanks. I sent this version to show my solution to the FENCE.TSO
problem you pointed out earlier. I will continue to improve it, and I
look forward to more suggestions from you.

> (not a review, just looking at this diff stat) is changing the fastpath
>
>   read_unlock()
>   read_lock()
>
> from something like
>
>   fence rw,w
>   amoadd.w
>   amoadd.w
>   fence r,rw
>
> to
>
>   fence rw,rw
>   amoadd.w
>   amoadd.w
>   fence rw,rw
>
> no matter Zalasr or !Zalasr.  Similarly for other atomic operations with
> release or acquire semantics.  I guess the change was not intentional?
> If it was intentional, it should be properly mentioned in the changelog.

Sorry about that. The change is intentional. The atomic operation before
__atomic_acquire_fence, or the operation after __atomic_release_fence, can
be a single ld or sd instruction rather than an amocas or amoswap. In
that case, once the store-release operation becomes 'sd.rl', an
__atomic_acquire_fence implemented as 'fence r, rw' can no longer ensure
FENCE.TSO ordering. That is why I replaced it with 'fence rw, rw'.

I will split this into a separate commit and explain it in more detail in
the changelog. Perhaps an alternative mechanism can be used to reduce the
cost.

Best Regards,
Xu Lu

>
>   Andrea
Re: [External] Re: [PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Xu Lu 1 week, 5 days ago
On Fri, Sep 19, 2025 at 6:39 PM Xu Lu <luxu.kernel@bytedance.com> wrote:
>
> Hi Andrea,
>
> On Fri, Sep 19, 2025 at 6:04 PM Andrea Parri <parri.andrea@gmail.com> wrote:
> >
> > On Fri, Sep 19, 2025 at 03:37:06PM +0800, Xu Lu wrote:
> > > This patch adds support for the Zalasr ISA extension, which supplies the
> > > real load acquire/store release instructions.
> > >
> > > The specification can be found here:
> > > https://github.com/riscv/riscv-zalasr/blob/main/chapter2.adoc
> > >
> > > This patch series has been tested with ltp on Qemu with Brensan's zalasr
> > > support patch[1].
> > >
> > > Some false positive spacing error happens during patch checking. Thus I
> > > CCed maintainers of checkpatch.pl as well.
> > >
> > > [1] https://lore.kernel.org/all/CAGPSXwJEdtqW=nx71oufZp64nK6tK=0rytVEcz4F-gfvCOXk2w@mail.gmail.com/
> > >
> > > v3:
> > >  - Apply acquire/release semantics to arch_xchg/arch_cmpxchg operations
> > >  so as to ensure FENCE.TSO ordering between operations which precede the
> > >  UNLOCK+LOCK sequence and operations which follow the sequence. Thanks
> > >  to Andrea.
> > >  - Support hwprobe of Zalasr.
> > >  - Allow Zalasr extensions for Guest/VM.
> > >
> > > v2:
> > >  - Adjust the order of Zalasr and Zalrsc in dt-bindings. Thanks to
> > >  Conor.
> > >
> > > Xu Lu (8):
> > >   riscv: add ISA extension parsing for Zalasr
> > >   dt-bindings: riscv: Add Zalasr ISA extension description
> > >   riscv: hwprobe: Export Zalasr extension
> > >   riscv: Introduce Zalasr instructions
> > >   riscv: Use Zalasr for smp_load_acquire/smp_store_release
> > >   riscv: Apply acquire/release semantics to arch_xchg/arch_cmpxchg
> > >     operations
> > >   RISC-V: KVM: Allow Zalasr extensions for Guest/VM
> > >   KVM: riscv: selftests: Add Zalasr extensions to get-reg-list test
> > >
> > >  Documentation/arch/riscv/hwprobe.rst          |   5 +-
> > >  .../devicetree/bindings/riscv/extensions.yaml |   5 +
> > >  arch/riscv/include/asm/atomic.h               |   6 -
> > >  arch/riscv/include/asm/barrier.h              |  91 ++++++++++--
> > >  arch/riscv/include/asm/cmpxchg.h              | 136 ++++++++----------
> > >  arch/riscv/include/asm/hwcap.h                |   1 +
> > >  arch/riscv/include/asm/insn-def.h             |  79 ++++++++++
> > >  arch/riscv/include/uapi/asm/hwprobe.h         |   1 +
> > >  arch/riscv/include/uapi/asm/kvm.h             |   1 +
> > >  arch/riscv/kernel/cpufeature.c                |   1 +
> > >  arch/riscv/kernel/sys_hwprobe.c               |   1 +
> > >  arch/riscv/kvm/vcpu_onereg.c                  |   2 +
> > >  .../selftests/kvm/riscv/get-reg-list.c        |   4 +
> > >  13 files changed, 242 insertions(+), 91 deletions(-)
> >
> > I wouldn't have rushed this submission while the discussion on v2 seems
> > so much alive;  IAC, to add and link to that discussion, this version
>
> Thanks. This version is sent out to show my solution to the FENCE.TSO
> problem you pointed out before. I will continue to improve it. Look
> forward to more suggestions from you.
>
> > (not a review, just looking at this diff stat) is changing the fastpath
> >
> >   read_unlock()
> >   read_lock()
> >
> > from something like
> >
> >   fence rw,w
> >   amoadd.w
> >   amoadd.w
> >   fence r,rw
> >
> > to
> >
> >   fence rw,rw
> >   amoadd.w
> >   amoadd.w
> >   fence rw,rw
> >
> > no matter Zalasr or !Zalasr.  Similarly for other atomic operations with
> > release or acquire semantics.  I guess the change was not intentional?
> > If it was intentional, it should be properly mentioned in the changelog.
>
> Sorry about that. It is intended. The atomic operation before
> __atomic_acquire_fence or operation after __atomic_release_fence can
> be just a single ld or sd instruction instead of amocas or amoswap. In
> such cases, when the store release operation becomes 'sd.rl', the
> __atomic_acquire_fence via 'fence r, rw' can not ensure FENCE.TSO
> anymore. Thus I replace it with 'fence rw, rw'.

This is also the common implementation on other architectures that use
aq/rl-style instructions, such as ARM. And you certainly already know that~

>
> I will make it a separate commit and provide more messages in the
> changelog. Maybe alternative mechanism can be applied to accelerate
> it.
>
> Best Regards,
> Xu Lu
>
> >
> >   Andrea
Re: [External] Re: [PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Andrea Parri 1 week, 5 days ago
> > > (not a review, just looking at this diff stat) is changing the fastpath
> > >
> > >   read_unlock()
> > >   read_lock()
> > >
> > > from something like
> > >
> > >   fence rw,w
> > >   amoadd.w
> > >   amoadd.w
> > >   fence r,rw
> > >
> > > to
> > >
> > >   fence rw,rw
> > >   amoadd.w
> > >   amoadd.w
> > >   fence rw,rw
> > >
> > > no matter Zalasr or !Zalasr.  Similarly for other atomic operations with
> > > release or acquire semantics.  I guess the change was not intentional?
> > > If it was intentional, it should be properly mentioned in the changelog.
> >
> > Sorry about that. It is intended. The atomic operation before
> > __atomic_acquire_fence or operation after __atomic_release_fence can
> > be just a single ld or sd instruction instead of amocas or amoswap. In
> > such cases, when the store release operation becomes 'sd.rl', the
> > __atomic_acquire_fence via 'fence r, rw' can not ensure FENCE.TSO
> > anymore. Thus I replace it with 'fence rw, rw'.

But you could apply similar changes you performed for xchg & cmpxchg: use
.AQ and .RL for other atomic RMW operations as well, no?  AFAICS, that is
what arm64 actually does in arch/arm64/include/asm/atomic_{ll_sc,lse}.h .
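
As a rough user-space illustration of the idea (C11, not kernel code): the
ordering can be attached to the RMW itself instead of being supplied by a
separate fence. The names counter and fetch_add_* are hypothetical, and the
mnemonics in the comments are what I would expect a RISC-V toolchain to emit,
stated as an assumption rather than a measured result.

```c
#include <stdatomic.h>

/* Hypothetical counter, for illustration only. */
static atomic_int counter;

/* Ordering carried by the RMW itself (expected: amoadd.w.aq). */
int fetch_add_acquire(int n)
{
    return atomic_fetch_add_explicit(&counter, n, memory_order_acquire);
}

/* Ordering carried by the RMW itself (expected: amoadd.w.rl). */
int fetch_add_release(int n)
{
    return atomic_fetch_add_explicit(&counter, n, memory_order_release);
}

/* Fence-based alternative: relaxed RMW followed by an acquire fence. */
int fetch_add_acquire_fenced(int n)
{
    int old = atomic_fetch_add_explicit(&counter, n, memory_order_relaxed);

    atomic_thread_fence(memory_order_acquire);
    return old;
}

/* Helper for inspecting the value in tests. */
int counter_value(void)
{
    return atomic_load_explicit(&counter, memory_order_relaxed);
}
```

All three return the previous value; they differ only in where the ordering
constraint lives, which is the point of the suggestion above.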

  Andrea


> This is also the common implementation on other architectures who use
> aq/rl instructions like ARM. And you certainly already knew it~
Re: [External] Re: [PATCH v3 0/8] riscv: Add Zalasr ISA extension support
Posted by Xu Lu 1 week, 5 days ago
On Fri, Sep 19, 2025 at 7:06 PM Andrea Parri <parri.andrea@gmail.com> wrote:
>
> > > > (not a review, just looking at this diff stat) is changing the fastpath
> > > >
> > > >   read_unlock()
> > > >   read_lock()
> > > >
> > > > from something like
> > > >
> > > >   fence rw,w
> > > >   amoadd.w
> > > >   amoadd.w
> > > >   fence r,rw
> > > >
> > > > to
> > > >
> > > >   fence rw,rw
> > > >   amoadd.w
> > > >   amoadd.w
> > > >   fence rw,rw
> > > >
> > > > no matter Zalasr or !Zalasr.  Similarly for other atomic operations with
> > > > release or acquire semantics.  I guess the change was not intentional?
> > > > If it was intentional, it should be properly mentioned in the changelog.
> > >
> > > Sorry about that. It is intended. The atomic operation before
> > > __atomic_acquire_fence or operation after __atomic_release_fence can
> > > be just a single ld or sd instruction instead of amocas or amoswap. In
> > > such cases, when the store release operation becomes 'sd.rl', the
> > > __atomic_acquire_fence via 'fence r, rw' can not ensure FENCE.TSO
> > > anymore. Thus I replace it with 'fence rw, rw'.
>
> But you could apply similar changes you performed for xchg & cmpxchg: use
> .AQ and .RL for other atomic RMW operations as well, no?  AFAICS, that is
> what arm64 actually does in arch/arm64/include/asm/atomic_{ll_sc,lse}.h .

I see. I will study the arm64 implementation and refine my patch. Thanks a lot.

Best regards,
Xu Lu

>
>   Andrea
>
>
> > This is also the common implementation on other architectures who use
> > aq/rl instructions like ARM. And you certainly already knew it~