While studying riscv's cmpxchg.h file, I got really interested in
understanding how RISCV asm implemented the different versions of
{cmp,}xchg.
When I understood the pattern, it made sense for me to remove the
duplications and create macros to make it easier to understand what exactly
changes between the versions: instruction suffixes & barriers.
Also, did the same kind of work on atomic.h.
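To illustrate the deduplication idea in isolation, here is a minimal, portable sketch. It is not the code from the patches: the kernel macros take asm instruction suffixes and fence strings as parameters, while this sketch varies a GCC/Clang __atomic memory order instead. The names DEFINE_XCHG32 and xchg32_* are hypothetical.

```c
#include <stdint.h>

/*
 * Hypothetical generator macro: one function body, instantiated once per
 * ordering variant. Only the parameter that actually differs between the
 * variants (here the memory order) is passed in.
 * Assumes the GCC/Clang __atomic builtins.
 */
#define DEFINE_XCHG32(name, order)                                  \
static inline uint32_t name(uint32_t *ptr, uint32_t new_val)        \
{                                                                   \
	/* Atomically store new_val and return the previous value. */   \
	return __atomic_exchange_n(ptr, new_val, order);                \
}

DEFINE_XCHG32(xchg32_relaxed, __ATOMIC_RELAXED)
DEFINE_XCHG32(xchg32_acquire, __ATOMIC_ACQUIRE)
DEFINE_XCHG32(xchg32_release, __ATOMIC_RELEASE)
DEFINE_XCHG32(xchg32_full,    __ATOMIC_SEQ_CST)
```

With this shape, the four variants differ only in their last macro argument, which is the property the series aims for with suffixes and barriers.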
After that, I noted both cmpxchg and xchg only accept variables of
size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
Now that deduplication is done, it is quite straightforward to implement them
for variable sizes 1 and 2, so I did it. Guo Ren has already pointed
me to some possible users :)
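For readers unfamiliar with the technique, a 1- or 2-byte cmpxchg can be built on top of a 32-bit atomic by operating on the enclosing 32-bit aligned word and masking out the bytes of interest. The sketch below is only an illustration under stated assumptions: it uses a compare-exchange loop from the GCC/Clang __atomic builtins rather than the lr.w/sc.w sequence used in the patches, it assumes a little-endian machine and a naturally aligned variable, and cmpxchg_small is a hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: compare-and-exchange on a 1- or 2-byte variable
 * using only 32-bit atomics. Returns the value seen before the operation.
 * Assumptions: little-endian byte order, naturally aligned variable,
 * GCC/Clang __atomic builtins. Accessing the enclosing word through a
 * uint32_t pointer relies on relaxed aliasing rules (the kernel is built
 * with -fno-strict-aliasing).
 */
static inline uint32_t cmpxchg_small(void *ptr, uint32_t old_val,
				     uint32_t new_val, unsigned int size)
{
	uintptr_t addr = (uintptr_t)ptr;
	/* Round down to the enclosing 32-bit aligned word. */
	uint32_t *word = (uint32_t *)(addr & ~(uintptr_t)3);
	/* Bit offset of the variable inside that word (little-endian). */
	unsigned int shift = (unsigned int)(addr & 3) * 8;
	uint32_t mask = (size == 1 ? 0xffu : 0xffffu) << shift;
	uint32_t cur = __atomic_load_n(word, __ATOMIC_RELAXED);

	for (;;) {
		uint32_t seen = (cur & mask) >> shift;

		if (seen != (old_val & (mask >> shift)))
			return seen;	/* mismatch: nothing is written */

		/* Keep the neighbouring bytes, replace only the masked ones. */
		uint32_t want = (cur & ~mask) | ((new_val << shift) & mask);

		/* On failure, cur is refreshed with the word's latest value. */
		if (__atomic_compare_exchange_n(word, &cur, want, 0,
						__ATOMIC_SEQ_CST,
						__ATOMIC_RELAXED))
			return seen;
	}
}
```

The retry loop matters because a neighbouring byte in the same word may change between the load and the exchange; in that case the small variable is re-checked against the refreshed word before trying again.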
I did compare the generated asm on a test.c that contained usage for every
changed function, and could not detect any change on patches 1 + 2 + 3
compared with upstream.
Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
booted just fine with qemu -machine virt -append "qspinlock".
(tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11)
Thanks!
Leo
Changes since squashed cmpxchg RFCv4:
- Added (__typeof__(*(p))) before returning from {cmp,}xchg, as done
in current upstream, (possibly) fixing the bug from kernel test robot
https://lore.kernel.org/all/20230809021311.1390578-2-leobras@redhat.com/
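As a rough illustration of what such a cast does (this is not the patch code, and it makes no claim about the exact failure the test robot reported), the sketch below wraps a backend that always returns a 64-bit integer and casts the result back to the pointee type. The names xchg_backend and my_xchg are hypothetical, and the GCC/Clang __atomic builtins are assumed.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical size-dispatching backend that always returns a uint64_t. */
static inline uint64_t xchg_backend(void *ptr, uint64_t new_val, size_t size)
{
	switch (size) {
	case 4:
		return __atomic_exchange_n((uint32_t *)ptr, (uint32_t)new_val,
					   __ATOMIC_SEQ_CST);
	case 8:
		return __atomic_exchange_n((uint64_t *)ptr, new_val,
					   __ATOMIC_SEQ_CST);
	default:
		__builtin_trap();	/* unsupported size in this sketch */
	}
}

/*
 * The (__typeof__(*(p))) cast gives the macro's result the same type as
 * the variable being exchanged, instead of leaking the backend's uint64_t.
 */
#define my_xchg(p, v) \
	((__typeof__(*(p)))xchg_backend((p), (uint64_t)(v), sizeof(*(p))))
```

Without the cast, a caller exchanging an int32_t that holds -5 would get back the 64-bit value 0xfffffffb, which does not compare equal to -5; with the cast the result is an int32_t again.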
Changes since squashed cmpxchg RFCv3:
- Fixed bug on cmpxchg macro for var size 1 & 2: now working
- Macros for var size 1 & 2's lr.w and sc.w now are guaranteed to receive
input of a 32-bit aligned address
- Renamed internal macros from _mask to _masked for patches 4 & 5
- __rc variable on macros for var size 1 & 2 changed from register to ulong
https://lore.kernel.org/all/20230804084900.1135660-2-leobras@redhat.com/
Changes since squashed cmpxchg RFCv2:
- Removed rc parameter from the new macro: it can be internal to the macro
- 2 new patches: cmpxchg size 1 and 2, xchg size 1 and 2
https://lore.kernel.org/all/20230803051401.710236-2-leobras@redhat.com/
Changes since squashed cmpxchg RFCv1:
- Unified with atomic.c patchset
- Rebased on top of torvalds/master (thanks Andrea Parri!)
- Removed helper macros that were not being used elsewhere in the kernel.
https://lore.kernel.org/all/20230419062505.257231-1-leobras@redhat.com/
https://lore.kernel.org/all/20230406082018.70367-1-leobras@redhat.com/
Changes since (cmpxchg) RFCv3:
- Squashed the 6 original patches in 2: one for cmpxchg and one for xchg
https://lore.kernel.org/all/20230404163741.2762165-1-leobras@redhat.com/
Changes since (cmpxchg) RFCv2:
- Fixed macros that depend on having a local variable with a magic name
- Previous cast to (long) is now only applied on 4-bytes cmpxchg
https://lore.kernel.org/all/20230321074249.2221674-1-leobras@redhat.com/
Changes since (cmpxchg) RFCv1:
- Fixed patch 4/6 suffix from 'w.aqrl' to '.w.aqrl', to avoid build error
https://lore.kernel.org/all/20230318080059.1109286-1-leobras@redhat.com/
Leonardo Bras (5):
riscv/cmpxchg: Deduplicate xchg() asm functions
riscv/cmpxchg: Deduplicate cmpxchg() asm and macros
riscv/atomic.h : Deduplicate arch_atomic.*
riscv/cmpxchg: Implement cmpxchg for variables of size 1 and 2
riscv/cmpxchg: Implement xchg for variables of size 1 and 2
arch/riscv/include/asm/atomic.h | 164 ++++++-------
arch/riscv/include/asm/cmpxchg.h | 404 ++++++++++---------------------
2 files changed, 200 insertions(+), 368 deletions(-)
base-commit: cacc6e22932f373a91d7be55a9b992dc77f4c59b
--
2.41.0
On Thu, Aug 10, 2023 at 01:03:42AM -0300, Leonardo Bras wrote:
> While studying riscv's cmpxchg.h file, I got really interested in
> understanding how RISCV asm implemented the different versions of
> {cmp,}xchg.
>
> When I understood the pattern, it made sense for me to remove the
> duplications and create macros to make it easier to understand what exactly
> changes between the versions: instruction suffixes & barriers.
>
> Also, did the same kind of work on atomic.h.
>
> After that, I noted both cmpxchg and xchg only accept variables of
> size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
>
> Now that deduplication is done, it is quite straightforward to implement them
> for variable sizes 1 and 2, so I did it. Guo Ren has already pointed
> me to some possible users :)
>
> I did compare the generated asm on a test.c that contained usage for every
> changed function, and could not detect any change on patches 1 + 2 + 3
> compared with upstream.
>
> Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
> booted just fine with qemu -machine virt -append "qspinlock".
>
> (tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11)
Tested-by: Guo Ren <guoren@kernel.org>
Sorry for the late reply; we have been stress testing CNA qspinlock on
the 128-core sg2042 hardware platform. This series has passed our tests for
several weeks. For more details, see:
https://lore.kernel.org/linux-riscv/20230910082911.3378782-1-guoren@kernel.org/
On Sun, Sep 10, 2023 at 04:50:29AM -0400, Guo Ren wrote:
> On Thu, Aug 10, 2023 at 01:03:42AM -0300, Leonardo Bras wrote:
> > While studying riscv's cmpxchg.h file, I got really interested in
> > understanding how RISCV asm implemented the different versions of
> > {cmp,}xchg.
> >
> > When I understood the pattern, it made sense for me to remove the
> > duplications and create macros to make it easier to understand what exactly
> > changes between the versions: instruction suffixes & barriers.
> >
> > Also, did the same kind of work on atomic.h.
> >
> > After that, I noted both cmpxchg and xchg only accept variables of
> > size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
> >
> > Now that deduplication is done, it is quite straightforward to implement them
> > for variable sizes 1 and 2, so I did it. Guo Ren has already pointed
> > me to some possible users :)
> >
> > I did compare the generated asm on a test.c that contained usage for every
> > changed function, and could not detect any change on patches 1 + 2 + 3
> > compared with upstream.
> >
> > Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
> > booted just fine with qemu -machine virt -append "qspinlock".
> >
> > (tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11)
> Tested-by: Guo Ren <guoren@kernel.org>
>
Hello Guo Ren, thanks for testing!
I will resend this series, and I would like to understand how I should
apply your Tested-by to this patchset:
Is it ok if I add it to each patch of this series?
Thanks!
Leo
On Sat, Dec 23, 2023 at 11:08 AM Leonardo Bras <leobras@redhat.com> wrote:
>
> On Sun, Sep 10, 2023 at 04:50:29AM -0400, Guo Ren wrote:
> > On Thu, Aug 10, 2023 at 01:03:42AM -0300, Leonardo Bras wrote:
> > > While studying riscv's cmpxchg.h file, I got really interested in
> > > understanding how RISCV asm implemented the different versions of
> > > {cmp,}xchg.
> > >
> > > When I understood the pattern, it made sense for me to remove the
> > > duplications and create macros to make it easier to understand what exactly
> > > changes between the versions: instruction suffixes & barriers.
> > >
> > > Also, did the same kind of work on atomic.h.
> > >
> > > After that, I noted both cmpxchg and xchg only accept variables of
> > > size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
> > >
> > > Now that deduplication is done, it is quite straightforward to implement them
> > > for variable sizes 1 and 2, so I did it. Guo Ren has already pointed
> > > me to some possible users :)
> > >
> > > I did compare the generated asm on a test.c that contained usage for every
> > > changed function, and could not detect any change on patches 1 + 2 + 3
> > > compared with upstream.
> > >
> > > Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
> > > booted just fine with qemu -machine virt -append "qspinlock".
> > >
> > > (tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11)
> > Tested-by: Guo Ren <guoren@kernel.org>
> >
>
> Hello Guo Ren, thanks for testing!
>
> I will resend this series, and I would like to understand how I should
> apply your Tested-by to this patchset:
>
> Is it ok if I add it to each patch of this series?
Yes, my qspinlock_v12 is based on your patch series.
https://github.com/guoren83/linux/tree/qspinlock_v12
Some people tell me paravirt-qspinlock can't work with nested
virtualization, but I haven't found a way to set up a test
environment. I'm working on that.
--
Best Regards
Guo Ren
On Sun, Sep 10, 2023 at 5:50 AM Guo Ren <guoren@kernel.org> wrote:
>
> On Thu, Aug 10, 2023 at 01:03:42AM -0300, Leonardo Bras wrote:
> > While studying riscv's cmpxchg.h file, I got really interested in
> > understanding how RISCV asm implemented the different versions of
> > {cmp,}xchg.
> >
> > When I understood the pattern, it made sense for me to remove the
> > duplications and create macros to make it easier to understand what exactly
> > changes between the versions: instruction suffixes & barriers.
> >
> > Also, did the same kind of work on atomic.h.
> >
> > After that, I noted both cmpxchg and xchg only accept variables of
> > size 4 and 8, compared to x86 and arm64 which do 1,2,4,8.
> >
> > Now that deduplication is done, it is quite straightforward to implement them
> > for variable sizes 1 and 2, so I did it. Guo Ren has already pointed
> > me to some possible users :)
> >
> > I did compare the generated asm on a test.c that contained usage for every
> > changed function, and could not detect any change on patches 1 + 2 + 3
> > compared with upstream.
> >
> > Patches 4 & 5 were compile-tested, merged with guoren/qspinlock_v11 and
> > booted just fine with qemu -machine virt -append "qspinlock".
> >
> > (tree: https://gitlab.com/LeoBras/linux/-/commits/guo_qspinlock_v11)
> Tested-by: Guo Ren <guoren@kernel.org>
>
> Sorry for the late reply; we have been stress testing CNA qspinlock on
> the 128-core sg2042 hardware platform. This series has passed our tests for
> several weeks. For more details, see:
> https://lore.kernel.org/linux-riscv/20230910082911.3378782-1-guoren@kernel.org/
>
That's awesome!
Thanks for testing!
Leo