[PATCH] riscv, bpf: Make BPF_CMPXCHG fully ordered

Andrea Parri posted 1 patch 1 month, 1 week ago
arch/riscv/net/bpf_jit_comp64.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
[PATCH] riscv, bpf: Make BPF_CMPXCHG fully ordered
Posted by Andrea Parri 1 month, 1 week ago
According to the prototype formal BPF memory consistency model
discussed e.g. in [1] and following the ordering properties of
the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
with the BPF_CMPXCHG modifier is fully ordered.  However, the
current RISC-V JIT lowerings fail to meet such memory ordering
property.  This is illustrated by the following litmus test:

BPF BPF__MP+success_cmpxchg+fence
{
 0:r1=x; 0:r3=y; 0:r5=1;
 1:r2=y; 1:r4=f; 1:r7=x;
}
 P0                               | P1                                         ;
 *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
 r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
                                  | r6 = *(u64 *)(r7 + 0)                      ;
exists (1:r1=1 /\ 1:r6=0)

whose "exists" clause is not satisfiable according to the BPF
memory model.  Using the current RISC-V JIT lowerings, the test
can be mapped to the following RISC-V litmus test:

RISCV RISCV__MP+success_cmpxchg+fence
{
 0:x1=x; 0:x3=y; 0:x5=1;
 1:x2=y; 1:x4=f; 1:x7=x;
}
 P0                 | P1                          ;
 sd x5, 0(x1)       | ld x1, 0(x2)                ;
 L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
 lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
 bne x2, x4, L01    |                             ;
 sc.d x6, x5, 0(x3) |                             ;
 bne x6, x4, L00    |                             ;
 fence rw, rw       |                             ;
 L01:               |                             ;
exists (1:x1=1 /\ 1:x6=0)

where the two stores in P0 can be reordered.  Update the RISC-V
JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
RELEASE ("rl") annotation in order to meet the expected memory
ordering guarantees.  The resulting RISC-V JIT lowerings of
BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
---
 arch/riscv/net/bpf_jit_comp64.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
index 99f34409fb60f..c207aa33c980b 100644
--- a/arch/riscv/net/bpf_jit_comp64.c
+++ b/arch/riscv/net/bpf_jit_comp64.c
@@ -548,8 +548,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
 		     rv_lr_w(r0, 0, rd, 0, 0), ctx);
 		jmp_offset = ninsns_rvoff(8);
 		emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
-		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
-		     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
+		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
+		     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
 		jmp_offset = ninsns_rvoff(-6);
 		emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
 		emit(rv_fence(0x3, 0x3), ctx);
-- 
2.43.0
Re: [PATCH] riscv, bpf: Make BPF_CMPXCHG fully ordered
Posted by Puranjay Mohan 1 month, 1 week ago
Andrea Parri <parri.andrea@gmail.com> writes:

> According to the prototype formal BPF memory consistency model
> discussed e.g. in [1] and following the ordering properties of
> the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
> with the BPF_CMPXCHG modifier is fully ordered.  However, the
> current RISC-V JIT lowerings fail to meet such memory ordering
> property.  This is illustrated by the following litmus test:
>
> BPF BPF__MP+success_cmpxchg+fence
> {
>  0:r1=x; 0:r3=y; 0:r5=1;
>  1:r2=y; 1:r4=f; 1:r7=x;
> }
>  P0                               | P1                                         ;
>  *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
>  r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
>                                   | r6 = *(u64 *)(r7 + 0)                      ;
> exists (1:r1=1 /\ 1:r6=0)
>
> whose "exists" clause is not satisfiable according to the BPF
> memory model.  Using the current RISC-V JIT lowerings, the test
> can be mapped to the following RISC-V litmus test:
>
> RISCV RISCV__MP+success_cmpxchg+fence
> {
>  0:x1=x; 0:x3=y; 0:x5=1;
>  1:x2=y; 1:x4=f; 1:x7=x;
> }
>  P0                 | P1                          ;
>  sd x5, 0(x1)       | ld x1, 0(x2)                ;
>  L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
>  lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
>  bne x2, x4, L01    |                             ;
>  sc.d x6, x5, 0(x3) |                             ;
>  bne x6, x4, L00    |                             ;
>  fence rw, rw       |                             ;
>  L01:               |                             ;
> exists (1:x1=1 /\ 1:x6=0)
>
> where the two stores in P0 can be reordered.  Update the RISC-V
> JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
> RELEASE ("rl") annotation in order to meet the expected memory
> ordering guarantees.  The resulting RISC-V JIT lowerings of
> BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().

Thanks for fixing this, I fixed all others in:

20a759df3bba ("riscv, bpf: make some atomic operations fully ordered")

> Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
> Signed-off-by: Andrea Parri <parri.andrea@gmail.com>

Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

> Link: https://lpc.events/event/18/contributions/1949/attachments/1665/3441/bpfmemmodel.2024.09.19p.pdf [1]
> ---
>  arch/riscv/net/bpf_jit_comp64.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/arch/riscv/net/bpf_jit_comp64.c b/arch/riscv/net/bpf_jit_comp64.c
> index 99f34409fb60f..c207aa33c980b 100644
> --- a/arch/riscv/net/bpf_jit_comp64.c
> +++ b/arch/riscv/net/bpf_jit_comp64.c
> @@ -548,8 +548,8 @@ static void emit_atomic(u8 rd, u8 rs, s16 off, s32 imm, bool is64,
>  		     rv_lr_w(r0, 0, rd, 0, 0), ctx);
>  		jmp_offset = ninsns_rvoff(8);
>  		emit(rv_bne(RV_REG_T2, r0, jmp_offset >> 1), ctx);
> -		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 0) :
> -		     rv_sc_w(RV_REG_T3, rs, rd, 0, 0), ctx);
> +		emit(is64 ? rv_sc_d(RV_REG_T3, rs, rd, 0, 1) :
> +		     rv_sc_w(RV_REG_T3, rs, rd, 0, 1), ctx);
>  		jmp_offset = ninsns_rvoff(-6);
>  		emit(rv_bne(RV_REG_T3, 0, jmp_offset >> 1), ctx);
>  		emit(rv_fence(0x3, 0x3), ctx);
> -- 
> 2.43.0
Re: [PATCH] riscv, bpf: Make BPF_CMPXCHG fully ordered
Posted by Björn Töpel 1 month, 1 week ago
Thanks, Andrea!

Puranjay Mohan <puranjay@kernel.org> writes:

> Andrea Parri <parri.andrea@gmail.com> writes:
>
>> According to the prototype formal BPF memory consistency model
>> discussed e.g. in [1] and following the ordering properties of
>> the C/in-kernel macro atomic_cmpxchg(), a BPF atomic operation
>> with the BPF_CMPXCHG modifier is fully ordered.  However, the
>> current RISC-V JIT lowerings fail to meet such memory ordering
>> property.  This is illustrated by the following litmus test:
>>
>> BPF BPF__MP+success_cmpxchg+fence
>> {
>>  0:r1=x; 0:r3=y; 0:r5=1;
>>  1:r2=y; 1:r4=f; 1:r7=x;
>> }
>>  P0                               | P1                                         ;
>>  *(u64 *)(r1 + 0) = 1             | r1 = *(u64 *)(r2 + 0)                      ;
>>  r2 = cmpxchg_64 (r3 + 0, r4, r5) | r3 = atomic_fetch_add((u64 *)(r4 + 0), r5) ;
>>                                   | r6 = *(u64 *)(r7 + 0)                      ;
>> exists (1:r1=1 /\ 1:r6=0)
>>
>> whose "exists" clause is not satisfiable according to the BPF
>> memory model.  Using the current RISC-V JIT lowerings, the test
>> can be mapped to the following RISC-V litmus test:
>>
>> RISCV RISCV__MP+success_cmpxchg+fence
>> {
>>  0:x1=x; 0:x3=y; 0:x5=1;
>>  1:x2=y; 1:x4=f; 1:x7=x;
>> }
>>  P0                 | P1                          ;
>>  sd x5, 0(x1)       | ld x1, 0(x2)                ;
>>  L00:               | amoadd.d.aqrl x3, x5, 0(x4) ;
>>  lr.d x2, 0(x3)     | ld x6, 0(x7)                ;
>>  bne x2, x4, L01    |                             ;
>>  sc.d x6, x5, 0(x3) |                             ;
>>  bne x6, x4, L00    |                             ;
>>  fence rw, rw       |                             ;
>>  L01:               |                             ;
>> exists (1:x1=1 /\ 1:x6=0)
>>
>> where the two stores in P0 can be reordered.  Update the RISC-V
>> JIT lowerings/implementation of BPF_CMPXCHG to emit an SC with
>> RELEASE ("rl") annotation in order to meet the expected memory
>> ordering guarantees.  The resulting RISC-V JIT lowerings of
>> BPF_CMPXCHG match the RISC-V lowerings of the C atomic_cmpxchg().
>
> Thanks for fixing this, I fixed all others in:
>
> 20a759df3bba ("riscv, bpf: make some atomic operations fully ordered")
>
>> Fixes: dd642ccb45ec ("riscv, bpf: Implement more atomic operations for RV64")
>> Signed-off-by: Andrea Parri <parri.andrea@gmail.com>
>
> Reviewed-by: Puranjay Mohan <puranjay@kernel.org>

Acked-by: Björn Töpel <bjorn@kernel.org>