From: Ammar Faizi <ammarfaizi2@gnuweeb.org> > Sent: 30 August 2023 14:57 > > This is an RFC patchset for nolibc x86-64 string functions. There are 5 > patches in this series. > > ## Patch 1-3: Use `rep movsb`, `rep stosb`, and `rep cmpsb` for: > - memcpy() and memmove() > - memset() > - memcmp() > respectively. They can simplify the generated ASM code. > ... > After this series: > ``` > 000000000000140a <memmove>: > 140a: 48 89 f8 mov %rdi,%rax > 140d: 48 89 d1 mov %rdx,%rcx > 1410: 48 8d 7c 0f ff lea -0x1(%rdi,%rcx,1),%rdi > 1415: 48 8d 74 0e ff lea -0x1(%rsi,%rcx,1),%rsi > 141a: fd std > 141b: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi) > 141d: fc cld > 141e: c3 ret Isn't that completely broken? You need to select between forwards and backwards moves. Since forwards moves are preferred it is best to do if (dst - src < len) backards_copy() else formwards_copy() David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Sep 01, 2023 at 11:34:18AM +0000, David Laight wrote: > From: Ammar Faizi <ammarfaizi2@gnuweeb.org> > > Sent: 30 August 2023 14:57 > > > > This is an RFC patchset for nolibc x86-64 string functions. There are 5 > > patches in this series. > > > > ## Patch 1-3: Use `rep movsb`, `rep stosb`, and `rep cmpsb` for: > > - memcpy() and memmove() > > - memset() > > - memcmp() > > respectively. They can simplify the generated ASM code. > > > ... > > After this series: > > ``` > > 000000000000140a <memmove>: > > 140a: 48 89 f8 mov %rdi,%rax > > 140d: 48 89 d1 mov %rdx,%rcx > > 1410: 48 8d 7c 0f ff lea -0x1(%rdi,%rcx,1),%rdi > > 1415: 48 8d 74 0e ff lea -0x1(%rsi,%rcx,1),%rsi > > 141a: fd std > > 141b: f3 a4 rep movsb %ds:(%rsi),%es:(%rdi) > > 141d: fc cld > > 141e: c3 ret > > Isn't that completely broken? > > You need to select between forwards and backwards moves. > Since forwards moves are preferred it is best to do > if (dst - src < len) > backards_copy() > else > formwards_copy() > > David You're completely right indeed, reminds me about the copy_up/copy_down that were not used anymore :-) Willy
On Fri, Sep 01, 2023 at 01:46:44PM +0200, Willy Tarreau wrote: > On Fri, Sep 01, 2023 at 11:34:18AM +0000, David Laight wrote: > > Isn't that completely broken? > > > > You need to select between forwards and backwards moves. > > Since forwards moves are preferred it is best to do > > if (dst - src < len) > > backards_copy() > > else > > formwards_copy() > > > > David > > You're completely right indeed, reminds me about the copy_up/copy_down > that were not used anymore :-) I'm an idiot, will fix that. Another attempt as suggested below: __asm__ ( ".section .text.nolibc_memmove\n" ".weak memmove\n" "memmove:\n" " movq %rdx, %rcx\n" " movq %rdi, %rdx\n" " movq %rdi, %rax\n" " subq %rsi, %rdx\n" " cmpq %rcx, %rdx\n" " jnb .Lforward_copy\n" " leaq -1(%rdi, %rcx, 1), %rdi\n" " leaq -1(%rsi, %rcx, 1), %rsi\n" " std\n" ".Lforward_copy:\n" " rep movsb\n" " cld\n" " ret\n" ); -- Ammar Faizi
From: Ammar Faizi > Sent: 01 September 2023 14:06 ... > > You're completely right indeed, reminds me about the copy_up/copy_down > > that were not used anymore :-) > > I'm an idiot, will fix that. Another attempt as suggested below: > > __asm__ ( > ".section .text.nolibc_memmove\n" > ".weak memmove\n" > "memmove:\n" > " movq %rdx, %rcx\n" > " movq %rdi, %rdx\n" > " movq %rdi, %rax\n" You seem to have confused yourself about whether you are using %eax or %edx. > " subq %rsi, %rdx\n" > " cmpq %rcx, %rdx\n" > " jnb .Lforward_copy\n" I think I'd fall through to the forwards copy and not worry about replicating the 'reps movsb' and 'ret'. IIRC 'cld' can be slow as well. > " leaq -1(%rdi, %rcx, 1), %rdi\n" > " leaq -1(%rsi, %rcx, 1), %rsi\n" > " std\n" > ".Lforward_copy:\n" > " rep movsb\n" > " cld\n" > " ret\n" > ); > > -- > Ammar Faizi David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Sep 01, 2023 at 02:23:28PM +0000, David Laight wrote: > From: Ammar Faizi > > Sent: 01 September 2023 14:06 ... > > __asm__ ( > > ".section .text.nolibc_memmove\n" > > ".weak memmove\n" > > "memmove:\n" > > " movq %rdx, %rcx\n" > > " movq %rdi, %rdx\n" > > " movq %rdi, %rax\n" > > You seem to have confused yourself about whether you are using %eax or %edx. What do you mean? They're all 64-bit pointers. What I know is that the %rdx will be clobbered by "subq %rsi, %rdx" below and the %rax should be return value. That's why I copy the %rdi twice. memmove() returns the dst pointer. Did I miss something? > > " subq %rsi, %rdx\n" > > " cmpq %rcx, %rdx\n" > > " jnb .Lforward_copy\n" > > I think I'd fall through to the forwards copy > and not worry about replicating the 'reps movsb' and 'ret'. > IIRC 'cld' can be slow as well. Alright, I will avoid cld for the forward copy. > > " leaq -1(%rdi, %rcx, 1), %rdi\n" > > " leaq -1(%rsi, %rcx, 1), %rsi\n" > > " std\n" > > ".Lforward_copy:\n" > > " rep movsb\n" > > " cld\n" > > " ret\n" > > ); -- Ammar Faizi
From: Ammar Faizi > Sent: 01 September 2023 15:42 ... > > > " movq %rdx, %rcx\n" > > > " movq %rdi, %rdx\n" > > > " movq %rdi, %rax\n" > > > > You seem to have confused yourself about whether you are using %eax or %edx. > > What do you mean? They're all 64-bit pointers. %ax, %eax, %rax - what is the difference :-) > What I know is that the %rdx will be clobbered by "subq %rsi, %rdx" > below and the %rax should be return value. That's why I copy the %rdi > twice. memmove() returns the dst pointer. Did I miss something? I'd forgotten about the (stupid) return value. I'm pretty sure it is an accident from the original pdp-11 implementation from the days before C had an explicit 'return' statement. (The pdp-11 I used ran RSX/11M - so had a Fortran compiler not a C one.) David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Fri, Sep 01, 2023 at 02:54:56PM +0000, David Laight wrote: > I'd forgotten about the (stupid) return value. > > I'm pretty sure it is an accident from the original pdp-11 > implementation from the days before C had an explicit 'return' > statement. > (The pdp-11 I used ran RSX/11M - so had a Fortran compiler > not a C one.) You're old. I did not exist in that era. And my parents were still young :-) -- Ammar Faizi
© 2016 - 2026 Red Hat, Inc.