[PATCH 2/2] string: retire bcmp()

Mateusz Guzik posted 2 patches 1 year, 2 months ago
[PATCH 2/2] string: retire bcmp()
Posted by Mateusz Guzik 1 year, 2 months ago
While architectures could override it thanks to __HAVE_ARCH_BCMP, none
of them did. Instead it was implemented as a call to memcmp().

These routines differ in the API contract: memcmp()'s result indicates
which way the difference goes (making it usable for sorting), whereas
bcmp()'s result merely states whether the buffers differ in any way.

This means that a dedicated optimized bcmp() is cheaper to execute than
memcmp() for differing buffers as there is no need to compute the return
value.

However, per the above nobody bothered to write one and it is unclear if
it makes sense to do it.

Users which really want to compare stuff may want to handle it
differently (like e.g., the path lookup).

As there are no users and the code is merely a wrapper around memcmp(),
just whack it.

Signed-off-by: Mateusz Guzik <mjguzik@gmail.com>
---
 arch/x86/boot/string.c |  8 --------
 arch/x86/boot/string.h |  1 -
 include/linux/string.h |  3 ---
 lib/Makefile           |  3 +--
 lib/string.c           | 19 -------------------
 5 files changed, 1 insertion(+), 33 deletions(-)

diff --git a/arch/x86/boot/string.c b/arch/x86/boot/string.c
index c23f3b9c84fe..e8095abf03df 100644
--- a/arch/x86/boot/string.c
+++ b/arch/x86/boot/string.c
@@ -37,14 +37,6 @@ int memcmp(const void *s1, const void *s2, size_t len)
 	return diff;
 }
 
-/*
- * Clang may lower `memcmp == 0` to `bcmp == 0`.
- */
-int bcmp(const void *s1, const void *s2, size_t len)
-{
-	return memcmp(s1, s2, len);
-}
-
 int strcmp(const char *str1, const char *str2)
 {
 	const unsigned char *s1 = (const unsigned char *)str1;
diff --git a/arch/x86/boot/string.h b/arch/x86/boot/string.h
index e5d2c6b8c2f1..cf83faee78cc 100644
--- a/arch/x86/boot/string.h
+++ b/arch/x86/boot/string.h
@@ -11,7 +11,6 @@ void *memcpy(void *dst, const void *src, size_t len);
 void *memmove(void *dst, const void *src, size_t len);
 void *memset(void *dst, int c, size_t len);
 int memcmp(const void *s1, const void *s2, size_t len);
-int bcmp(const void *s1, const void *s2, size_t len);
 
 /* Access builtin version by default. */
 #define memcpy(d,s,l) __builtin_memcpy(d,s,l)
diff --git a/include/linux/string.h b/include/linux/string.h
index 0dd27afcfaf7..d4904951b627 100644
--- a/include/linux/string.h
+++ b/include/linux/string.h
@@ -267,9 +267,6 @@ extern void * memscan(void *,int,__kernel_size_t);
 #ifndef __HAVE_ARCH_MEMCMP
 extern int memcmp(const void *,const void *,__kernel_size_t);
 #endif
-#ifndef __HAVE_ARCH_BCMP
-extern int bcmp(const void *,const void *,__kernel_size_t);
-#endif
 #ifndef __HAVE_ARCH_MEMCHR
 extern void * memchr(const void *,int,__kernel_size_t);
 #endif
diff --git a/lib/Makefile b/lib/Makefile
index 773adf88af41..c382b43fe37e 100644
--- a/lib/Makefile
+++ b/lib/Makefile
@@ -265,8 +265,7 @@ obj-$(CONFIG_IRQ_POLL) += irq_poll.o
 obj-$(CONFIG_POLYNOMIAL) += polynomial.o
 
 # stackdepot.c should not be instrumented or call instrumented functions.
-# Prevent the compiler from calling builtins like memcmp() or bcmp() from this
-# file.
+# Prevent the compiler from calling builtins like memcmp() from this file.
 CFLAGS_stackdepot.o += -fno-builtin
 obj-$(CONFIG_STACKDEPOT) += stackdepot.o
 KASAN_SANITIZE_stackdepot.o := n
diff --git a/lib/string.c b/lib/string.c
index 76327b51e36f..3a319d1c5aae 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -680,25 +680,6 @@ __visible int memcmp(const void *cs, const void *ct, size_t count)
 EXPORT_SYMBOL(memcmp);
 #endif
 
-#ifndef __HAVE_ARCH_BCMP
-/**
- * bcmp - returns 0 if and only if the buffers have identical contents.
- * @a: pointer to first buffer.
- * @b: pointer to second buffer.
- * @len: size of buffers.
- *
- * The sign or magnitude of a non-zero return value has no particular
- * meaning, and architectures may implement their own more efficient bcmp(). So
- * while this particular implementation is a simple (tail) call to memcmp, do
- * not rely on anything but whether the return value is zero or non-zero.
- */
-int bcmp(const void *a, const void *b, size_t len)
-{
-	return memcmp(a, b, len);
-}
-EXPORT_SYMBOL(bcmp);
-#endif
-
 #ifndef __HAVE_ARCH_MEMSCAN
 /**
  * memscan - Find a character in an area of memory.
-- 
2.43.0
Re: [PATCH 2/2] string: retire bcmp()
Posted by kernel test robot 1 year, 2 months ago
Hi Mateusz,

kernel test robot noticed the following build errors:

[auto build test ERROR on kees/for-next/hardening]
[also build test ERROR on tip/master tip/x86/core linus/master v6.12 next-20241125]
[cannot apply to tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-callthunks-s-bcmp-memcmp/20241125-110959
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening
patch link:    https://lore.kernel.org/r/20241123094729.1099378-3-mjguzik%40gmail.com
patch subject: [PATCH 2/2] string: retire bcmp()
config: arm-footbridge_defconfig (https://download.01.org/0day-ci/archive/20241125/202411251926.sIYJ9sEV-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 592c0fe55f6d9a811028b5f3507be91458ab2713)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241125/202411251926.sIYJ9sEV-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411251926.sIYJ9sEV-lkp@intel.com/

All errors (new ones prefixed by >>, old ones prefixed by <<):

>> ERROR: modpost: "bcmp" [fs/fat/fat.ko] undefined!
>> ERROR: modpost: "bcmp" [fs/isofs/isofs.ko] undefined!
>> ERROR: modpost: "bcmp" [fs/nfsd/nfsd.ko] undefined!
>> ERROR: modpost: "bcmp" [drivers/net/slip/slhc.ko] undefined!
>> ERROR: modpost: "bcmp" [drivers/usb/core/usbcore.ko] undefined!

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
Re: [PATCH 2/2] string: retire bcmp()
Posted by kernel test robot 1 year, 2 months ago
Hi Mateusz,

kernel test robot noticed the following build errors:

[auto build test ERROR on kees/for-next/hardening]
[also build test ERROR on tip/master tip/x86/core linus/master v6.12 next-20241125]
[cannot apply to tip/auto-latest]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Mateusz-Guzik/x86-callthunks-s-bcmp-memcmp/20241125-110959
base:   https://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/hardening
patch link:    https://lore.kernel.org/r/20241123094729.1099378-3-mjguzik%40gmail.com
patch subject: [PATCH 2/2] string: retire bcmp()
config: arm-allnoconfig (https://download.01.org/0day-ci/archive/20241125/202411251629.TKfhELVt-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project 592c0fe55f6d9a811028b5f3507be91458ab2713)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241125/202411251629.TKfhELVt-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202411251629.TKfhELVt-lkp@intel.com/

All errors (new ones prefixed by >>):

>> ld.lld: error: undefined symbol: bcmp
   >>> referenced by do_mounts.c
   >>>               init/do_mounts.o:(parse_root_device) in archive vmlinux.a
   >>> referenced by do_mounts.c
   >>>               init/do_mounts.o:(parse_root_device) in archive vmlinux.a
   >>> referenced by do_mounts.c
   >>>               init/do_mounts.o:(parse_root_device) in archive vmlinux.a
   >>> referenced 37 more times

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
RE: [PATCH 2/2] string: retire bcmp()
Posted by David Laight 1 year, 2 months ago
From: Mateusz Guzik
> Sent: 23 November 2024 09:47
> 
> While architectures could override it thanks to __HAVE_ARCH_BCMP, none
> of them did. Instead it was implemented as a call to memcmp().
> 
> These routines differ in the API contract: memcmp()'s result indicates
> which way the difference goes (making it usable for sorting), whereas
> bcmp()'s result merely states whether the buffers differ in any way.
> 
> This means that a dedicated optimized bcmp() is cheaper to execute than
> memcmp() for differing buffers as there is no need to compute the return
> value.
> 
> However, per the above nobody bothered to write one and it is unclear if
> it makes sense to do it.
> 
> Users which really want to compare stuff may want to handle it
> differently (like e.g., the path lookup).
> 
> As there are no users and the code is merely a wrapper around memcmp(),
> just whack it.
> 
...
> 
> -/*
> - * Clang may lower `memcmp == 0` to `bcmp == 0`.
> - */
> -int bcmp(const void *s1, const void *s2, size_t len)
> -{
> -	return memcmp(s1, s2, len);
> -}
> -

As per the comment I thought that clang would sometimes generate
calls to bcmp().

So while the two symbols could refer to the same code I don't
think it can be removed.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Re: [PATCH 2/2] string: retire bcmp()
Posted by Nathan Chancellor 1 year, 2 months ago
Hi David,

Thanks for the CC.

On Sat, Nov 23, 2024 at 03:13:09PM +0000, David Laight wrote:
> From: Mateusz Guzik
> > Sent: 23 November 2024 09:47
> > 
> > While architectures could override it thanks to __HAVE_ARCH_BCMP, none
> > of them did. Instead it was implemented as a call to memcmp().
> > 
> > These routines differ in the API contract: memcmp()'s result indicates
> > which way the difference goes (making it usable for sorting), whereas
> > bcmp()'s result merely states whether the buffers differ in any way.
> > 
> > This means that a dedicated optimized bcmp() is cheaper to execute than
> > memcmp() for differing buffers as there is no need to compute the return
> > value.
> > 
> > However, per the above nobody bothered to write one and it is unclear if
> > it makes sense to do it.
> > 
> > Users which really want to compare stuff may want to handle it
> > differently (like e.g., the path lookup).
> > 
> > As there are no users and the code is merely a wrapper around memcmp(),
> > just whack it.
> > 
> ...
> > 
> > -/*
> > - * Clang may lower `memcmp == 0` to `bcmp == 0`.
> > - */
> > -int bcmp(const void *s1, const void *s2, size_t len)
> > -{
> > -	return memcmp(s1, s2, len);
> > -}
> > -
> 
> As per the comment I thought that clang would sometimes generate
> calls to bcmp().
> 
> So while the two symbols could refer to the same code I don't
> think it can be removed.

Right, commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
explicitly added bcmp() to lib/string.c because LLVM will emit calls to
bcmp instead of memcmp in certain circumstances [1], an optimization
that still exists, thus this patch would trigger new errors at link or
modpost time:

  ERROR: modpost: "bcmp" [arch/x86/kvm/kvm.ko] undefined!
  ERROR: modpost: "bcmp" [arch/x86/kvm/kvm-intel.ko] undefined!
  ERROR: modpost: "bcmp" [fs/quota/quota_v2.ko] undefined!
  ERROR: modpost: "bcmp" [fs/dlm/dlm.ko] undefined!
  ERROR: modpost: "bcmp" [fs/netfs/netfs.ko] undefined!
  ERROR: modpost: "bcmp" [fs/ext4/ext4.ko] undefined!
  ERROR: modpost: "bcmp" [fs/minix/minix.ko] undefined!
  ERROR: modpost: "bcmp" [fs/fat/fat.ko] undefined!
  ERROR: modpost: "bcmp" [fs/isofs/isofs.ko] undefined!
  ERROR: modpost: "bcmp" [fs/nfs/nfs.ko] undefined!
  WARNING: modpost: suppressed 254 unresolved symbol warnings because there were too many)

  ld.lld: error: undefined symbol: bcmp
  >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
  >>>               vmlinux.o:(load_pdptrs)
  >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
  >>>               vmlinux.o:(kvm_arch_irqfd_route_changed)
  >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
  >>>               vmlinux.o:(vmx_check_processor_compat)
  >>> referenced 438 more times
  >>> did you mean: bacmp
  >>> defined in: vmlinux.o

Please do not apply this patch. If we need to shore up the comment to
make this explicit, I am happy to do so.

[1]: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13

Cheers,
Nathan
Re: [PATCH 2/2] string: retire bcmp()
Posted by Mateusz Guzik 1 year, 2 months ago
On Sat, Nov 23, 2024 at 8:09 PM Nathan Chancellor <nathan@kernel.org> wrote:
>
> Hi David,
>
> Thanks for the CC.
>
> On Sat, Nov 23, 2024 at 03:13:09PM +0000, David Laight wrote:
> > From: Mateusz Guzik
> > > Sent: 23 November 2024 09:47
> > >
> > > While architectures could override it thanks to __HAVE_ARCH_BCMP, none
> > > of them did. Instead it was implemented as a call to memcmp().
> > >
> > > These routines differ in the API contract: memcmp()'s result indicates
> > > which way the difference goes (making it usable for sorting), whereas
> > > bcmp()'s result merely states whether the buffers differ in any way.
> > >
> > > This means that a dedicated optimized bcmp() is cheaper to execute than
> > > memcmp() for differing buffers as there is no need to compute the return
> > > value.
> > >
> > > However, per the above nobody bothered to write one and it is unclear if
> > > it makes sense to do it.
> > >
> > > Users which really want to compare stuff may want to handle it
> > > differently (like e.g., the path lookup).
> > >
> > > As there are no users and the code is merely a wrapper around memcmp(),
> > > just whack it.
> > >
> > ...
> > >
> > > -/*
> > > - * Clang may lower `memcmp == 0` to `bcmp == 0`.
> > > - */
> > > -int bcmp(const void *s1, const void *s2, size_t len)
> > > -{
> > > -   return memcmp(s1, s2, len);
> > > -}
> > > -
> >
> > As per the comment I thought that clang would sometimes generate
> > calls to bcmp().
> >
> > So while the two symbols could refer to the same code I don't
> > think it can be removed.
>
> Right, commit 5f074f3e192f ("lib/string.c: implement a basic bcmp")
> explicitly added bcmp() to lib/string.c because LLVM will emit calls to
> bcmp instead of memcmp in certain circumstances [1], an optimization
> that still exists, thus this patch would trigger new errors at link or
> modpost time:
>
>   ERROR: modpost: "bcmp" [arch/x86/kvm/kvm.ko] undefined!
>   ERROR: modpost: "bcmp" [arch/x86/kvm/kvm-intel.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/quota/quota_v2.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/dlm/dlm.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/netfs/netfs.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/ext4/ext4.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/minix/minix.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/fat/fat.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/isofs/isofs.ko] undefined!
>   ERROR: modpost: "bcmp" [fs/nfs/nfs.ko] undefined!
>   WARNING: modpost: suppressed 254 unresolved symbol warnings because there were too many)
>
>   ld.lld: error: undefined symbol: bcmp
>   >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
>   >>>               vmlinux.o:(load_pdptrs)
>   >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
>   >>>               vmlinux.o:(kvm_arch_irqfd_route_changed)
>   >>> referenced by fortify-string.h:715 (include/linux/fortify-string.h:715)
>   >>>               vmlinux.o:(vmx_check_processor_compat)
>   >>> referenced 438 more times
>   >>> did you mean: bacmp
>   >>> defined in: vmlinux.o
>
> Please do not apply this patch. If we need to shore up the comment to
> make this explicit, I am happy to do so.
>
> [1]: https://github.com/llvm/llvm-project/commit/8e16d73346f8091461319a7dfc4ddd18eedcff13
>

Hi guys, I just tested and indeed clang will convert some of the
memcmp uses into bcmp.

However, given that this is counterproductive in the kernel (it merely
adds a detour through another symbol), the thing to do here is to
convince clang to *not* do it and then whack the symbol

I admit I did not bother checking how bcmp came to be because I assume
it's a leftover -- it was the standard comparison routine back in the
day in the BSD land for example.

-- 
Mateusz Guzik <mjguzik gmail.com>
Re: [PATCH 2/2] string: retire bcmp()
Posted by Andy Shevchenko 1 year, 2 months ago
On Sat, Nov 23, 2024 at 5:13 PM David Laight <David.Laight@aculab.com> wrote:
> From: Mateusz Guzik
> > Sent: 23 November 2024 09:47

...

> > -/*
> > - * Clang may lower `memcmp == 0` to `bcmp == 0`.
> > - */

> As per the comment I thought that clang would sometimes generate
> calls to bcmp().
>
> So while the two symbols could refer to the same code I don't
> think it can be removed.

You beat me to it, I came just to say the same thing.

-- 
With Best Regards,
Andy Shevchenko