From nobody Sat Apr 18 21:01:06 2026
From: Yu-Jen Chang
To: andy@kernel.org, akinobu.mita@gmail.com
Cc: jserv@ccns.ncku.edu.tw, linux-kernel@vger.kernel.org, Yu-Jen Chang
Subject: [PATCH 1/2] lib/string.c: Add a macro for memchr_inv()
Date: Sun, 10 Jul 2022 22:28:21 +0800
Message-Id: <20220710142822.52539-2-arthurchang09@gmail.com>
In-Reply-To: <20220710142822.52539-1-arthurchang09@gmail.com>
References: <20220710142822.52539-1-arthurchang09@gmail.com>

Add a macro, MEMCHR_MASK_GEN(), so that both memchr_inv() and memchr()
can use it to generate an 8-byte mask.

Signed-off-by: Yu-Jen Chang
Signed-off-by: Ching-Chun (Jim) Huang
---
 lib/string.c | 34 ++++++++++++++++++++++++----------
 1 file changed, 24 insertions(+), 10 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 485777c9d..80469e6c3 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -879,6 +879,29 @@ char *strnstr(const char *s1, const char *s2, size_t len)
 EXPORT_SYMBOL(strnstr);
 #endif
 
+#if defined(CONFIG_ARCH_HAS_FAST_MULTIPLIER) && BITS_PER_LONG == 64
+
+#define MEMCHR_MASK_GEN(mask) (mask *= 0x0101010101010101ULL)
+
+#elif defined(CONFIG_ARCH_HAS_FAST_MULTIPLIER)
+
+#define MEMCHR_MASK_GEN(mask)                                                  \
+	do {                                                                   \
+		mask *= 0x01010101;                                            \
+		mask |= mask << 32;                                            \
+	} while (0)
+
+#else
+
+#define MEMCHR_MASK_GEN(mask)                                                  \
+	do {                                                                   \
+		mask |= mask << 8;                                             \
+		mask |= mask << 16;                                            \
+		mask |= mask << 32;                                            \
+	} while (0)
+
+#endif
+
 #ifndef __HAVE_ARCH_MEMCHR
 /**
  * memchr - Find a character in an area of memory.
@@ -932,16 +955,7 @@ void *memchr_inv(const void *start, int c, size_t bytes)
 		return check_bytes8(start, value, bytes);
 
 	value64 = value;
-#if defined(CONFIG_ARCH_HAS_FAST_MULTIPLIER) && BITS_PER_LONG == 64
-	value64 *= 0x0101010101010101ULL;
-#elif defined(CONFIG_ARCH_HAS_FAST_MULTIPLIER)
-	value64 *= 0x01010101;
-	value64 |= value64 << 32;
-#else
-	value64 |= value64 << 8;
-	value64 |= value64 << 16;
-	value64 |= value64 << 32;
-#endif
+	MEMCHR_MASK_GEN(value64);
 
 	prefix = (unsigned long)start % 8;
 	if (prefix) {
-- 
2.25.1

From nobody Sat Apr 18 21:01:06 2026
From: Yu-Jen Chang
To: andy@kernel.org, akinobu.mita@gmail.com
Cc: jserv@ccns.ncku.edu.tw, linux-kernel@vger.kernel.org, Yu-Jen Chang
Subject: [PATCH 2/2] lib/string.c: Optimize memchr()
Date: Sun, 10 Jul 2022 22:28:22 +0800
Message-Id: <20220710142822.52539-3-arthurchang09@gmail.com>
In-Reply-To: <20220710142822.52539-1-arthurchang09@gmail.com>
References: <20220710142822.52539-1-arthurchang09@gmail.com>

The original memchr() compares one byte at a time, which does not make
full use of the 64-bit or 32-bit registers of the CPU. Use word-wide
comparison so that 8 characters are compared at once. This code is
based on David Laight's implementation.

We created two files to measure the performance. In the first file, the
target character is preceded by 10 characters on average. In the second
file, it is preceded by at least 1000 characters. Our implementation of
“memchr()” is slightly better in the first test and nearly 4x faster
than the original implementation in the second test.

Signed-off-by: Yu-Jen Chang
Signed-off-by: Ching-Chun (Jim) Huang
Reported-by: kernel test robot
---
 lib/string.c | 28 +++++++++++++++++++++-------
 1 file changed, 21 insertions(+), 7 deletions(-)

diff --git a/lib/string.c b/lib/string.c
index 80469e6c3..8ca965431 100644
--- a/lib/string.c
+++ b/lib/string.c
@@ -905,21 +905,35 @@ EXPORT_SYMBOL(strnstr);
 #ifndef __HAVE_ARCH_MEMCHR
 /**
  * memchr - Find a character in an area of memory.
- * @s: The memory area
+ * @p: The memory area
  * @c: The byte to search for
- * @n: The size of the area.
+ * @length: The size of the area.
  *
  * returns the address of the first occurrence of @c, or %NULL
  * if @c is not found
  */
-void *memchr(const void *s, int c, size_t n)
+void *memchr(const void *p, int c, unsigned long length)
 {
-	const unsigned char *p = s;
-	while (n-- != 0) {
-		if ((unsigned char)c == *p++) {
-			return (void *)(p - 1);
+	u64 mask, val;
+	const void *end = p + length;
+
+	c &= 0xff;
+	if (p <= end - 8) {
+		mask = c;
+		MEMCHR_MASK_GEN(mask);
+
+		for (; p <= end - 8; p += 8) {
+			val = *(u64 *)p ^ mask;
+			if ((val + 0xfefefefefefefeffu) &
+			    (~val & 0x8080808080808080u))
+				break;
 		}
 	}
+
+	for (; p < end; p++)
+		if (*(unsigned char *)p == c)
+			return (void *)p;
+
 	return NULL;
 }
 EXPORT_SYMBOL(memchr);
-- 
2.25.1
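
The word-wide comparison described in this series can be tried outside
the kernel with a small user-space sketch. It is not the kernel code:
the names broadcast_byte(), broadcast_byte_shift(), has_zero_byte() and
memchr_wordwise() are hypothetical and chosen only for illustration.
Words are loaded with memcpy() on the assumption that the buffer may be
unaligned, and the subtraction in has_zero_byte() is equivalent, modulo
2^64, to adding 0xfefefefefefefeff as the patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Broadcast one byte into all eight lanes of a 64-bit word (multiply form). */
static uint64_t broadcast_byte(unsigned char c)
{
	return (uint64_t)c * 0x0101010101010101ULL;
}

/* Same result using only shifts and ORs, for CPUs without a fast multiplier. */
static uint64_t broadcast_byte_shift(unsigned char c)
{
	uint64_t mask = c;

	mask |= mask << 8;
	mask |= mask << 16;
	mask |= mask << 32;
	return mask;
}

/* Nonzero if and only if at least one byte of v is zero. */
static uint64_t has_zero_byte(uint64_t v)
{
	return (v - 0x0101010101010101ULL) & ~v & 0x8080808080808080ULL;
}

/* Hypothetical user-space analogue of the word-wide memchr(). */
static void *memchr_wordwise(const void *s, int c, size_t n)
{
	const unsigned char *p = s;
	const unsigned char *end = p + n;
	uint64_t mask = broadcast_byte((unsigned char)c);
	uint64_t word;

	/* Scan 8 bytes at a time; stop at the first word that holds a match. */
	while ((size_t)(end - p) >= 8) {
		memcpy(&word, p, sizeof(word));	/* alignment-safe load */
		if (has_zero_byte(word ^ mask))
			break;
		p += 8;
	}

	/* The byte-wise scan pinpoints the match or covers the short tail. */
	for (; p < end; p++)
		if (*p == (unsigned char)c)
			return (void *)p;

	return NULL;
}
```

XOR-ing a word with the broadcast mask turns every matching byte into
zero, so the search reduces to a zero-byte test on each 8-byte word.
For any buffer and length, memchr_wordwise(buf, c, n) is expected to
return the same pointer as the C library's memchr(buf, c, n).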