From: Guan-Chun Wu <409411716@gms.tku.edu.tw>
To: tytso@mit.edu, adilger.kernel@dilger.ca
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, visitorckw@gmail.com, david.laight.linux@gmail.com, Guan-Chun Wu <409411716@gms.tku.edu.tw>
Subject: [PATCH v2] ext4: improve str2hashbuf by processing 4-byte chunks and removing function pointers
Date: Sat, 22 Nov 2025 12:39:29 +0800
Message-Id: <20251122043929.1908643-1-409411716@gms.tku.edu.tw>
X-Mailer: git-send-email 2.34.1

The original byte-by-byte implementation with modulo checks is less
efficient. Refactor str2hashbuf_unsigned() and str2hashbuf_signed() to
process input in explicit 4-byte chunks instead of using a modulus-based
loop to emit words byte by byte.

Additionally, the use of function pointers for selecting the appropriate
str2hashbuf implementation has been removed. Instead, the functions are
directly invoked based on the hash type, eliminating the overhead of
dynamic function calls.

Performance test (x86_64, Intel Core i7-10700 @ 2.90GHz, average over 10000
runs, using kernel module for testing):

    len | orig_s | new_s | orig_u | new_u
    ----+--------+-------+--------+-------
      1 |     70 |    71 |     63 |    63
      8 |     68 |    64 |     64 |    62
     32 |     75 |    70 |     75 |    63
     64 |     96 |    71 |    100 |    68
    255 |    192 |   108 |    187 |    84

This change improves performance, especially for larger input sizes.

Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
---
v1 -> v2:
- Drop redundant (int) casts.
- Replace indirect calls with simple conditionals.
- Use get_unaligned_be32() instead of manual byte extraction.
- Link to v1: https://lore.kernel.org/lkml/20251116130105.1988020-1-409411716@gms.tku.edu.tw/
---
 fs/ext4/hash.c | 64 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 33cd5b6b02d5..97b7a3b0603e 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include "ext4.h"
 
 #define DELTA 0x9E3779B9
@@ -141,21 +142,28 @@ static void str2hashbuf_signed(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) scp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = (scp[0] << 24) + (scp[1] << 16) + (scp[2] << 8) + scp[3];
+		*buf++ = val;
+		scp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = scp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
@@ -167,21 +175,28 @@ static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) ucp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = get_unaligned_be32(ucp);
+		*buf++ = val;
+		ucp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = ucp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 /*
@@ -205,8 +220,7 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 	const char *p;
 	int i;
 	__u32 in[8], buf[4];
-	void (*str2hashbuf)(const char *, int, __u32 *, int) =
-		str2hashbuf_signed;
+	bool use_unsigned = false;
 
 	/* Initialize the default seed for the hash checksum functions */
 	buf[0] = 0x67452301;
@@ -232,12 +246,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = dx_hack_hash_signed(name, len);
 		break;
 	case DX_HASH_HALF_MD4_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_HALF_MD4:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 8);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 8);
+			else
+				str2hashbuf_signed(p, len, in, 8);
 			half_md4_transform(buf, in);
 			len -= 32;
 			p += 32;
@@ -246,12 +263,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = buf[1];
 		break;
 	case DX_HASH_TEA_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_TEA:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 4);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 4);
+			else
+				str2hashbuf_signed(p, len, in, 4);
 			TEA_transform(buf, in);
 			len -= 16;
 			p += 16;
-- 
2.34.1
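
Note for reviewers: the chunked loop in the diff above is meant to produce
exactly the same words as the old modulo-based loop. The userspace sketch
below is one way to check that for the unsigned variant only. It is not part
of the patch. The names pack_bytewise(), pack_chunked(), load_be32(),
packs_match() and self_check() are hypothetical, and load_be32() is an assumed
stand-in for the kernel's get_unaligned_be32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed userspace stand-in for the kernel's get_unaligned_be32(). */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Pre-patch logic: accumulate bytes, emit a word when (i % 4) == 3. */
static void pack_bytewise(const unsigned char *ucp, int len,
			  uint32_t *buf, int num)
{
	uint32_t pad, val;
	int i;

	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;

	val = pad;
	if (len > num * 4)
		len = num * 4;
	for (i = 0; i < len; i++) {
		val = ucp[i] + (val << 8);
		if ((i % 4) == 3) {
			*buf++ = val;
			val = pad;
			num--;
		}
	}
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Post-patch logic: whole 4-byte chunks first, then the padded tail. */
static void pack_chunked(const unsigned char *ucp, int len,
			 uint32_t *buf, int num)
{
	uint32_t pad, val;
	int i;

	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;

	if (len > num * 4)
		len = num * 4;
	while (len >= 4) {
		*buf++ = load_be32(ucp);
		ucp += 4;
		len -= 4;
		num--;
	}
	val = pad;
	for (i = 0; i < len; i++)
		val = ucp[i] + (val << 8);
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Returns 1 if both variants fill the buffer identically (num <= 8). */
static int packs_match(const unsigned char *msg, int len, int num)
{
	uint32_t a[8], b[8];

	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	pack_bytewise(msg, len, a, num);
	pack_chunked(msg, len, b, num);
	return memcmp(a, b, sizeof(a)) == 0;
}

/* Compare both variants for every length 0..255 with num = 8 and num = 4. */
static void self_check(void)
{
	unsigned char name[255];
	int len;

	for (len = 0; len < 255; len++)
		name[len] = (unsigned char)(len * 37 + 11);
	for (len = 0; len <= 255; len++) {
		assert(packs_match(name, len, 8));
		assert(packs_match(name, len, 4));
	}
}
```

Calling self_check() from a small main() covers the same buffer sizes that
__ext4fs_dirhash() passes (8 words for half-MD4, 4 for TEA). The signed
variant is deliberately left out: its shifts act on sign-extended bytes and
would need a separate check.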