From nobody Mon Jun 8 08:36:53 2026
From: Guan-Chun Wu <409411716@gms.tku.edu.tw>
To: Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara, Ojaswin Mujoo, Ritesh Harjani, Zhang Yi
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, edward062254@gmail.com, visitorckw@gmail.com, david.laight.linux@gmail.com, Guan-Chun Wu <409411716@gms.tku.edu.tw>
Subject: [PATCH v6 1/2] ext4: add Kunit coverage for directory hash computation
Date: Sun, 31 May 2026 16:00:18 +0800
Message-Id: <20260531080019.3794809-2-409411716@gms.tku.edu.tw>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260531080019.3794809-1-409411716@gms.tku.edu.tw>
References: <20260531080019.3794809-1-409411716@gms.tku.edu.tw>
X-Mailing-List: linux-kernel@vger.kernel.org

Introduce KUnit tests for fs/ext4/hash.c to verify ext4fs_dirhash()
across the legacy, half-MD4, and TEA hash variants.

The tests cover empty names, seeded hashing, and non-ASCII name
handling. They also verify error paths, including invalid hash versions
and SipHash without a configured key, and check that the signed and
unsigned hash variants differ on non-ASCII input as expected.
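
To illustrate why the signed and unsigned variants agree on ASCII names
but diverge once a byte is 0x80 or above, here is a minimal user-space
sketch. It is not the kernel code: pack_signed() and pack_unsigned() are
hypothetical helpers that only mimic the byte accumulation step used by
str2hashbuf_signed() and str2hashbuf_unsigned(), without padding or
length clamping.

```c
#include <stdint.h>

/*
 * Hypothetical helper (not in fs/ext4/hash.c): accumulate bytes into a
 * 32-bit word, reading each byte as a signed char. Bytes >= 0x80 are
 * sign-extended before being added.
 */
static uint32_t pack_signed(const char *s, int len)
{
	const signed char *p = (const signed char *)s;
	uint32_t val = 0;
	int i;

	for (i = 0; i < len; i++)
		val = (uint32_t)p[i] + (val << 8);
	return val;
}

/*
 * Hypothetical helper: same accumulation, but each byte is read as an
 * unsigned char, so no sign extension takes place.
 */
static uint32_t pack_unsigned(const char *s, int len)
{
	const unsigned char *p = (const unsigned char *)s;
	uint32_t val = 0;
	int i;

	for (i = 0; i < len; i++)
		val = p[i] + (val << 8);
	return val;
}
```

For "abc" both helpers return the same word. For "\x80\x81" the signed
helper adds sign-extended values, so the upper bits of the word differ
from the unsigned result. This is the kind of divergence the non-ASCII
test vectors are meant to pin down.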

When CONFIG_UNICODE is enabled, the tests further verify casefolded-name
hashing and the fallback behavior for invalid UTF-8 input.

Co-developed-by: Chen Hao Yu
Signed-off-by: Chen Hao Yu
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
---
 fs/ext4/Makefile    |   2 +-
 fs/ext4/hash-test.c | 567 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/hash.c      |   4 +
 3 files changed, 572 insertions(+), 1 deletion(-)
 create mode 100644 fs/ext4/hash-test.c

diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 3baee4e7c..3f9fc0eb8 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -15,7 +15,7 @@ ext4-y	:= balloc.o bitmap.o block_validity.o dir.o ext4_jbd2.o extents.o \
 ext4-$(CONFIG_EXT4_FS_POSIX_ACL)	+= acl.o
 ext4-$(CONFIG_EXT4_FS_SECURITY)		+= xattr_security.o
 ext4-test-objs				+= inode-test.o mballoc-test.o \
-					   extents-test.o
+					   extents-test.o hash-test.o
 obj-$(CONFIG_EXT4_KUNIT_TESTS)		+= ext4-test.o
 ext4-$(CONFIG_FS_VERITY)		+= verity.o
 ext4-$(CONFIG_FS_ENCRYPTION)		+= crypto.o
diff --git a/fs/ext4/hash-test.c b/fs/ext4/hash-test.c
new file mode 100644
index 000000000..49b0d874c
--- /dev/null
+++ b/fs/ext4/hash-test.c
@@ -0,0 +1,567 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for ext4 directory hash computation.
+ */
+
+#include
+#include
+#include
+#include
+#include
+#include
+#include "ext4.h"
+
+static void ext4_hash_init_fake_dir(struct inode *dir, struct super_block *sb)
+{
+	memset(sb, 0, sizeof(*sb));
+	memset(dir, 0, sizeof(*dir));
+	dir->i_sb = sb;
+	strscpy(sb->s_id, "kunit-ext4", sizeof(sb->s_id));
+}
+
+static void ext4_hash_init_fake_dir_with_sbi(struct inode *dir,
+					     struct super_block *sb,
+					     struct ext4_sb_info *sbi)
+{
+	ext4_hash_init_fake_dir(dir, sb);
+	memset(sbi, 0, sizeof(*sbi));
+	sb->s_fs_info = sbi;
+	sbi->s_sb = sb;
+}
+
+#ifdef CONFIG_FS_ENCRYPTION
+static const struct fscrypt_operations ext4_hash_test_cryptops = {
+	.inode_info_offs =
+		(int)offsetof(struct ext4_inode_info, i_crypt_info) -
+		(int)offsetof(struct ext4_inode_info, vfs_inode),
+};
+#endif
+
+static void ext4_hash_init_fake_ext4_dir(struct ext4_inode_info *ei,
+					 struct super_block *sb,
+					 struct ext4_sb_info *sbi)
+{
+	struct inode *dir = &ei->vfs_inode;
+
+	memset(sb, 0, sizeof(*sb));
+	memset(ei, 0, sizeof(*ei));
+	memset(sbi, 0, sizeof(*sbi));
+
+	strscpy(sb->s_id, "kunit-ext4", sizeof(sb->s_id));
+	sb->s_fs_info = sbi;
+	sbi->s_sb = sb;
+
+	dir->i_sb = sb;
+	dir->i_mode = S_IFDIR;
+
+#ifdef CONFIG_FS_ENCRYPTION
+	fscrypt_set_ops(sb, &ext4_hash_test_cryptops);
+#endif
+}
+
+struct ext4_dirhash_test_case {
+	const char *name;
+	u32 hash_version;
+	const char *input;
+	int len;
+	u32 seed[4];
+	bool use_seed;
+	u32 expected_hash;
+	u32 expected_minor_hash;
+};
+
+static const struct ext4_dirhash_test_case ext4_dirhash_test_cases[] = {
+	{
+		.name = "legacy_abc",
+		.hash_version = DX_HASH_LEGACY,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0x75afd992,
+		.expected_minor_hash = 0x00000000,
+	},
+	{
+		.name = "legacy_unsigned_abc",
+		.hash_version = DX_HASH_LEGACY_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0x75afd992,
+		.expected_minor_hash = 0x00000000,
+	},
+	{
+		.name = "half_md4_abc",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xd196a868,
+		.expected_minor_hash = 0xc420eb28,
+	},
+	{
+		.name = "half_md4_unsigned_abc",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xd196a868,
+		.expected_minor_hash = 0xc420eb28,
+	},
+	{
+		.name = "tea_abc",
+		.hash_version = DX_HASH_TEA,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xb1435ec4,
+		.expected_minor_hash = 0x3f7eaa0e,
+	},
+	{
+		.name = "tea_unsigned_abc",
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xb1435ec4,
+		.expected_minor_hash = 0x3f7eaa0e,
+	},
+	{
+		.name = "empty_half_md4",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "",
+		.len = 0,
+		.use_seed = false,
+		.expected_hash = 0xefcdab88,
+		.expected_minor_hash = 0x98badcfe,
+	},
+	{
+		.name = "half_md4_31bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "1234567890123456789012345678901",
+		.len = 31,
+		.use_seed = false,
+		.expected_hash = 0xc4db1f78,
+		.expected_minor_hash = 0xea23921b,
+	},
+	{
+		.name = "half_md4_32bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "12345678901234567890123456789012",
+		.len = 32,
+		.use_seed = false,
+		.expected_hash = 0xfa6cc63e,
+		.expected_minor_hash = 0x2f77bd1c,
+	},
+	{
+		.name = "half_md4_33bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "123456789012345678901234567890123",
+		.len = 33,
+		.use_seed = false,
+		.expected_hash = 0xdc0c2dec,
+		.expected_minor_hash = 0x5ca23365,
+	},
+	{
+		.name = "half_md4_unsigned_31bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "1234567890123456789012345678901",
+		.len = 31,
+		.use_seed = false,
+		.expected_hash = 0xc4db1f78,
+		.expected_minor_hash = 0xea23921b,
+	},
+	{
+		.name = "half_md4_unsigned_32bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "12345678901234567890123456789012",
+		.len = 32,
+		.use_seed = false,
+		.expected_hash = 0xfa6cc63e,
+		.expected_minor_hash = 0x2f77bd1c,
+	},
+	{
+		.name = "half_md4_unsigned_33bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "123456789012345678901234567890123",
+		.len = 33,
+		.use_seed = false,
+		.expected_hash = 0xdc0c2dec,
+		.expected_minor_hash = 0x5ca23365,
+	},
+	{
+		.name = "tea_15bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "123456789abcdef",
+		.len = 15,
+		.use_seed = false,
+		.expected_hash = 0xa562903a,
+		.expected_minor_hash = 0x6174a00f,
+	},
+	{
+		.name = "tea_16bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "1234567890abcdef",
+		.len = 16,
+		.use_seed = false,
+		.expected_hash = 0x8449f258,
+		.expected_minor_hash = 0x49a16d46,
+	},
+	{
+		.name = "tea_17bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "123456789abcdefgh",
+		.len = 17,
+		.use_seed = false,
+		.expected_hash = 0xf32ec10c,
+		.expected_minor_hash = 0x58ceae61,
+	},
+	{
+		.name = "half_md4_seeded",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "same-name",
+		.len = 9,
+		.seed = { 0x11111111, 0x22222222, 0x33333333, 0x44444444 },
+		.use_seed = true,
+		.expected_hash = 0x8aebf604,
+		.expected_minor_hash = 0x66ce48fe,
+	},
+	{
+		.name = "half_md4_non_ascii_signed",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x8bab0498,
+		.expected_minor_hash = 0xc326632d,
+	},
+	{
+		.name = "half_md4_non_ascii_unsigned",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0xbc48596e,
+		.expected_minor_hash = 0xde0fad41,
+	},
+	{
+		.name = "tea_non_ascii_signed",
+		.hash_version = DX_HASH_TEA,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x21e3a154,
+		.expected_minor_hash = 0x90112c3d,
+	},
+	{
+		.name = "tea_non_ascii_unsigned",
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x9b648616,
+		.expected_minor_hash = 0x011dd507,
+	},
+};
+
+static void test_ext4fs_dirhash_vectors(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	int i;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	for (i = 0; i < ARRAY_SIZE(ext4_dirhash_test_cases); i++) {
+		const struct ext4_dirhash_test_case *tc =
+			&ext4_dirhash_test_cases[i];
+		struct dx_hash_info hinfo;
+		int ret;
+
+		memset(&hinfo, 0, sizeof(hinfo));
+		hinfo.hash_version = tc->hash_version;
+		hinfo.seed = tc->use_seed ? (u32 *)tc->seed : NULL;
+
+		ret = ext4fs_dirhash(dir, tc->input, tc->len, &hinfo);
+
+		KUNIT_ASSERT_EQ_MSG(test, ret, 0, "case=%s", tc->name);
+		KUNIT_EXPECT_EQ_MSG(test, hinfo.hash, tc->expected_hash,
+				    "case=%s", tc->name);
+		KUNIT_EXPECT_EQ_MSG(test, hinfo.minor_hash,
+				    tc->expected_minor_hash,
+				    "case=%s", tc->name);
+	}
+}
+
+static void test_ext4fs_dirhash_seed_changes_result(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	u32 seed[4] = { 0x11111111, 0x22222222, 0x33333333, 0x44444444 };
+	struct dx_hash_info plain = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info seeded = {
+		.hash_version = DX_HASH_HALF_MD4,
+		.seed = seed,
+	};
+	int ret_plain, ret_seeded;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	ret_plain = ext4fs_dirhash(dir, "same-name", 9, &plain);
+	ret_seeded = ext4fs_dirhash(dir, "same-name", 9, &seeded);
+
+	KUNIT_ASSERT_EQ(test, ret_plain, 0);
+	KUNIT_ASSERT_EQ(test, ret_seeded, 0);
+
+	KUNIT_EXPECT_TRUE(test,
+			  plain.hash != seeded.hash ||
+			  plain.minor_hash != seeded.minor_hash);
+}
+
+static void test_ext4fs_dirhash_invalid_version_returns_einval(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	struct ext4_sb_info *sbi;
+	struct dx_hash_info hinfo = {
+		.hash = 0xdeadbeef,
+		.minor_hash = 0xcafebabe,
+		.hash_version = DX_HASH_LAST + 1,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	ext4_hash_init_fake_dir_with_sbi(dir, sb, sbi);
+
+	ret = ext4fs_dirhash(dir, "abc", 3, &hinfo);
+
+	KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+	KUNIT_EXPECT_EQ(test, hinfo.hash, 0);
+	KUNIT_EXPECT_EQ(test, hinfo.minor_hash, 0);
+}
+
+static void test_ext4fs_dirhash_siphash_without_key_returns_einval(struct kunit *test)
+{
+	struct super_block *sb;
+	struct ext4_inode_info *ei;
+	struct inode *dir;
+	struct ext4_sb_info *sbi;
+	struct dx_hash_info hinfo = {
+		.hash_version = DX_HASH_SIPHASH,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb, sbi);
+	dir = &ei->vfs_inode;
+
+	ret = ext4fs_dirhash(dir, "name", strlen("name"), &hinfo);
+
+	KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+}
+
+static void test_ext4fs_dirhash_signed_unsigned_differ_on_nonascii(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	static const char input[] = "\x80\xff\x81\xfe\101bc";
+	struct dx_hash_info legacy_signed = {
+		.hash_version = DX_HASH_LEGACY,
+	};
+	struct dx_hash_info legacy_unsigned = {
+		.hash_version = DX_HASH_LEGACY_UNSIGNED,
+	};
+	struct dx_hash_info md4_signed = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info md4_unsigned = {
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+	};
+	struct dx_hash_info tea_signed = {
+		.hash_version = DX_HASH_TEA,
+	};
+	struct dx_hash_info tea_unsigned = {
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &legacy_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &legacy_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_NE(test, legacy_signed.hash, legacy_unsigned.hash);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &md4_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &md4_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_TRUE(test,
+			  md4_signed.hash != md4_unsigned.hash ||
+			  md4_signed.minor_hash != md4_unsigned.minor_hash);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &tea_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &tea_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_TRUE(test,
+			  tea_signed.hash != tea_unsigned.hash ||
+			  tea_signed.minor_hash != tea_unsigned.minor_hash);
+}
+
+#if IS_ENABLED(CONFIG_UNICODE)
+KUNIT_DEFINE_ACTION_WRAPPER(utf8_unload_action, utf8_unload,
+			    struct unicode_map *);
+static void test_ext4fs_dirhash_casefolded_names_hash_consistently(struct kunit *test)
+{
+	struct super_block *sb;
+	struct ext4_inode_info *ei;
+	struct ext4_sb_info *sbi;
+	struct unicode_map *um;
+	struct dx_hash_info h1 = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info h2 = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	int ret, ret1, ret2;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	um = utf8_load(UTF8_LATEST);
+	if (IS_ERR(um)) {
+		kunit_skip(test, "utf8_load(UTF8_LATEST) failed: %pe",
+			   um);
+		return;
+	}
+
+	ret = kunit_add_action_or_reset(test, utf8_unload_action, um);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb, sbi);
+	sb->s_encoding = um;
+	ei->vfs_inode.i_flags |= S_CASEFOLD;
+
+	KUNIT_ASSERT_TRUE(test, IS_CASEFOLDED(&ei->vfs_inode));
+
+	ret1 = ext4fs_dirhash(&ei->vfs_inode, "Alpha", 5, &h1);
+	ret2 = ext4fs_dirhash(&ei->vfs_inode, "aLPHa", 5, &h2);
+
+	KUNIT_ASSERT_EQ(test, ret1, 0);
+	KUNIT_ASSERT_EQ(test, ret2, 0);
+	KUNIT_EXPECT_EQ(test, h1.hash, h2.hash);
+	KUNIT_EXPECT_EQ(test, h1.minor_hash, h2.minor_hash);
+}
+
+static void test_ext4fs_dirhash_casefold_fallback(struct kunit *test)
+{
+	struct super_block *sb_cf, *sb_plain;
+	struct ext4_inode_info *ei;
+	struct ext4_sb_info *sbi;
+	struct inode *plain_dir;
+	struct unicode_map *um;
+	static const char invalid_utf8[] = "\xc3\x28";
+	struct dx_hash_info folded_dir = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info plain = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	int ret, ret_cf, ret_plain;
+
+	sb_cf = kunit_kzalloc(test, sizeof(*sb_cf), GFP_KERNEL);
+	sb_plain = kunit_kzalloc(test, sizeof(*sb_plain), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	plain_dir = kunit_kzalloc(test, sizeof(*plain_dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb_cf);
+	KUNIT_ASSERT_NOT_NULL(test, sb_plain);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+	KUNIT_ASSERT_NOT_NULL(test, plain_dir);
+
+	um = utf8_load(UTF8_LATEST);
+	if (IS_ERR(um)) {
+		kunit_skip(test, "utf8_load(UTF8_LATEST) failed: %pe",
+			   um);
+		return;
+	}
+
+	ret = kunit_add_action_or_reset(test, utf8_unload_action, um);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb_cf, sbi);
+	sb_cf->s_encoding = um;
+	ei->vfs_inode.i_flags |= S_CASEFOLD;
+
+	KUNIT_ASSERT_TRUE(test, IS_CASEFOLDED(&ei->vfs_inode));
+
+	ext4_hash_init_fake_dir(plain_dir, sb_plain);
+
+	ret_cf = ext4fs_dirhash(&ei->vfs_inode, invalid_utf8,
+				sizeof(invalid_utf8) - 1, &folded_dir);
+	ret_plain = ext4fs_dirhash(plain_dir, invalid_utf8,
+				   sizeof(invalid_utf8) - 1, &plain);
+
+	KUNIT_ASSERT_EQ(test, ret_cf, 0);
+	KUNIT_ASSERT_EQ(test, ret_plain, 0);
+	KUNIT_EXPECT_EQ(test, folded_dir.hash, plain.hash);
+	KUNIT_EXPECT_EQ(test, folded_dir.minor_hash, plain.minor_hash);
+}
+#endif
+
+static struct kunit_case ext4_hash_test_cases[] = {
+	KUNIT_CASE(test_ext4fs_dirhash_vectors),
+	KUNIT_CASE(test_ext4fs_dirhash_seed_changes_result),
+	KUNIT_CASE(test_ext4fs_dirhash_invalid_version_returns_einval),
+	KUNIT_CASE(test_ext4fs_dirhash_siphash_without_key_returns_einval),
+	KUNIT_CASE(test_ext4fs_dirhash_signed_unsigned_differ_on_nonascii),
+#if IS_ENABLED(CONFIG_UNICODE)
+	KUNIT_CASE(test_ext4fs_dirhash_casefolded_names_hash_consistently),
+	KUNIT_CASE(test_ext4fs_dirhash_casefold_fallback),
+#endif
+	{}
+};
+
+static struct kunit_suite ext4_hash_test_suite = {
+	.name = "ext4_hash",
+	.test_cases = ext4_hash_test_cases,
+};
+
+kunit_test_suites(&ext4_hash_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 48483cd01..72645bd92 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -321,3 +321,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 #endif
 	return __ext4fs_dirhash(dir, name, len, hinfo);
 }
+
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4fs_dirhash);
+#endif
-- 
2.34.1

From nobody Mon Jun 8 08:36:53 2026
From: Guan-Chun Wu <409411716@gms.tku.edu.tw>
To: Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara, Ojaswin Mujoo, Ritesh Harjani, Zhang Yi
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, edward062254@gmail.com, visitorckw@gmail.com, david.laight.linux@gmail.com, Guan-Chun Wu <409411716@gms.tku.edu.tw>
Subject: [PATCH v6 2/2] ext4: improve str2hashbuf by processing 4-byte chunks and removing function pointers
Date: Sun, 31 May 2026 16:00:19 +0800
Message-Id: <20260531080019.3794809-3-409411716@gms.tku.edu.tw>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20260531080019.3794809-1-409411716@gms.tku.edu.tw>
References: <20260531080019.3794809-1-409411716@gms.tku.edu.tw>
X-Mailing-List: linux-kernel@vger.kernel.org

str2hashbuf_signed() and str2hashbuf_unsigned() currently build each
32-bit word one byte at a time and test (i % 4) == 3 on every iteration
to decide when to emit it. Refactor both helpers to consume the input in
explicit 4-byte chunks and use the byte loop only for the remaining
0-3 bytes.

Additionally, drop the function pointer used to select the str2hashbuf
implementation. The signed or unsigned helper is now called directly
based on the hash version, which avoids an indirect call in the hashing
loop.
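
The equivalence of the two loop shapes can be checked in user space. The
sketch below is an illustration under stated assumptions, not the kernel
code: pack_bytewise() is modeled on the loop being removed,
pack_chunked() on its replacement for the unsigned case, load_be32() is
a stand-in for the kernel's get_unaligned_be32(), and packers_agree() is
a hypothetical comparison helper.

```c
#include <stdint.h>
#include <string.h>

/* Byte-at-a-time packing, modeled on the loop this patch removes. */
static void pack_bytewise(const char *msg, int len, uint32_t *buf, int num)
{
	const unsigned char *ucp = (const unsigned char *)msg;
	uint32_t pad, val;
	int i;

	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;

	val = pad;
	if (len > num * 4)
		len = num * 4;
	for (i = 0; i < len; i++) {
		val = ucp[i] + (val << 8);
		if ((i % 4) == 3) {	/* word complete, emit it */
			*buf++ = val;
			val = pad;
			num--;
		}
	}
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Stand-in for get_unaligned_be32(): big-endian load of 4 bytes. */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/* Chunked packing: whole words first, then the 0-3 byte tail. */
static void pack_chunked(const char *msg, int len, uint32_t *buf, int num)
{
	const unsigned char *ucp = (const unsigned char *)msg;
	uint32_t pad, val;
	int i;

	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;

	if (len > num * 4)
		len = num * 4;

	while (len >= 4) {
		*buf++ = load_be32(ucp);
		ucp += 4;
		len -= 4;
		num--;
	}

	val = pad;
	for (i = 0; i < len; i++)
		val = ucp[i] + (val << 8);

	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Hypothetical helper: returns 1 if both packers fill buf identically.
 * num must not exceed 8, the largest value used by the hash code. */
static int packers_agree(const char *msg, int len, int num)
{
	uint32_t a[8], b[8];

	memset(a, 0, sizeof(a));
	memset(b, 0, sizeof(b));
	pack_bytewise(msg, len, a, num);
	pack_chunked(msg, len, b, num);
	return memcmp(a, b, sizeof(uint32_t) * (size_t)num) == 0;
}
```

The chunked loop can skip the pad seed for full words because four
8-bit shifts push the initial pad value out of a 32-bit word entirely.
Only the tail word still starts from pad.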

Performance test (x86_64, Intel Core i7-10700 @ 2.90GHz, average over
10000 runs, measured with a test kernel module):

    len | orig_s | new_s | orig_u | new_u
    ----+--------+-------+--------+-------
      1 |     70 |    71 |     63 |    63
      8 |     68 |    64 |     64 |    62
     32 |     75 |    70 |     75 |    63
     64 |     96 |    71 |    100 |    68
    255 |    192 |   108 |    187 |    84

The new code is on par with the old one for short names and faster for
longer ones: at len 255 the signed variant drops from 192 to 108 and the
unsigned variant from 187 to 84.

Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
---
 fs/ext4/hash.c | 64 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 72645bd92..978bd92da 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include "ext4.h"
 
 #define DELTA 0x9E3779B9
@@ -141,21 +142,28 @@ static void str2hashbuf_signed(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) scp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = ((__u32)scp[0] << 24) + ((__u32)scp[1] << 16) + ((__u32)scp[2] << 8) + scp[3];
+		*buf++ = val;
+		scp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = scp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
@@ -167,21 +175,28 @@ static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) ucp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = get_unaligned_be32(ucp);
+		*buf++ = val;
+		ucp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = ucp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 /*
@@ -205,8 +220,7 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 	const char *p;
 	int i;
 	__u32 in[8], buf[4];
-	void (*str2hashbuf)(const char *, int, __u32 *, int) =
-		str2hashbuf_signed;
+	bool use_unsigned = false;
 
 	/* Initialize the default seed for the hash checksum functions */
 	buf[0] = 0x67452301;
@@ -232,12 +246,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = dx_hack_hash_signed(name, len);
 		break;
 	case DX_HASH_HALF_MD4_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_HALF_MD4:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 8);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 8);
+			else
+				str2hashbuf_signed(p, len, in, 8);
 			half_md4_transform(buf, in);
 			len -= 32;
 			p += 32;
@@ -246,12 +263,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = buf[1];
 		break;
 	case DX_HASH_TEA_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_TEA:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 4);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 4);
+			else
+				str2hashbuf_signed(p, len, in, 4);
 			TEA_transform(buf, in);
 			len -= 16;
 			p += 16;
-- 
2.34.1