From: Guan-Chun Wu <409411716@gms.tku.edu.tw>
To: Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara, Ojaswin Mujoo, Ritesh Harjani, Zhang Yi
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, edward062254@gmail.com, visitorckw@gmail.com, david.laight.linux@gmail.com, Guan-Chun Wu <409411716@gms.tku.edu.tw>
Subject: [PATCH v5 1/2] ext4: add Kunit coverage for directory hash computation
Date: Sat, 30 May 2026 23:58:16 +0800
Message-Id: <20260530155817.2311587-2-409411716@gms.tku.edu.tw>
In-Reply-To: <20260530155817.2311587-1-409411716@gms.tku.edu.tw>
References: <20260530155817.2311587-1-409411716@gms.tku.edu.tw>

Introduce KUnit tests for fs/ext4/hash.c to verify ext4fs_dirhash()
across the legacy, half-MD4, and TEA hash variants. The tests cover
empty names, seeded hashing, and non-ASCII name handling. They also
verify error paths, including invalid hash versions and SipHash without
a configured key, and check that the signed and unsigned hash variants
differ on non-ASCII input as expected.
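
As an illustration of why the signed and unsigned variants are expected to diverge only on non-ASCII input, here is a minimal userspace sketch. It is not part of the patch: fold_signed() and fold_unsigned() are hypothetical helpers that mimic only the "val = byte + (val << 8)" accumulation used by the str2hashbuf helpers, without padding or the hash transforms. Bytes at or above 0x80 are sign-extended in the signed variant, so the folded word changes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: fold bytes into a word, reading them as signed char. */
static uint32_t fold_signed(const char *s, size_t len)
{
	const signed char *scp = (const signed char *)s;
	uint32_t val = 0;
	size_t i;

	assert(s != NULL);
	for (i = 0; i < len; i++)
		val = (uint32_t)(int)scp[i] + (val << 8); /* sign-extends >= 0x80 */
	return val;
}

/* Hypothetical helper: same fold, reading the bytes as unsigned char. */
static uint32_t fold_unsigned(const char *s, size_t len)
{
	const unsigned char *ucp = (const unsigned char *)s;
	uint32_t val = 0;
	size_t i;

	assert(s != NULL);
	for (i = 0; i < len; i++)
		val = ucp[i] + (val << 8);
	return val;
}
```

For pure ASCII such as "abc" both helpers return the same word; for "\x80\x81" the unsigned fold gives 0x00008081 while the signed fold gives 0xffff7f81, which is the kind of divergence the non-ASCII test vectors pin down.
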
When CONFIG_UNICODE is enabled, the tests further verify casefolded-name
hashing and the fallback behavior for invalid input.

Co-developed-by: Chen Hao Yu
Signed-off-by: Chen Hao Yu
Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
---
 fs/ext4/Makefile    |   2 +-
 fs/ext4/hash-test.c | 560 ++++++++++++++++++++++++++++++++++++++++++++
 fs/ext4/hash.c      |   4 +
 3 files changed, 565 insertions(+), 1 deletion(-)
 create mode 100644 fs/ext4/hash-test.c

diff --git a/fs/ext4/Makefile b/fs/ext4/Makefile
index 3baee4e7c..3f9fc0eb8 100644
--- a/fs/ext4/Makefile
+++ b/fs/ext4/Makefile
@@ -15,7 +15,7 @@ ext4-y	:= balloc.o bitmap.o block_validity.o dir.o ext4_jbd2.o extents.o \
 ext4-$(CONFIG_EXT4_FS_POSIX_ACL)	+= acl.o
 ext4-$(CONFIG_EXT4_FS_SECURITY)		+= xattr_security.o
 ext4-test-objs				+= inode-test.o mballoc-test.o \
-					   extents-test.o
+					   extents-test.o hash-test.o
 obj-$(CONFIG_EXT4_KUNIT_TESTS)		+= ext4-test.o
 ext4-$(CONFIG_FS_VERITY)		+= verity.o
 ext4-$(CONFIG_FS_ENCRYPTION)		+= crypto.o
diff --git a/fs/ext4/hash-test.c b/fs/ext4/hash-test.c
new file mode 100644
index 000000000..2da66cafb
--- /dev/null
+++ b/fs/ext4/hash-test.c
@@ -0,0 +1,560 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit tests for ext4 directory hash computation.
+ */
+
+#include
+#include
+#include
+#include
+#include "ext4.h"
+
+static void ext4_hash_init_fake_dir(struct inode *dir, struct super_block *sb)
+{
+	memset(sb, 0, sizeof(*sb));
+	memset(dir, 0, sizeof(*dir));
+	dir->i_sb = sb;
+	strscpy(sb->s_id, "kunit-ext4", sizeof(sb->s_id));
+}
+
+static void ext4_hash_init_fake_dir_with_sbi(struct inode *dir,
+					     struct super_block *sb,
+					     struct ext4_sb_info *sbi)
+{
+	ext4_hash_init_fake_dir(dir, sb);
+	memset(sbi, 0, sizeof(*sbi));
+	sb->s_fs_info = sbi;
+	sbi->s_sb = sb;
+}
+
+#if IS_ENABLED(CONFIG_UNICODE)
+KUNIT_DEFINE_ACTION_WRAPPER(utf8_unload_action, utf8_unload,
+			    struct unicode_map *);
+#endif
+
+static void ext4_hash_init_fake_ext4_dir(struct ext4_inode_info *ei,
+					 struct super_block *sb,
+					 struct ext4_sb_info *sbi)
+{
+	struct inode *dir = &ei->vfs_inode;
+
+	memset(sb, 0, sizeof(*sb));
+	memset(ei, 0, sizeof(*ei));
+	memset(sbi, 0, sizeof(*sbi));
+
+	strscpy(sb->s_id, "kunit-ext4", sizeof(sb->s_id));
+	sb->s_fs_info = sbi;
+	sbi->s_sb = sb;
+
+	dir->i_sb = sb;
+	dir->i_mode = S_IFDIR;
+
+#ifdef CONFIG_FS_ENCRYPTION
+	fscrypt_set_ops(sb, &ext4_cryptops);
+#endif
+}
+
+struct ext4_dirhash_test_case {
+	const char *name;
+	u32 hash_version;
+	const char *input;
+	int len;
+	u32 seed[4];
+	bool use_seed;
+	u32 expected_hash;
+	u32 expected_minor_hash;
+};
+
+static const struct ext4_dirhash_test_case ext4_dirhash_test_cases[] = {
+	{
+		.name = "legacy_abc",
+		.hash_version = DX_HASH_LEGACY,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0x75afd992,
+		.expected_minor_hash = 0x00000000,
+	},
+	{
+		.name = "legacy_unsigned_abc",
+		.hash_version = DX_HASH_LEGACY_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0x75afd992,
+		.expected_minor_hash = 0x00000000,
+	},
+	{
+		.name = "half_md4_abc",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xd196a868,
+		.expected_minor_hash = 0xc420eb28,
+	},
+	{
+		.name = "half_md4_unsigned_abc",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xd196a868,
+		.expected_minor_hash = 0xc420eb28,
+	},
+	{
+		.name = "tea_abc",
+		.hash_version = DX_HASH_TEA,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xb1435ec4,
+		.expected_minor_hash = 0x3f7eaa0e,
+	},
+	{
+		.name = "tea_unsigned_abc",
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+		.input = "abc",
+		.len = 3,
+		.use_seed = false,
+		.expected_hash = 0xb1435ec4,
+		.expected_minor_hash = 0x3f7eaa0e,
+	},
+	{
+		.name = "empty_half_md4",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "",
+		.len = 0,
+		.use_seed = false,
+		.expected_hash = 0xefcdab88,
+		.expected_minor_hash = 0x98badcfe,
+	},
+	{
+		.name = "half_md4_31bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "1234567890123456789012345678901",
+		.len = 31,
+		.use_seed = false,
+		.expected_hash = 0xc4db1f78,
+		.expected_minor_hash = 0xea23921b,
+	},
+	{
+		.name = "half_md4_32bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "12345678901234567890123456789012",
+		.len = 32,
+		.use_seed = false,
+		.expected_hash = 0xfa6cc63e,
+		.expected_minor_hash = 0x2f77bd1c,
+	},
+	{
+		.name = "half_md4_33bytes",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "123456789012345678901234567890123",
+		.len = 33,
+		.use_seed = false,
+		.expected_hash = 0xdc0c2dec,
+		.expected_minor_hash = 0x5ca23365,
+	},
+	{
+		.name = "half_md4_unsigned_31bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "1234567890123456789012345678901",
+		.len = 31,
+		.use_seed = false,
+		.expected_hash = 0xc4db1f78,
+		.expected_minor_hash = 0xea23921b,
+	},
+	{
+		.name = "half_md4_unsigned_32bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "12345678901234567890123456789012",
+		.len = 32,
+		.use_seed = false,
+		.expected_hash = 0xfa6cc63e,
+		.expected_minor_hash = 0x2f77bd1c,
+	},
+	{
+		.name = "half_md4_unsigned_33bytes",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "123456789012345678901234567890123",
+		.len = 33,
+		.use_seed = false,
+		.expected_hash = 0xdc0c2dec,
+		.expected_minor_hash = 0x5ca23365,
+	},
+	{
+		.name = "tea_15bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "123456789abcdef",
+		.len = 15,
+		.use_seed = false,
+		.expected_hash = 0xa562903a,
+		.expected_minor_hash = 0x6174a00f,
+	},
+	{
+		.name = "tea_16bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "1234567890abcdef",
+		.len = 16,
+		.use_seed = false,
+		.expected_hash = 0x8449f258,
+		.expected_minor_hash = 0x49a16d46,
+	},
+	{
+		.name = "tea_17bytes",
+		.hash_version = DX_HASH_TEA,
+		.input = "123456789abcdefgh",
+		.len = 17,
+		.use_seed = false,
+		.expected_hash = 0xf32ec10c,
+		.expected_minor_hash = 0x58ceae61,
+	},
+	{
+		.name = "half_md4_seeded",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "same-name",
+		.len = 9,
+		.seed = { 0x11111111, 0x22222222, 0x33333333, 0x44444444 },
+		.use_seed = true,
+		.expected_hash = 0x8aebf604,
+		.expected_minor_hash = 0x66ce48fe,
+	},
+	{
+		.name = "half_md4_non_ascii_signed",
+		.hash_version = DX_HASH_HALF_MD4,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x8bab0498,
+		.expected_minor_hash = 0xc326632d,
+	},
+	{
+		.name = "half_md4_non_ascii_unsigned",
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0xbc48596e,
+		.expected_minor_hash = 0xde0fad41,
+	},
+	{
+		.name = "tea_non_ascii_signed",
+		.hash_version = DX_HASH_TEA,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x21e3a154,
+		.expected_minor_hash = 0x90112c3d,
+	},
+	{
+		.name = "tea_non_ascii_unsigned",
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+		.input = "\x80\x81\x82\x83\x84",
+		.len = 5,
+		.use_seed = false,
+		.expected_hash = 0x9b648616,
+		.expected_minor_hash = 0x011dd507,
+	},
+};
+
+static void test_ext4fs_dirhash_vectors(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	int i;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	for (i = 0; i < ARRAY_SIZE(ext4_dirhash_test_cases); i++) {
+		const struct ext4_dirhash_test_case *tc =
+			&ext4_dirhash_test_cases[i];
+		struct dx_hash_info hinfo;
+		int ret;
+
+		memset(&hinfo, 0, sizeof(hinfo));
+		hinfo.hash_version = tc->hash_version;
+		hinfo.seed = tc->use_seed ? (u32 *)tc->seed : NULL;
+
+		ret = ext4fs_dirhash(dir, tc->input, tc->len, &hinfo);
+
+		KUNIT_ASSERT_EQ_MSG(test, ret, 0, "case=%s", tc->name);
+		KUNIT_EXPECT_EQ_MSG(test, hinfo.hash, tc->expected_hash,
+				    "case=%s", tc->name);
+		KUNIT_EXPECT_EQ_MSG(test, hinfo.minor_hash,
+				    tc->expected_minor_hash,
+				    "case=%s", tc->name);
+	}
+}
+
+static void test_ext4fs_dirhash_seed_changes_result(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	u32 seed[4] = { 0x11111111, 0x22222222, 0x33333333, 0x44444444 };
+	struct dx_hash_info plain = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info seeded = {
+		.hash_version = DX_HASH_HALF_MD4,
+		.seed = seed,
+	};
+	int ret_plain, ret_seeded;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	ret_plain = ext4fs_dirhash(dir, "same-name", 9, &plain);
+	ret_seeded = ext4fs_dirhash(dir, "same-name", 9, &seeded);
+
+	KUNIT_ASSERT_EQ(test, ret_plain, 0);
+	KUNIT_ASSERT_EQ(test, ret_seeded, 0);
+
+	KUNIT_EXPECT_TRUE(test,
+			  plain.hash != seeded.hash ||
+			  plain.minor_hash != seeded.minor_hash);
+}
+
+static void test_ext4fs_dirhash_invalid_version_returns_einval(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	struct ext4_sb_info *sbi;
+	struct dx_hash_info hinfo = {
+		.hash = 0xdeadbeef,
+		.minor_hash = 0xcafebabe,
+		.hash_version = DX_HASH_LAST + 1,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	ext4_hash_init_fake_dir_with_sbi(dir, sb, sbi);
+
+	ret = ext4fs_dirhash(dir, "abc", 3, &hinfo);
+
+	KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+	KUNIT_EXPECT_EQ(test, hinfo.hash, 0);
+	KUNIT_EXPECT_EQ(test, hinfo.minor_hash, 0);
+}
+
+static void test_ext4fs_dirhash_siphash_without_key_returns_einval(struct kunit *test)
+{
+	struct super_block *sb;
+	struct ext4_inode_info *ei;
+	struct inode *dir;
+	struct ext4_sb_info *sbi;
+	struct dx_hash_info hinfo = {
+		.hash_version = DX_HASH_SIPHASH,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb, sbi);
+	dir = &ei->vfs_inode;
+
+	ret = ext4fs_dirhash(dir, "name", strlen("name"), &hinfo);
+
+	KUNIT_EXPECT_EQ(test, ret, -EINVAL);
+}
+
+static void test_ext4fs_dirhash_signed_unsigned_differ_on_nonascii(struct kunit *test)
+{
+	struct super_block *sb;
+	struct inode *dir;
+	static const char input[] = "\x80\xff\x81\xfe\101bc";
+	struct dx_hash_info legacy_signed = {
+		.hash_version = DX_HASH_LEGACY,
+	};
+	struct dx_hash_info legacy_unsigned = {
+		.hash_version = DX_HASH_LEGACY_UNSIGNED,
+	};
+	struct dx_hash_info md4_signed = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info md4_unsigned = {
+		.hash_version = DX_HASH_HALF_MD4_UNSIGNED,
+	};
+	struct dx_hash_info tea_signed = {
+		.hash_version = DX_HASH_TEA,
+	};
+	struct dx_hash_info tea_unsigned = {
+		.hash_version = DX_HASH_TEA_UNSIGNED,
+	};
+	int ret;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	dir = kunit_kzalloc(test, sizeof(*dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, dir);
+
+	ext4_hash_init_fake_dir(dir, sb);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &legacy_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &legacy_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_NE(test, legacy_signed.hash, legacy_unsigned.hash);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &md4_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &md4_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_TRUE(test,
+			  md4_signed.hash != md4_unsigned.hash ||
+			  md4_signed.minor_hash != md4_unsigned.minor_hash);
+
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &tea_signed);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	ret = ext4fs_dirhash(dir, input, sizeof(input) - 1, &tea_unsigned);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+	KUNIT_EXPECT_TRUE(test,
+			  tea_signed.hash != tea_unsigned.hash ||
+			  tea_signed.minor_hash != tea_unsigned.minor_hash);
+}
+
+#if IS_ENABLED(CONFIG_UNICODE)
+static void test_ext4fs_dirhash_casefolded_names_hash_consistently(struct kunit *test)
+{
+	struct super_block *sb;
+	struct ext4_inode_info *ei;
+	struct ext4_sb_info *sbi;
+	struct unicode_map *um;
+	struct dx_hash_info h1 = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info h2 = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	int ret, ret1, ret2;
+
+	sb = kunit_kzalloc(test, sizeof(*sb), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+
+	um = utf8_load(UTF8_LATEST);
+	if (IS_ERR(um)) {
+		kunit_skip(test, "utf8_load(UTF8_LATEST) failed: %pe",
+			   um);
+		return;
+	}
+
+	ret = kunit_add_action_or_reset(test, utf8_unload_action, um);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb, sbi);
+	sb->s_encoding = um;
+	ei->vfs_inode.i_flags |= S_CASEFOLD;
+
+	KUNIT_ASSERT_TRUE(test, IS_CASEFOLDED(&ei->vfs_inode));
+
+	ret1 = ext4fs_dirhash(&ei->vfs_inode, "Alpha", 5, &h1);
+	ret2 = ext4fs_dirhash(&ei->vfs_inode, "aLPHa", 5, &h2);
+
+	KUNIT_ASSERT_EQ(test, ret1, 0);
+	KUNIT_ASSERT_EQ(test, ret2, 0);
+	KUNIT_EXPECT_EQ(test, h1.hash, h2.hash);
+	KUNIT_EXPECT_EQ(test, h1.minor_hash, h2.minor_hash);
+}
+
+static void test_ext4fs_dirhash_casefold_fallback(struct kunit *test)
+{
+	struct super_block *sb_cf, *sb_plain;
+	struct ext4_inode_info *ei;
+	struct ext4_sb_info *sbi;
+	struct inode *plain_dir;
+	struct unicode_map *um;
+	static const char invalid_utf8[] = "\xc3\x28";
+	struct dx_hash_info folded_dir = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	struct dx_hash_info plain = {
+		.hash_version = DX_HASH_HALF_MD4,
+	};
+	int ret, ret_cf, ret_plain;
+
+	sb_cf = kunit_kzalloc(test, sizeof(*sb_cf), GFP_KERNEL);
+	sb_plain = kunit_kzalloc(test, sizeof(*sb_plain), GFP_KERNEL);
+	ei = kunit_kzalloc(test, sizeof(*ei), GFP_KERNEL);
+	sbi = kunit_kzalloc(test, sizeof(*sbi), GFP_KERNEL);
+	plain_dir = kunit_kzalloc(test, sizeof(*plain_dir), GFP_KERNEL);
+	KUNIT_ASSERT_NOT_NULL(test, sb_cf);
+	KUNIT_ASSERT_NOT_NULL(test, sb_plain);
+	KUNIT_ASSERT_NOT_NULL(test, ei);
+	KUNIT_ASSERT_NOT_NULL(test, sbi);
+	KUNIT_ASSERT_NOT_NULL(test, plain_dir);
+
+	um = utf8_load(UTF8_LATEST);
+	if (IS_ERR(um)) {
+		kunit_skip(test, "utf8_load(UTF8_LATEST) failed: %pe",
+			   um);
+		return;
+	}
+
+	ret = kunit_add_action_or_reset(test, utf8_unload_action, um);
+	KUNIT_ASSERT_EQ(test, ret, 0);
+
+	ext4_hash_init_fake_ext4_dir(ei, sb_cf, sbi);
+	sb_cf->s_encoding = um;
+	ei->vfs_inode.i_flags |= S_CASEFOLD;
+
+	KUNIT_ASSERT_TRUE(test, IS_CASEFOLDED(&ei->vfs_inode));
+
+	ext4_hash_init_fake_dir(plain_dir, sb_plain);
+
+	ret_cf = ext4fs_dirhash(&ei->vfs_inode, invalid_utf8,
+				sizeof(invalid_utf8) - 1, &folded_dir);
+	ret_plain = ext4fs_dirhash(plain_dir, invalid_utf8,
+				   sizeof(invalid_utf8) - 1, &plain);
+
+	KUNIT_ASSERT_EQ(test, ret_cf, 0);
+	KUNIT_ASSERT_EQ(test, ret_plain, 0);
+	KUNIT_EXPECT_EQ(test, folded_dir.hash, plain.hash);
+	KUNIT_EXPECT_EQ(test, folded_dir.minor_hash, plain.minor_hash);
+}
+#endif
+
+static struct kunit_case ext4_hash_test_cases[] = {
+	KUNIT_CASE(test_ext4fs_dirhash_vectors),
+	KUNIT_CASE(test_ext4fs_dirhash_seed_changes_result),
+	KUNIT_CASE(test_ext4fs_dirhash_invalid_version_returns_einval),
+	KUNIT_CASE(test_ext4fs_dirhash_siphash_without_key_returns_einval),
+	KUNIT_CASE(test_ext4fs_dirhash_signed_unsigned_differ_on_nonascii),
+#if IS_ENABLED(CONFIG_UNICODE)
+	KUNIT_CASE(test_ext4fs_dirhash_casefolded_names_hash_consistently),
+	KUNIT_CASE(test_ext4fs_dirhash_casefold_fallback),
+#endif
+	{}
+};
+
+static struct kunit_suite ext4_hash_test_suite = {
+	.name = "ext4_hash",
+	.test_cases = ext4_hash_test_cases,
+};
+
+kunit_test_suites(&ext4_hash_test_suite);
+
+MODULE_LICENSE("GPL");
diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 48483cd01..72645bd92 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -321,3 +321,7 @@ int ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 #endif
 	return __ext4fs_dirhash(dir, name, len, hinfo);
 }
+
+#if IS_ENABLED(CONFIG_EXT4_KUNIT_TESTS)
+EXPORT_SYMBOL_FOR_EXT4_TEST(ext4fs_dirhash);
+#endif
-- 
2.34.1

From: Guan-Chun Wu <409411716@gms.tku.edu.tw>
To: Theodore Ts'o, Andreas Dilger, Baokun Li, Jan Kara, Ojaswin Mujoo, Ritesh Harjani, Zhang Yi
Cc: linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, edward062254@gmail.com, visitorckw@gmail.com, david.laight.linux@gmail.com, Guan-Chun Wu <409411716@gms.tku.edu.tw>
Subject: [PATCH v5 2/2] ext4: improve str2hashbuf by processing 4-byte chunks and removing function pointers
Date: Sat, 30 May 2026 23:58:17 +0800
Message-Id: <20260530155817.2311587-3-409411716@gms.tku.edu.tw>
In-Reply-To: <20260530155817.2311587-1-409411716@gms.tku.edu.tw>
References: <20260530155817.2311587-1-409411716@gms.tku.edu.tw>

The original byte-by-byte implementation with modulo checks is less
efficient. Refactor str2hashbuf_unsigned() and str2hashbuf_signed() to
process input in explicit 4-byte chunks instead of using a modulus-based
loop to emit words byte by byte.

Additionally, the use of function pointers for selecting the appropriate
str2hashbuf implementation has been removed. Instead, the functions are
invoked directly based on the hash type, eliminating the overhead of
indirect function calls.
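
To make the claimed equivalence concrete, the following is a minimal userspace sketch of the unsigned helper before and after the refactor. It is an illustration under assumptions, not the kernel code itself: pack_bytewise(), pack_chunked(), load_be32() and packers_agree() are hypothetical names, uint32_t stands in for __u32, and load_be32() stands in for the kernel's get_unaligned_be32().

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for get_unaligned_be32(): big-endian load of four bytes. */
static uint32_t load_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | p[3];
}

/* Old scheme: accumulate byte by byte, emit a word when (i % 4) == 3. */
static void pack_bytewise(const char *msg, int len, uint32_t *buf, int num)
{
	const unsigned char *ucp = (const unsigned char *)msg;
	uint32_t pad, val;
	int i;

	assert(len >= 0 && num >= 0);
	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;
	val = pad;
	if (len > num * 4)
		len = num * 4;
	for (i = 0; i < len; i++) {
		val = ucp[i] + (val << 8);
		if ((i % 4) == 3) {
			*buf++ = val;
			val = pad;
			num--;
		}
	}
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* New scheme: whole 4-byte chunks first, then the padded tail. */
static void pack_chunked(const char *msg, int len, uint32_t *buf, int num)
{
	const unsigned char *ucp = (const unsigned char *)msg;
	uint32_t pad, val;
	int i;

	assert(len >= 0 && num >= 0);
	pad = (uint32_t)len | ((uint32_t)len << 8);
	pad |= pad << 16;
	if (len > num * 4)
		len = num * 4;
	while (len >= 4) {
		*buf++ = load_be32(ucp);
		ucp += 4;
		len -= 4;
		num--;
	}
	val = pad;
	for (i = 0; i < len; i++)
		val = ucp[i] + (val << 8);
	if (--num >= 0)
		*buf++ = val;
	while (--num >= 0)
		*buf++ = pad;
}

/* Returns 1 if both packers fill identical buffers for len 0..64. */
static int packers_agree(void)
{
	unsigned char msg[64];
	uint32_t a[8], b[8];
	int len, num, i;

	for (i = 0; i < 64; i++)
		msg[i] = (unsigned char)(i * 37 + 0x80);
	for (num = 4; num <= 8; num += 4) {
		for (len = 0; len <= 64; len++) {
			memset(a, 0xAA, sizeof(a));
			memset(b, 0xAA, sizeof(b));
			pack_bytewise((const char *)msg, len, a, num);
			pack_chunked((const char *)msg, len, b, num);
			if (memcmp(a, b, sizeof(a)) != 0)
				return 0;
		}
	}
	return 1;
}
```

Calling packers_agree() exercises both buffer sizes used by the callers (8 words for half-MD4, 4 words for TEA). The pad word only ever reaches the output through the tail and the trailing fill, which is why the chunked loop can skip it for full words.
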
Performance test (x86_64, Intel Core i7-10700 @ 2.90GHz, average over
10000 runs, using kernel module for testing):

    len | orig_s | new_s | orig_u | new_u
    ----+--------+-------+--------+-------
      1 |     70 |    71 |     63 |    63
      8 |     68 |    64 |     64 |    62
     32 |     75 |    70 |     75 |    63
     64 |     96 |    71 |    100 |    68
    255 |    192 |   108 |    187 |    84

This change improves performance, especially for larger input sizes.

Signed-off-by: Guan-Chun Wu <409411716@gms.tku.edu.tw>
---
 fs/ext4/hash.c | 64 +++++++++++++++++++++++++++++++++-----------------
 1 file changed, 42 insertions(+), 22 deletions(-)

diff --git a/fs/ext4/hash.c b/fs/ext4/hash.c
index 72645bd92..978bd92da 100644
--- a/fs/ext4/hash.c
+++ b/fs/ext4/hash.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include "ext4.h"
 
 #define DELTA 0x9E3779B9
@@ -141,21 +142,28 @@ static void str2hashbuf_signed(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) scp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = ((__u32)scp[0] << 24) + ((__u32)scp[1] << 16) + ((__u32)scp[2] << 8) + scp[3];
+		*buf++ = val;
+		scp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = scp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
@@ -167,21 +175,28 @@ static void str2hashbuf_unsigned(const char *msg, int len, __u32 *buf, int num)
 	pad = (__u32)len | ((__u32)len << 8);
 	pad |= pad << 16;
 
-	val = pad;
 	if (len > num*4)
 		len = num * 4;
-	for (i = 0; i < len; i++) {
-		val = ((int) ucp[i]) + (val << 8);
-		if ((i % 4) == 3) {
-			*buf++ = val;
-			val = pad;
-			num--;
-		}
+
+	while (len >= 4) {
+		val = get_unaligned_be32(ucp);
+		*buf++ = val;
+		ucp += 4;
+		len -= 4;
+		num--;
 	}
+
+	val = pad;
+
+	for (i = 0; i < len; i++)
+		val = ucp[i] + (val << 8);
+
 	if (--num >= 0)
 		*buf++ = val;
+
 	while (--num >= 0)
 		*buf++ = pad;
+
 }
 
 /*
@@ -205,8 +220,7 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 	const char *p;
 	int i;
 	__u32 in[8], buf[4];
-	void (*str2hashbuf)(const char *, int, __u32 *, int) =
-				str2hashbuf_signed;
+	bool use_unsigned = false;
 
 	/* Initialize the default seed for the hash checksum functions */
 	buf[0] = 0x67452301;
@@ -232,12 +246,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = dx_hack_hash_signed(name, len);
 		break;
 	case DX_HASH_HALF_MD4_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_HALF_MD4:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 8);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 8);
+			else
+				str2hashbuf_signed(p, len, in, 8);
 			half_md4_transform(buf, in);
 			len -= 32;
 			p += 32;
@@ -246,12 +263,15 @@ static int __ext4fs_dirhash(const struct inode *dir, const char *name, int len,
 		hash = buf[1];
 		break;
 	case DX_HASH_TEA_UNSIGNED:
-		str2hashbuf = str2hashbuf_unsigned;
+		use_unsigned = true;
 		fallthrough;
 	case DX_HASH_TEA:
 		p = name;
 		while (len > 0) {
-			(*str2hashbuf)(p, len, in, 4);
+			if (use_unsigned)
+				str2hashbuf_unsigned(p, len, in, 4);
+			else
+				str2hashbuf_signed(p, len, in, 4);
 			TEA_transform(buf, in);
 			len -= 16;
 			p += 16;
-- 
2.34.1
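
The signed helper cannot use a plain big-endian load, because each byte is sign-extended before it is shifted. The sketch below is a userspace illustration with hypothetical names (word_bytewise_signed(), word_chunked_signed(), signed_words_agree()), assuming uint32_t in place of __u32. It checks that the explicit shift-and-add expression in the diff produces the same full word as the old per-byte accumulation: the initial pad is shifted out after four steps and all arithmetic wraps modulo 2^32.

```c
#include <assert.h>
#include <stdint.h>

/* Old scheme for one full word: four accumulation steps starting at pad. */
static uint32_t word_bytewise_signed(const signed char *scp, uint32_t pad)
{
	uint32_t val = pad;
	int i;

	assert(scp != 0);
	for (i = 0; i < 4; i++)
		val = (uint32_t)(int)scp[i] + (val << 8);
	return val;
}

/* New scheme: shift-and-add over sign-extended bytes, as in the diff. */
static uint32_t word_chunked_signed(const signed char *scp)
{
	assert(scp != 0);
	return ((uint32_t)scp[0] << 24) + ((uint32_t)scp[1] << 16) +
	       ((uint32_t)scp[2] << 8) + (uint32_t)scp[3];
}

/* Returns 1 if both schemes agree over a sweep of byte patterns. */
static int signed_words_agree(void)
{
	unsigned char raw[4];
	uint32_t x;

	for (x = 0; x < 65536; x++) {
		raw[0] = (unsigned char)(x >> 8);
		raw[1] = (unsigned char)(x & 0xff);
		raw[2] = (unsigned char)((x * 7) & 0xff);
		raw[3] = (unsigned char)((x >> 3) & 0xff);
		if (word_bytewise_signed((const signed char *)raw, 0x04040404u) !=
		    word_chunked_signed((const signed char *)raw))
			return 0;
	}
	return 1;
}
```

For the bytes 0x80 0xff 0x81 0xfe the signed word is 0x7ffe80fe, whereas a plain big-endian load would give 0x80ff81fe, so substituting get_unaligned_be32() in the signed path would change on-disk hash values.
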