[PATCH v6 01/10] libfs: Create the helper function generic_ci_validate_strict_name()

André Almeida posted 10 patches 1 month, 2 weeks ago
There is a newer version of this series
[PATCH v6 01/10] libfs: Create the helper function generic_ci_validate_strict_name()
Posted by André Almeida 1 month, 2 weeks ago
Create a helper function for filesystems do the checks required for
casefold directories and strict encoding.

Suggested-by: Gabriel Krisman Bertazi <krisman@suse.de>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
Changes from v4:
- Inline this function

Changes from v2:
- Moved function to libfs and adpated its name
- Wrapped at 72 chars column
- Decomposed the big if (...) to be more clear
---
 include/linux/fs.h | 45 +++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 45 insertions(+)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 3559446279c152c70b7f1f3e0154f6e66a5aba33..9b232aee4cd6ad8dce64370db0111bd25d3fedfa 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -45,6 +45,7 @@
 #include <linux/slab.h>
 #include <linux/maple_tree.h>
 #include <linux/rw_hint.h>
+#include <linux/unicode.h>
 
 #include <asm/byteorder.h>
 #include <uapi/linux/fs.h>
@@ -3456,6 +3457,50 @@ extern int generic_ci_match(const struct inode *parent,
 			    const struct qstr *folded_name,
 			    const u8 *de_name, u32 de_name_len);
 
+#if IS_ENABLED(CONFIG_UNICODE)
+/**
+ * generic_ci_validate_strict_name - Check if a given name is suitable
+ * for a directory
+ *
+ * This functions checks if the proposed filename is valid for the
+ * parent directory. That means that only valid UTF-8 filenames will be
+ * accepted for casefold directories from filesystems created with the
+ * strict encoding flag.  That also means that any name will be
+ * accepted for directories that doesn't have casefold enabled, or
+ * aren't being strict with the encoding.
+ *
+ * @dir: inode of the directory where the new file will be created
+ * @name: name of the new file
+ *
+ * Return:
+ * * True if the filename is suitable for this directory. It can be
+ * true if a given name is not suitable for a strict encoding
+ * directory, but the directory being used isn't strict
+ * * False if the filename isn't suitable for this directory. This only
+ * happens when a directory is casefolded and the filesystem is strict
+ * about its encoding.
+ */
+static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name)
+{
+	if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
+		return true;
+
+	/*
+	 * A casefold dir must have a encoding set, unless the filesystem
+	 * is corrupted
+	 */
+	if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
+		return true;
+
+	return utf8_validate(dir->i_sb->s_encoding, name);
+}
+#else
+static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name)
+{
+	return true;
+}
+#endif
+
 static inline bool sb_has_encoding(const struct super_block *sb)
 {
 #if IS_ENABLED(CONFIG_UNICODE)

-- 
2.47.0

Re: [PATCH v6 01/10] libfs: Create the helper function generic_ci_validate_strict_name()
Posted by Gabriel Krisman Bertazi 1 month, 1 week ago
André Almeida <andrealmeid@igalia.com> writes:

> +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name)
> +{
> +	if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
> +		return true;
> +
> +	/*
> +	 * A casefold dir must have a encoding set, unless the filesystem
> +	 * is corrupted
> +	 */
> +	if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
> +		return true;
> +
> +	return utf8_validate(dir->i_sb->s_encoding, name);

There is something fishy here.  Concerningly, the fstests test doesn't
catch it.

utf8_validate is defined as:

  int utf8_validate(const struct unicode_map *um, const struct qstr *str)

Which returns 0 on success and !0 on error. Thus, when casting to bool,
the return code should be negated.

But generic/556 doesn't fail. That's because we are over cautious, and
also check the string at the end of generic_ci_d_hash.  So we never
really reach utf8_validate in the tested case.

But if you comment the final if in generic_ci_d_hash, you'll see this
patchset regresses the fstests case generic/556 over ext4.

We really need the check in both places, though.  We don't want to rely
on the behavior of generic_ci_d_hash to block invalid filenames, as that
might change.

-- 
Gabriel Krisman Bertazi
Re: [PATCH v6 01/10] libfs: Create the helper function generic_ci_validate_strict_name()
Posted by André Almeida 1 month, 1 week ago
Em 15/10/2024 12:59, Gabriel Krisman Bertazi escreveu:
> André Almeida <andrealmeid@igalia.com> writes:
> 
>> +static inline bool generic_ci_validate_strict_name(struct inode *dir, struct qstr *name)
>> +{
>> +	if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
>> +		return true;
>> +
>> +	/*
>> +	 * A casefold dir must have a encoding set, unless the filesystem
>> +	 * is corrupted
>> +	 */
>> +	if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
>> +		return true;
>> +
>> +	return utf8_validate(dir->i_sb->s_encoding, name);
> 
> There is something fishy here.  Concerningly, the fstests test doesn't
> catch it.
> 
> utf8_validate is defined as:
> 
>    int utf8_validate(const struct unicode_map *um, const struct qstr *str)
> 
> Which returns 0 on success and !0 on error. Thus, when casting to bool,
> the return code should be negated.
> 
> But generic/556 doesn't fail. That's because we are over cautious, and
> also check the string at the end of generic_ci_d_hash.  So we never
> really reach utf8_validate in the tested case.
> 
> But if you comment the final if in generic_ci_d_hash, you'll see this
> patchset regresses the fstests case generic/556 over ext4.
> 
> We really need the check in both places, though.  We don't want to rely
> on the behavior of generic_ci_d_hash to block invalid filenames, as that
> might change.
> 

Thanks Krisman! Nice catch. I fixed this for the next version. Testing 
with the modified generic_ci_d_hash(), I also realized that the 
validation was in the wrong place and leaving an inode behind, this fixed:

diff --git a/mm/shmem.c b/mm/shmem.c
index eb1ea1f3b37c..7bd7ca5777af 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3624,13 +3624,13 @@ shmem_mknod(struct mnt_idmap *idmap, struct 
inode *dir,
         struct inode *inode;
         int error;

+       if (!generic_ci_validate_strict_name(dir, &dentry->d_name))
+               return -EINVAL;
+
         inode = shmem_get_inode(idmap, dir->i_sb, dir, mode, dev, 
VM_NORESERVE);
         if (IS_ERR(inode))
                 return PTR_ERR(inode);

-       if (!generic_ci_validate_strict_name(dir, &dentry->d_name))
-               return -EINVAL;
-