[v2] tmpfs: Add case-insensitive support for tmpfs

[PATCH v2 2/8] unicode: Create utf8_check_strict_name

Posted by André Almeida 1 year, 5 months ago

Create a helper function for filesystems to do the checks required for
casefold directories and strict encoding.

Suggested-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
Signed-off-by: André Almeida <andrealmeid@igalia.com>
---
 fs/unicode/utf8-core.c  | 26 ++++++++++++++++++++++++++
 include/linux/unicode.h |  2 ++
 2 files changed, 28 insertions(+)

diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
index 0400824ef493..4966e175ed71 100644
--- a/fs/unicode/utf8-core.c
+++ b/fs/unicode/utf8-core.c
@@ -214,3 +214,29 @@ void utf8_unload(struct unicode_map *um)
 }
 EXPORT_SYMBOL(utf8_unload);
 
+/**
+ * utf8_check_strict_name - Check if a given name is suitable for a directory
+ *
+ * This functions checks if the proposed filename is suitable for the parent
+ * directory. That means that only valid UTF-8 filenames will be accepted for
+ * casefold directories from filesystems created with the strict enconding flags.
+ * That also means that any name will be accepted for directories that doesn't
+ * have casefold enabled, or aren't being strict with the enconding.
+ *
+ * @inode: inode of the directory where the new file will be created
+ * @d_name: name of the new file
+ *
+ * Returns:
+ *  * True if the filename is suitable for this directory. It can be true if a
+ *  given name is not suitable for a strict enconding directory, but the
+ *  directory being used isn't strict
+ *  * False if the filename isn't suitable for this directory. This only happens
+ *  when a directory is casefolded and is strict about its encoding.
+ */
+bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name)
+{
+	return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
+	       sb_has_strict_encoding(dir->i_sb) &&
+	       utf8_validate(dir->i_sb->s_encoding, d_name));
+}
+EXPORT_SYMBOL(utf8_check_strict_name);
diff --git a/include/linux/unicode.h b/include/linux/unicode.h
index 4d39e6e11a95..fb56fb5e686c 100644
--- a/include/linux/unicode.h
+++ b/include/linux/unicode.h
@@ -76,4 +76,6 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
 struct unicode_map *utf8_load(unsigned int version);
 void utf8_unload(struct unicode_map *um);
 
+bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name);
+
 #endif /* _LINUX_UNICODE_H */
-- 
2.46.0

Re: [PATCH v2 2/8] unicode: Create utf8_check_strict_name

Posted by Gabriel Krisman Bertazi 1 year, 5 months ago

André Almeida <andrealmeid@igalia.com> writes:

> Create a helper function for filesystems to do the checks required for
> casefold directories and strict encoding.
>
> Suggested-by: Gabriel Krisman Bertazi <gabriel@krisman.be>
> Signed-off-by: André Almeida <andrealmeid@igalia.com>
> ---
>  fs/unicode/utf8-core.c  | 26 ++++++++++++++++++++++++++
>  include/linux/unicode.h |  2 ++
>  2 files changed, 28 insertions(+)
>
> diff --git a/fs/unicode/utf8-core.c b/fs/unicode/utf8-core.c
> index 0400824ef493..4966e175ed71 100644
> --- a/fs/unicode/utf8-core.c
> +++ b/fs/unicode/utf8-core.c

I don't think this belongs in fs/unicode. Whether a filesystem rejects
invalid UTF-8 names is a matter of filesystem semantics and, while
fs/unicode provides utf8_validate to verify that a string is valid, it
has no business looking into superblock and inode flags.

It would be better placed as a libfs helper.

> @@ -214,3 +214,29 @@ void utf8_unload(struct unicode_map *um)
>  }
>  EXPORT_SYMBOL(utf8_unload);
>  
> +/**
> + * utf8_check_strict_name - Check if a given name is suitable for a directory

To follow the namespace in libfs, we could call it

generic_ci_validate_strict_name

> + *
> + * This functions checks if the proposed filename is suitable for the parent

suitable => valid

> + * directory. That means that only valid UTF-8 filenames will be accepted for
> + * casefold directories from filesystems created with the strict enconding flags.

enconding flags => encoding flag

> + * That also means that any name will be accepted for directories that doesn't
> + * have casefold enabled, or aren't being strict with the enconding.

encoding

> + *
> + * @inode: inode of the directory where the new file will be created
> + * @d_name: name of the new file

d_name means 'dentry name'. Just 'name' is enough here, since it doesn't
matter whether the qstr comes from a dentry.

> + *
> + * Returns:
> + *  * True if the filename is suitable for this directory. It can be true if a
> + *  given name is not suitable for a strict enconding directory, but the
> + *  directory being used isn't strict
> + *  * False if the filename isn't suitable for this directory. This only happens
> + *  when a directory is casefolded and is strict about its encoding.
> + */
> +bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name)
> +{
> +	return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
> +	       sb_has_strict_encoding(dir->i_sb) &&
> +	       utf8_validate(dir->i_sb->s_encoding, d_name));
> +}

Now that it is a helper, it could be unfolded into something more
readable:

if (!IS_CASEFOLDED(dir) || !sb_has_strict_encoding(dir->i_sb))
   return true;

/* Should never happen.  Unless the filesystem is corrupt. */
if (WARN_ON_ONCE(!dir->i_sb->s_encoding))
   return true;

return !utf8_validate(...)
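
For readers who want to try the unfolded control flow outside the
kernel, here is a minimal userspace sketch. Everything in it is an
assumption made for illustration: the struct layouts are hypothetical
stand-ins for the kernel types, the boolean fields replace
IS_CASEFOLDED(), sb_has_strict_encoding() and the s_encoding pointer,
and mock_utf8_validate() only checks UTF-8 well-formedness (the real
utf8_validate() works against a specific Unicode version). The name
validate_strict_name() is also made up. The final negation follows the
original one-liner, where a nonzero result from utf8_validate() makes
the helper return false.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, not the real layouts. */
struct super_block {
	bool has_encoding;	/* stands in for sb->s_encoding != NULL */
	bool strict_encoding;	/* stands in for sb_has_strict_encoding(sb) */
};

struct inode {
	bool casefolded;	/* stands in for IS_CASEFOLDED(inode) */
	struct super_block *i_sb;
};

struct qstr {
	const unsigned char *name;
	size_t len;
};

/*
 * Simplified well-formedness check standing in for utf8_validate().
 * Returns 0 when the name is well-formed UTF-8 and -1 otherwise.
 */
static int mock_utf8_validate(const struct qstr *str)
{
	size_t i = 0;

	while (i < str->len) {
		unsigned char c = str->name[i];
		unsigned int cp;
		size_t need, k;

		if (c < 0x80) {
			i++;
			continue;
		} else if ((c & 0xE0) == 0xC0) {
			need = 1;
			cp = c & 0x1F;
		} else if ((c & 0xF0) == 0xE0) {
			need = 2;
			cp = c & 0x0F;
		} else if ((c & 0xF8) == 0xF0) {
			need = 3;
			cp = c & 0x07;
		} else {
			return -1;	/* stray continuation or invalid lead byte */
		}

		if (need > str->len - i - 1)
			return -1;	/* truncated sequence */

		for (k = 1; k <= need; k++) {
			unsigned char cc = str->name[i + k];

			if ((cc & 0xC0) != 0x80)
				return -1;
			cp = (cp << 6) | (cc & 0x3F);
		}

		/* Reject overlong forms, surrogates and out-of-range values. */
		if ((need == 1 && cp < 0x80) ||
		    (need == 2 && cp < 0x800) ||
		    (need == 3 && cp < 0x10000) ||
		    (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
			return -1;

		i += need + 1;
	}
	return 0;
}

/* Unfolded form of the check: early returns first, validation last. */
static bool validate_strict_name(const struct inode *dir,
				 const struct qstr *name)
{
	/* Only casefolded directories on strict filesystems restrict names. */
	if (!dir->casefolded || !dir->i_sb->strict_encoding)
		return true;

	/* The kernel version would WARN_ON_ONCE() here (corrupt filesystem). */
	if (!dir->i_sb->has_encoding)
		return true;

	return !mock_utf8_validate(name);
}
```

With these stand-ins, a byte string such as "\xff" is rejected only for
a directory that is both casefolded and on a strict superblock; the same
name is accepted everywhere else, which matches the Returns: section of
the kernel-doc comment.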

> +EXPORT_SYMBOL(utf8_check_strict_name);
> diff --git a/include/linux/unicode.h b/include/linux/unicode.h
> index 4d39e6e11a95..fb56fb5e686c 100644
> --- a/include/linux/unicode.h
> +++ b/include/linux/unicode.h
> @@ -76,4 +76,6 @@ int utf8_casefold_hash(const struct unicode_map *um, const void *salt,
>  struct unicode_map *utf8_load(unsigned int version);
>  void utf8_unload(struct unicode_map *um);
>  
> +bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name);
> +
>  #endif /* _LINUX_UNICODE_H */

-- 
Gabriel Krisman Bertazi

Re: [PATCH v2 2/8] unicode: Create utf8_check_strict_name

Posted by Theodore Ts'o 1 year, 5 months ago

I'd suggest using the one-line summary:

unicode: create the helper function utf8_check_strict_name()

so that it's a bit more descriptive.

On Mon, Sep 02, 2024 at 07:55:04PM -0300, André Almeida wrote:
> +/**
> + * utf8_check_strict_name - Check if a given name is suitable for a directory
> + *
> + * This functions checks if the proposed filename is suitable for the parent
> + * directory. That means that only valid UTF-8 filenames will be accepted for
> + * casefold directories from filesystems created with the strict enconding flags.
> + * That also means that any name will be accepted for directories that doesn't
> + * have casefold enabled, or aren't being strict with the enconding.

I also suggest wrapping with a fill column of 72 characters, instead
of 80.

						- Ted

Re: [PATCH v2 2/8] unicode: Create utf8_check_strict_name

Posted by kernel test robot 1 year, 5 months ago

Hi André,

kernel test robot noticed the following build errors:

[auto build test ERROR on akpm-mm/mm-everything]
[also build test ERROR on tytso-ext4/dev brauner-vfs/vfs.all linus/master v6.11-rc6 next-20240903]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]

url:    https://github.com/intel-lab-lkp/linux/commits/Andr-Almeida/unicode-Fix-utf8_load-error-path/20240903-070149
base:   https://git.kernel.org/pub/scm/linux/kernel/git/akpm/mm.git mm-everything
patch link:    https://lore.kernel.org/r/20240902225511.757831-3-andrealmeid%40igalia.com
patch subject: [PATCH v2 2/8] unicode: Create utf8_check_strict_name
config: powerpc64-randconfig-r073-20240903 (https://download.01.org/0day-ci/archive/20240903/202409031655.gO1eC1AL-lkp@intel.com/config)
compiler: clang version 20.0.0git (https://github.com/llvm/llvm-project dc19b59ea2502193c0e7bc16bb7d711c8053edcf)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20240903/202409031655.gO1eC1AL-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202409031655.gO1eC1AL-lkp@intel.com/

All errors (new ones prefixed by >>):

>> fs/unicode/utf8-core.c:238:11: error: call to undeclared function 'IS_CASEFOLDED'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     238 |         return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
         |                  ^
>> fs/unicode/utf8-core.c:238:36: error: incomplete definition of type 'struct inode'
     238 |         return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
         |                                        ~~~^
   include/linux/uprobes.h:21:8: note: forward declaration of 'struct inode'
      21 | struct inode;
         |        ^
>> fs/unicode/utf8-core.c:239:9: error: call to undeclared function 'sb_has_strict_encoding'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
     239 |                sb_has_strict_encoding(dir->i_sb) &&
         |                ^
   fs/unicode/utf8-core.c:239:35: error: incomplete definition of type 'struct inode'
     239 |                sb_has_strict_encoding(dir->i_sb) &&
         |                                       ~~~^
   include/linux/uprobes.h:21:8: note: forward declaration of 'struct inode'
      21 | struct inode;
         |        ^
   fs/unicode/utf8-core.c:240:26: error: incomplete definition of type 'struct inode'
     240 |                utf8_validate(dir->i_sb->s_encoding, d_name));
         |                              ~~~^
   include/linux/uprobes.h:21:8: note: forward declaration of 'struct inode'
      21 | struct inode;
         |        ^
   5 errors generated.
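
The errors above are consistent with fs/unicode/utf8-core.c seeing only
a forward declaration of struct inode (the notes point at the one in
include/linux/uprobes.h) and no declaration of IS_CASEFOLDED() or
sb_has_strict_encoding(). Those are normally provided by <linux/fs.h>,
so a missing include is a plausible cause, though this is an inference
from the log rather than something the report states. The standalone
sketch below shows the underlying C rule with a hypothetical one-field
struct: a forward declaration is enough to pass pointers around, but
member access needs the complete definition.

```c
#include <stdbool.h>

/* Forward declaration only: the type is incomplete from here on. */
struct inode;

/* Fine: passing a pointer along needs no knowledge of the layout. */
static const struct inode *pass_through(const struct inode *dir)
{
	return dir;
}

/*
 * Placing this function here, before the full definition, would fail
 * with an "incomplete definition of type" style error:
 *
 *	static bool broken(const struct inode *dir)
 *	{
 *		return dir->casefolded;
 *	}
 */

/* Hypothetical full definition, standing in for what a header provides. */
struct inode {
	bool casefolded;
};

/* Member access compiles now that the type is complete. */
static bool is_casefolded(const struct inode *dir)
{
	return dir->casefolded;
}
```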


vim +/IS_CASEFOLDED +238 fs/unicode/utf8-core.c

   216	
   217	/**
   218	 * utf8_check_strict_name - Check if a given name is suitable for a directory
   219	 *
   220	 * This functions checks if the proposed filename is suitable for the parent
   221	 * directory. That means that only valid UTF-8 filenames will be accepted for
   222	 * casefold directories from filesystems created with the strict enconding flags.
   223	 * That also means that any name will be accepted for directories that doesn't
   224	 * have casefold enabled, or aren't being strict with the enconding.
   225	 *
   226	 * @inode: inode of the directory where the new file will be created
   227	 * @d_name: name of the new file
   228	 *
   229	 * Returns:
   230	 *  * True if the filename is suitable for this directory. It can be true if a
   231	 *  given name is not suitable for a strict enconding directory, but the
   232	 *  directory being used isn't strict
   233	 *  * False if the filename isn't suitable for this directory. This only happens
   234	 *  when a directory is casefolded and is strict about its encoding.
   235	 */
   236	bool utf8_check_strict_name(struct inode *dir, struct qstr *d_name)
   237	{
 > 238		return !(IS_CASEFOLDED(dir) && dir->i_sb->s_encoding &&
 > 239		       sb_has_strict_encoding(dir->i_sb) &&

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki