[PATCH v3] erofs: fix file handle encoding for 64-bit NIDs

Hongbo Li posted 1 patch 9 months, 2 weeks ago
There is a newer version of this series
fs/erofs/super.c | 54 +++++++++++++++++++++++++++++++++++++++++-------
1 file changed, 46 insertions(+), 8 deletions(-)
[PATCH v3] erofs: fix file handle encoding for 64-bit NIDs
Posted by Hongbo Li 9 months, 2 weeks ago
In erofs, the inode number has the location information of
files. The default encode_fh uses the ino32, this will lack
of some information when the file is too big. So we need
the internal helpers to encode filehandle.

It is easy to reproduce test:
  1. prepare an erofs image with nid bigger than UINT_MAX
  2. mount -t erofs foo.img /mnt/erofs
  3. set exportfs with configuration: /mnt/erofs *(rw,sync,
     no_root_squash)
  4. mount -t nfs $IP:/mnt/erofs /mnt/nfs
  5. md5sum /mnt/nfs/foo # foo is the file which nid bigger
     than UINT_MAX.
For overlayfs case, the under filesystem's file handle is
encoded in ovl_fb.fid, it is same as NFS's case.

Fixes: 3e917cc305c6 ("erofs: make filesystem exportable")
Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
---
v2: https://lore.kernel.org/all/20250429074109.689075-1-lihongbo22@huawei.com/
  - Assign parent nid with correct value.

v1: https://lore.kernel.org/all/20250429011139.686847-1-lihongbo22@huawei.com/
   - Encode generation into file handle and minor clean code.
   - Update the commiti's title.
---
 fs/erofs/super.c | 54 +++++++++++++++++++++++++++++++++++++++++-------
 1 file changed, 46 insertions(+), 8 deletions(-)

diff --git a/fs/erofs/super.c b/fs/erofs/super.c
index cadec6b1b554..28b3701165cc 100644
--- a/fs/erofs/super.c
+++ b/fs/erofs/super.c
@@ -511,24 +511,62 @@ static int erofs_fc_parse_param(struct fs_context *fc,
 	return 0;
 }
 
-static struct inode *erofs_nfs_get_inode(struct super_block *sb,
-					 u64 ino, u32 generation)
+static int erofs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
+			   struct inode *parent)
 {
-	return erofs_iget(sb, ino);
+	int len = parent ? 6 : 3;
+	erofs_nid_t nid;
+	u32 generation;
+
+	if (*max_len < len) {
+		*max_len = len;
+		return FILEID_INVALID;
+	}
+
+	nid = EROFS_I(inode)->nid;
+	generation = inode->i_generation;
+	fh[0] = (u32)(nid >> 32);
+	fh[1] = (u32)(nid & 0xffffffff);
+	fh[2] = generation;
+
+	if (parent) {
+		nid = EROFS_I(parent)->nid;
+		generation = parent->i_generation;
+
+		fh[3] = (u32)(nid >> 32);
+		fh[4] = (u32)(nid & 0xffffffff);
+		fh[5] = generation;
+	}
+
+	*max_len = len;
+	return parent ? FILEID_INO64_GEN_PARENT : FILEID_INO64_GEN;
 }
 
 static struct dentry *erofs_fh_to_dentry(struct super_block *sb,
 		struct fid *fid, int fh_len, int fh_type)
 {
-	return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
-				    erofs_nfs_get_inode);
+	erofs_nid_t nid;
+
+	if ((fh_type != FILEID_INO64_GEN &&
+	     fh_type != FILEID_INO64_GEN_PARENT) || fh_len < 3)
+		return NULL;
+
+	nid = (u64) fid->raw[0] << 32;
+	nid |= (u64) fid->raw[1];
+	return d_obtain_alias(erofs_iget(sb, nid));
 }
 
 static struct dentry *erofs_fh_to_parent(struct super_block *sb,
 		struct fid *fid, int fh_len, int fh_type)
 {
-	return generic_fh_to_parent(sb, fid, fh_len, fh_type,
-				    erofs_nfs_get_inode);
+	erofs_nid_t nid;
+
+	if (fh_type != FILEID_INO64_GEN_PARENT || fh_len < 6)
+		return NULL;
+
+	nid = (u64) fid->raw[3] << 32;
+	nid |= (u64) fid->raw[4];
+	return d_obtain_alias(erofs_iget(sb, nid));
 }
 
 static struct dentry *erofs_get_parent(struct dentry *child)
@@ -544,7 +582,7 @@ static struct dentry *erofs_get_parent(struct dentry *child)
 }
 
 static const struct export_operations erofs_export_ops = {
-	.encode_fh = generic_encode_ino32_fh,
+	.encode_fh = erofs_encode_fh,
 	.fh_to_dentry = erofs_fh_to_dentry,
 	.fh_to_parent = erofs_fh_to_parent,
 	.get_parent = erofs_get_parent,
-- 
2.22.0
Re: [PATCH v3] erofs: fix file handle encoding for 64-bit NIDs
Posted by Gao Xiang 9 months, 1 week ago
Hi Hongbo,

On 2025/4/29 21:42, Hongbo Li wrote:
> In erofs, the inode number has the location information of
> files. The default encode_fh uses the ino32, this will lack
> of some information when the file is too big. So we need
> the internal helpers to encode filehandle.

EROFS uses NID to indicate the on-disk inode offset, which can
exceed 32 bits. However, the default encode_fh uses the ino32,
thus it doesn't work if the image is large than 128GiB.

Let's introduce our own helpers to encode file handles.

> 
> It is easy to reproduce test:

It's easy to reproduce:

>    1. prepare an erofs image with nid bigger than UINT_MAX

      1. Prepare an EROFS image with NIDs larger than U32_MAX

>    2. mount -t erofs foo.img /mnt/erofs
>    3. set exportfs with configuration: /mnt/erofs *(rw,sync,
>       no_root_squash)
>    4. mount -t nfs $IP:/mnt/erofs /mnt/nfs
>    5. md5sum /mnt/nfs/foo # foo is the file which nid bigger
>       than UINT_MAX.
> For overlayfs case, the under filesystem's file handle is
> encoded in ovl_fb.fid, it is same as NFS's case.
> 
> Fixes: 3e917cc305c6 ("erofs: make filesystem exportable")
> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
> ---
> v2: https://lore.kernel.org/all/20250429074109.689075-1-lihongbo22@huawei.com/
>    - Assign parent nid with correct value.
> 
> v1: https://lore.kernel.org/all/20250429011139.686847-1-lihongbo22@huawei.com/
>     - Encode generation into file handle and minor clean code.
>     - Update the commiti's title.
> ---
>   fs/erofs/super.c | 54 +++++++++++++++++++++++++++++++++++++++++-------
>   1 file changed, 46 insertions(+), 8 deletions(-)
> 
> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
> index cadec6b1b554..28b3701165cc 100644
> --- a/fs/erofs/super.c
> +++ b/fs/erofs/super.c
> @@ -511,24 +511,62 @@ static int erofs_fc_parse_param(struct fs_context *fc,
>   	return 0;
>   }
>   
> -static struct inode *erofs_nfs_get_inode(struct super_block *sb,
> -					 u64 ino, u32 generation)
> +static int erofs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
> +			   struct inode *parent)
>   {
> -	return erofs_iget(sb, ino);
> +	int len = parent ? 6 : 3;
> +	erofs_nid_t nid;

Just `erofs_nid_t nid = EROFS_I(inode)->nid;`?

I think the compiler will optimize out `if (*max_len < len)`.

> +	u32 generation;

It seems it's unnecessary to introduce `generation` variable here?

> +
> +	if (*max_len < len) {
> +		*max_len = len;
> +		return FILEID_INVALID;
> +	}
> +
> +	nid = EROFS_I(inode)->nid;
> +	generation = inode->i_generation;

So drop these two lines.

> +	fh[0] = (u32)(nid >> 32);
> +	fh[1] = (u32)(nid & 0xffffffff);
> +	fh[2] = generation;
> +
> +	if (parent) {
> +		nid = EROFS_I(parent)->nid;
> +		generation = parent->i_generation;
> +
> +		fh[3] = (u32)(nid >> 32);
> +		fh[4] = (u32)(nid & 0xffffffff);
> +		fh[5] = generation;

		fh[5] = parent->i_generation;

> +	}
> +
> +	*max_len = len;
> +	return parent ? FILEID_INO64_GEN_PARENT : FILEID_INO64_GEN;
>   }
>   
>   static struct dentry *erofs_fh_to_dentry(struct super_block *sb,
>   		struct fid *fid, int fh_len, int fh_type)
>   {
> -	return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
> -				    erofs_nfs_get_inode);
> +	erofs_nid_t nid;
> +
> +	if ((fh_type != FILEID_INO64_GEN &&
> +	     fh_type != FILEID_INO64_GEN_PARENT) || fh_len < 3)
> +		return NULL;
> +
> +	nid = (u64) fid->raw[0] << 32;
> +	nid |= (u64) fid->raw[1];

Unnecessary nid variable.

> +	return d_obtain_alias(erofs_iget(sb, nid));

	return d_obtain_alias(erofs_iget(sb,
			((u64)fid->raw[0] << 32) | fid->raw[1]));

>   }
>   
>   static struct dentry *erofs_fh_to_parent(struct super_block *sb,
>   		struct fid *fid, int fh_len, int fh_type)
>   {
> -	return generic_fh_to_parent(sb, fid, fh_len, fh_type,
> -				    erofs_nfs_get_inode);
> +	erofs_nid_t nid;
> +
> +	if (fh_type != FILEID_INO64_GEN_PARENT || fh_len < 6)
> +		return NULL;
> +
> +	nid = (u64) fid->raw[3] << 32;
> +	nid |= (u64) fid->raw[4];

Same here.

Thanks,
Gao Xiang
Re: [PATCH v3] erofs: fix file handle encoding for 64-bit NIDs
Posted by Hongbo Li 9 months, 1 week ago

On 2025/5/6 23:10, Gao Xiang wrote:
> Hi Hongbo,
> 
> On 2025/4/29 21:42, Hongbo Li wrote:
>> In erofs, the inode number has the location information of
>> files. The default encode_fh uses the ino32, this will lack
>> of some information when the file is too big. So we need
>> the internal helpers to encode filehandle.
> 
> EROFS uses NID to indicate the on-disk inode offset, which can
> exceed 32 bits. However, the default encode_fh uses the ino32,
> thus it doesn't work if the image is large than 128GiB.
> 
Thanks for helping me correct my description.

Here, an image larger than 128GiB won't trigger NID reversal. It 
requires a 128GiB file inside, and the NID of the second file may exceed 
U32 during formatting. So here can we change it to "However, the default 
encode_fh uses the ino32, thus it may not work if there exist a file 
which is large than 128GiB." ?

Thanks,
Hongbo

> Let's introduce our own helpers to encode file handles.
> 
>>
>> It is easy to reproduce test:
> 
> It's easy to reproduce:
> 
>>    1. prepare an erofs image with nid bigger than UINT_MAX
> 
>       1. Prepare an EROFS image with NIDs larger than U32_MAX
> 
>>    2. mount -t erofs foo.img /mnt/erofs
>>    3. set exportfs with configuration: /mnt/erofs *(rw,sync,
>>       no_root_squash)
>>    4. mount -t nfs $IP:/mnt/erofs /mnt/nfs
>>    5. md5sum /mnt/nfs/foo # foo is the file which nid bigger
>>       than UINT_MAX.
>> For overlayfs case, the under filesystem's file handle is
>> encoded in ovl_fb.fid, it is same as NFS's case.
>>
>> Fixes: 3e917cc305c6 ("erofs: make filesystem exportable")
>> Signed-off-by: Hongbo Li <lihongbo22@huawei.com>
>> ---
>> v2: 
>> https://lore.kernel.org/all/20250429074109.689075-1-lihongbo22@huawei.com/
>>    - Assign parent nid with correct value.
>>
>> v1: 
>> https://lore.kernel.org/all/20250429011139.686847-1-lihongbo22@huawei.com/
>>     - Encode generation into file handle and minor clean code.
>>     - Update the commiti's title.
>> ---
>>   fs/erofs/super.c | 54 +++++++++++++++++++++++++++++++++++++++++-------
>>   1 file changed, 46 insertions(+), 8 deletions(-)
>>
>> diff --git a/fs/erofs/super.c b/fs/erofs/super.c
>> index cadec6b1b554..28b3701165cc 100644
>> --- a/fs/erofs/super.c
>> +++ b/fs/erofs/super.c
>> @@ -511,24 +511,62 @@ static int erofs_fc_parse_param(struct 
>> fs_context *fc,
>>       return 0;
>>   }
>> -static struct inode *erofs_nfs_get_inode(struct super_block *sb,
>> -                     u64 ino, u32 generation)
>> +static int erofs_encode_fh(struct inode *inode, u32 *fh, int *max_len,
>> +               struct inode *parent)
>>   {
>> -    return erofs_iget(sb, ino);
>> +    int len = parent ? 6 : 3;
>> +    erofs_nid_t nid;
> 
> Just `erofs_nid_t nid = EROFS_I(inode)->nid;`?
> 
> I think the compiler will optimize out `if (*max_len < len)`.
> 
>> +    u32 generation;
> 
> It seems it's unnecessary to introduce `generation` variable here?
> 
>> +
>> +    if (*max_len < len) {
>> +        *max_len = len;
>> +        return FILEID_INVALID;
>> +    }
>> +
>> +    nid = EROFS_I(inode)->nid;
>> +    generation = inode->i_generation;
> 
> So drop these two lines.
> 
>> +    fh[0] = (u32)(nid >> 32);
>> +    fh[1] = (u32)(nid & 0xffffffff);
>> +    fh[2] = generation;
>> +
>> +    if (parent) {
>> +        nid = EROFS_I(parent)->nid;
>> +        generation = parent->i_generation;
>> +
>> +        fh[3] = (u32)(nid >> 32);
>> +        fh[4] = (u32)(nid & 0xffffffff);
>> +        fh[5] = generation;
> 
>          fh[5] = parent->i_generation;
> 
>> +    }
>> +
>> +    *max_len = len;
>> +    return parent ? FILEID_INO64_GEN_PARENT : FILEID_INO64_GEN;
>>   }
>>   static struct dentry *erofs_fh_to_dentry(struct super_block *sb,
>>           struct fid *fid, int fh_len, int fh_type)
>>   {
>> -    return generic_fh_to_dentry(sb, fid, fh_len, fh_type,
>> -                    erofs_nfs_get_inode);
>> +    erofs_nid_t nid;
>> +
>> +    if ((fh_type != FILEID_INO64_GEN &&
>> +         fh_type != FILEID_INO64_GEN_PARENT) || fh_len < 3)
>> +        return NULL;
>> +
>> +    nid = (u64) fid->raw[0] << 32;
>> +    nid |= (u64) fid->raw[1];
> 
> Unnecessary nid variable.
> 
>> +    return d_obtain_alias(erofs_iget(sb, nid));
> 
>      return d_obtain_alias(erofs_iget(sb,
>              ((u64)fid->raw[0] << 32) | fid->raw[1]));
> 
>>   }
>>   static struct dentry *erofs_fh_to_parent(struct super_block *sb,
>>           struct fid *fid, int fh_len, int fh_type)
>>   {
>> -    return generic_fh_to_parent(sb, fid, fh_len, fh_type,
>> -                    erofs_nfs_get_inode);
>> +    erofs_nid_t nid;
>> +
>> +    if (fh_type != FILEID_INO64_GEN_PARENT || fh_len < 6)
>> +        return NULL;
>> +
>> +    nid = (u64) fid->raw[3] << 32;
>> +    nid |= (u64) fid->raw[4];
> 
> Same here.
> 
> Thanks,
> Gao Xiang
Re: [PATCH v3] erofs: fix file handle encoding for 64-bit NIDs
Posted by Gao Xiang 9 months, 1 week ago

On 2025/5/7 09:53, Hongbo Li wrote:
> 
> 
> On 2025/5/6 23:10, Gao Xiang wrote:
>> Hi Hongbo,
>>
>> On 2025/4/29 21:42, Hongbo Li wrote:
>>> In erofs, the inode number has the location information of
>>> files. The default encode_fh uses the ino32, this will lack
>>> of some information when the file is too big. So we need
>>> the internal helpers to encode filehandle.
>>
>> EROFS uses NID to indicate the on-disk inode offset, which can
>> exceed 32 bits. However, the default encode_fh uses the ino32,
>> thus it doesn't work if the image is large than 128GiB.
>>
> Thanks for helping me correct my description.
> 
> Here, an image larger than 128GiB won't trigger NID reversal. It requires a 128GiB file inside, and the NID of the second file may exceed U32 during formatting. So here can we change it to "However, the default encode_fh uses the ino32, thus it may not work if there exist a file which is large than 128GiB." ?

Why? Currently EROFS doesn't arrange inode metadata
together, but close to its data (or its directory)
if possible for data locality.

So NIDs can exceed 32-bit for images larger than
128 GiB.

Thanks,
Gao Xiang

> 
> Thanks,
> Hongbo
>
Re: [PATCH v3] erofs: fix file handle encoding for 64-bit NIDs
Posted by Hongbo Li 9 months, 1 week ago

On 2025/5/7 10:42, Gao Xiang wrote:
> 
> 
> On 2025/5/7 09:53, Hongbo Li wrote:
>>
>>
>> On 2025/5/6 23:10, Gao Xiang wrote:
>>> Hi Hongbo,
>>>
>>> On 2025/4/29 21:42, Hongbo Li wrote:
>>>> In erofs, the inode number has the location information of
>>>> files. The default encode_fh uses the ino32, this will lack
>>>> of some information when the file is too big. So we need
>>>> the internal helpers to encode filehandle.
>>>
>>> EROFS uses NID to indicate the on-disk inode offset, which can
>>> exceed 32 bits. However, the default encode_fh uses the ino32,
>>> thus it doesn't work if the image is large than 128GiB.
>>>
>> Thanks for helping me correct my description.
>>
>> Here, an image larger than 128GiB won't trigger NID reversal. It 
>> requires a 128GiB file inside, and the NID of the second file may 
>> exceed U32 during formatting. So here can we change it to "However, 
>> the default encode_fh uses the ino32, thus it may not work if there 
>> exist a file which is large than 128GiB." ?
> 
> Why? Currently EROFS doesn't arrange inode metadata
> together, but close to its data (or its directory)
> if possible for data locality.
> 
> So NIDs can exceed 32-bit for images larger than
> 128 GiB.
> 

Ok, I see your point, and you are right. It doesn't have to be a 128GiB 
file, but it is easy to construct this kind of EROFS image by large 
file. Such as:

mkfs.erofs -d7 --tar=f --clean=data foo.erofs 128g-file.tar  # the nid 
of 128g-file is 39.
mkfs.erofs -d7 --tar=f --incremental=data 1b-file.tar  # the nid of 
1b-file is 4294967425.

Thank you again for your review, I will send the next version of the 
patch later.

Thanks,
Hongbo

> Thanks,
> Gao Xiang
> 
>>
>> Thanks,
>> Hongbo
>>
>