[PATCH] ext4: Refactor breaking condition for xattr_find_entry()

I Hsin Cheng posted 1 patch 3 months ago
fs/ext4/xattr.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
[PATCH] ext4: Refactor breaking condition for xattr_find_entry()
Posted by I Hsin Cheng 3 months ago
Refactor the condition for breaking the loop within xattr_find_entry().
Eliminate the usage of "<=" and short-circuit the condition when "!cmp"
is true.

Originally, the condition was "(cmp <= 0 && (sorted || cmp == 0))", which
means that once "cmp <= 0" is known to be true, it may still have to
check both "sorted" and "cmp == 0". Testing "cmp" a second time is
redundant, since it has already been compared against zero.

Looking at the logic, the branch is always taken when "cmp == 0", so
there is no need to test "cmp == 0" twice. It is enough to short-circuit
on "cmp == 0" first and to check "sorted" only when "cmp < 0".

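To show what the break condition does in context, here is a simplified
user-space model of the search loop. It is only a sketch under stated
assumptions: struct model_entry, find_entry(), demo_find_entry() and the
sample table are hypothetical stand-ins, and only the length and memcmp()
comparisons visible in the diff context are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct ext4_xattr_entry. */
struct model_entry {
	const char *name;
	size_t name_len;
};

/*
 * Returns the index at which the loop stopped (n if it ran off the end)
 * and sets *found when the entry at that index matches exactly.
 */
size_t find_entry(const struct model_entry *tab, size_t n,
		  const char *name, bool sorted, bool *found)
{
	size_t name_len = strlen(name);
	size_t i;
	int cmp = 1;

	for (i = 0; i < n; i++) {
		cmp = (int)name_len - (int)tab[i].name_len;
		if (!cmp)
			cmp = memcmp(name, tab[i].name, name_len);
		/* Break condition as proposed by this patch. */
		if (!cmp || (cmp < 0 && sorted))
			break;
	}
	*found = (cmp == 0);
	return i;
}

/* Small demonstration on a table ordered by length, then by bytes. */
void demo_find_entry(void)
{
	static const struct model_entry tab[] = {
		{ "a", 1 }, { "c", 1 }, { "bb", 2 },
	};
	bool found;

	/* Exact match: stops at index 1 whether or not the table is sorted. */
	assert(find_entry(tab, 3, "c", true, &found) == 1 && found);
	assert(find_entry(tab, 3, "c", false, &found) == 1 && found);

	/* Miss, sorted: stops at "c", the first entry that compares greater. */
	assert(find_entry(tab, 3, "b", true, &found) == 1 && !found);

	/* Miss, unsorted: every entry is visited. */
	assert(find_entry(tab, 3, "b", false, &found) == 3 && !found);
}
```

Calling demo_find_entry() from a small main() exercises the three cases:
an exact match, an early stop in a sorted table, and a full scan in an
unsorted one.
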
The refactor shrinks the generated code of xattr_find_entry() by 44
bytes. With fewer instructions generated, it should also benefit
execution efficiency.

$ ./scripts/bloat-o-meter vmlinux_old vmlinux_new
add/remove: 0/0 grow/shrink: 0/1 up/down: 0/-44 (-44)
Function                                     old     new   delta
xattr_find_entry                             300     256     -44
Total: Before=22989434, After=22989390, chg -0.00%

The test is done on kernel version 6.16 with x86_64 defconfig
and gcc 13.3.0.

Signed-off-by: I Hsin Cheng <richard120310@gmail.com>
---
 fs/ext4/xattr.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 8d15acbacc20..1993622e3c74 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -338,7 +338,7 @@ xattr_find_entry(struct inode *inode, struct ext4_xattr_entry **pentry,
 			cmp = name_len - entry->e_name_len;
 		if (!cmp)
 			cmp = memcmp(name, entry->e_name, name_len);
-		if (cmp <= 0 && (sorted || cmp == 0))
+		if (!cmp || (cmp < 0 && sorted))
 			break;
 	}
 	*pentry = entry;
-- 
2.43.0
Re: [PATCH] ext4: Refactor breaking condition for xattr_find_entry()
Posted by Theodore Ts'o 2 months, 2 weeks ago
On Tue, 08 Jul 2025 10:00:13 +0800, I Hsin Cheng wrote:
> Refactor the condition for breaking the loop within xattr_find_entry().
> Elimate the usage of "<=" and take condition shortcut when "!cmp" is
> true.
> 
> Originally, the condition was "(cmp <= 0 && (sorted || cmp == 0))", which
> means after it knows "cmp <= 0" is true, it has to check the value of
> "sorted" and "cmp". The checking of "cmp" here would be redundant since
> it has already checked it.
> 
> [...]

Applied, thanks!

[1/1] ext4: Refactor breaking condition for xattr_find_entry()
      commit: 9d9076238fe9fe45257f298bf51b35aa796cf0f1

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>
Re: [PATCH] ext4: Refactor breaking condition for xattr_find_entry()
Posted by Theodore Ts'o 3 months ago
On Tue, Jul 08, 2025 at 10:00:13AM +0800, I Hsin Cheng wrote:
> diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> index 8d15acbacc20..1993622e3c74 100644
> --- a/fs/ext4/xattr.c
> +++ b/fs/ext4/xattr.c
> @@ -338,7 +338,7 @@ xattr_find_entry(struct inode *inode, struct ext4_xattr_entry **pentry,
>  			cmp = name_len - entry->e_name_len;
>  		if (!cmp)
>  			cmp = memcmp(name, entry->e_name, name_len);
> -		if (cmp <= 0 && (sorted || cmp == 0))
> +		if (!cmp || (cmp < 0 && sorted))

This is *not* identical.  Suppose memcmp returns a positive value
(say, 1).  Previously, the conditional would be false.  With your
change, !cmp would be true, so the overall conditional would be true.

So this does not appear to be a valid transformation.

(Note that valid transformations will be done by the compiler
automatically, without needing to make code changes.)

   	     	 	      - Ted
Re: [PATCH] ext4: Refactor breaking condition for xattr_find_entry()
Posted by I Hsin Cheng 3 months ago
On Mon, Jul 07, 2025 at 10:24:53PM -0400, Theodore Ts'o wrote:
> On Tue, Jul 08, 2025 at 10:00:13AM +0800, I Hsin Cheng wrote:
> > diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
> > index 8d15acbacc20..1993622e3c74 100644
> > --- a/fs/ext4/xattr.c
> > +++ b/fs/ext4/xattr.c
> > @@ -338,7 +338,7 @@ xattr_find_entry(struct inode *inode, struct ext4_xattr_entry **pentry,
> >  			cmp = name_len - entry->e_name_len;
> >  		if (!cmp)
> >  			cmp = memcmp(name, entry->e_name, name_len);
> > -		if (cmp <= 0 && (sorted || cmp == 0))
> > +		if (!cmp || (cmp < 0 && sorted))
> 
> This is *not* identical.  Suppose memcmp returns a positive value
> (say, 1).  Previously, the conditional would be false.  With your
> change, !cmp would be true, so the overall conditional would be true.
> 
> So this does not appear to be a valid transformation.
> 
> (Note that valid transformations will be done by the compiler
> automatically, without needing to make code changes.)
> 
>    	     	 	      - Ted


Hi Ted,

> This is *not* identical.  Suppose memcmp returns a positive value
> (say, 1).  Previously, the conditional would be false.  With your
> change, !cmp would be true, so the overall conditional would be true.

I would argue that "!cmp" is only true when "cmp" is zero; otherwise
it is false, whether the value is positive or negative.

Applying the distributive law and then simplifying, the following
expressions are equivalent:
* "cmp <= 0 && (sorted || cmp == 0)"
* "(cmp <= 0 && sorted) || (cmp <= 0 && cmp == 0)"
* "(cmp <= 0 && sorted) || (cmp == 0)"
* "(cmp == 0) || (cmp <= 0 && sorted)"

Because the condition short-circuits when "cmp == 0" (which is "!cmp"),
we can further simplify "(cmp <= 0 && sorted)" to "(cmp < 0 && sorted)",
since "cmp" cannot be 0 by the time this part is evaluated.

For any non-zero value of "cmp", "!cmp" is false, so the condition goes
on to check "(cmp < 0 && sorted)".

This is my derivation; let me know if there's anything wrong with it.
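
The equivalence can also be checked mechanically. The following is a
minimal user-space sketch, not kernel code: cond_old(), cond_distributed(),
cond_new() and check_break_conditions() are hypothetical helper names, and
the sample values are chosen only to cover the sign boundaries of "cmp".

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Break condition before the patch. */
static bool cond_old(int cmp, bool sorted)
{
	return cmp <= 0 && (sorted || cmp == 0);
}

/* Intermediate form: "cmp <= 0" distributed over the "||". */
static bool cond_distributed(int cmp, bool sorted)
{
	return (cmp <= 0 && sorted) || (cmp <= 0 && cmp == 0);
}

/* Break condition after the patch. */
static bool cond_new(int cmp, bool sorted)
{
	return !cmp || (cmp < 0 && sorted);
}

/* Asserts that all three forms agree on boundary values of cmp. */
void check_break_conditions(void)
{
	static const int samples[] = { INT_MIN, -2, -1, 0, 1, 2, INT_MAX };
	size_t i;
	int s;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		for (s = 0; s <= 1; s++) {
			bool sorted = (s != 0);
			bool expected = cond_old(samples[i], sorted);

			assert(cond_distributed(samples[i], sorted) == expected);
			assert(cond_new(samples[i], sorted) == expected);
		}
	}
}
```

In particular, for cmp = 1 both cond_old() and cond_new() evaluate to
false regardless of "sorted", because "!cmp" is false for any non-zero
value.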

> (Note that valid transformations will be done by the compiler
> automatically, without needing to make code changes.)

Makes sense, thanks for the heads-up, but I think we do still get some
benefit from it when compiling at the -O2 optimization level?

As bloat-o-meter indicates, the size of the generated code does actually
shrink.

Best regards,
I Hsin Cheng