include/linux/device_cgroup.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)
commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the
common case in devcgroup_inode_permission()") reordered the checks in
devcgroup_inode_permission() to check the inode mode before checking
i_rdev, for better cache behavior.
However, the likely() annotation on the i_rdev check was not updated
to reflect the new code flow. Originally, when i_rdev was checked
first, likely(!inode->i_rdev) made sense because most inodes were(?)
regular files/directories, thus i_rdev == 0.
After the reorder, by the time we reach the i_rdev check, we have
already confirmed the inode IS a block or character device. Block and
character special files are precisely defined by having a device number
(i_rdev), so !inode->i_rdev is now the rare edge case, not the common
case.
Branch profiling confirmed this is 100% mispredicted:
correct incorrect % Function File Line
------- --------- - -------- ---- ----
0 2631904 100 devcgroup_inode_permission device_cgroup.h 24
Remove likely() to avoid giving the wrong hint to the CPU.
Fixes: 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()")
Signed-off-by: Breno Leitao <leitao@debian.org>
---
include/linux/device_cgroup.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
index 0864773a57e8..822085bc2d20 100644
--- a/include/linux/device_cgroup.h
+++ b/include/linux/device_cgroup.h
@@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
return 0;
- if (likely(!inode->i_rdev))
+ if (!inode->i_rdev)
return 0;
if (S_ISBLK(inode->i_mode))
---
base-commit: f0b9d8eb98dfee8d00419aa07543bdc2c1a44fb1
change-id: 20260107-likely_device-20dae82d502d
Best regards,
--
Breno Leitao <leitao@debian.org>
On Wed, 07 Jan 2026 06:06:36 -0800, Breno Leitao wrote:
> commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the
> common case in devcgroup_inode_permission()") reordered the checks in
> devcgroup_inode_permission() to check the inode mode before checking
> i_rdev, for better cache behavior.
>
> However, the likely() annotation on the i_rdev check was not updated
> to reflect the new code flow. Originally, when i_rdev was checked
> first, likely(!inode->i_rdev) made sense because most inodes were(?)
> regular files/directories, thus i_rdev == 0.
>
> [...]
Applied to the vfs-7.0.misc branch of the vfs/vfs.git tree.
Patches in the vfs-7.0.misc branch should appear in linux-next soon.
Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.
It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.
Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.
tree: https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs-7.0.misc
[1/1] device_cgroup: remove branch hint after code refactor
https://git.kernel.org/vfs/vfs/c/2a255acce2e5
On Wed, Jan 7, 2026 at 3:06 PM Breno Leitao <leitao@debian.org> wrote:
>
> commit 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the
> common case in devcgroup_inode_permission()") reordered the checks in
> devcgroup_inode_permission() to check the inode mode before checking
> i_rdev, for better cache behavior.
>
> However, the likely() annotation on the i_rdev check was not updated
> to reflect the new code flow. Originally, when i_rdev was checked
> first, likely(!inode->i_rdev) made sense because most inodes were(?)
> regular files/directories, thus i_rdev == 0.
>
> After the reorder, by the time we reach the i_rdev check, we have
> already confirmed the inode IS a block or character device. Block and
> character special files are precisely defined by having a device number
> (i_rdev), so !inode->i_rdev is now the rare edge case, not the common
> case.
>
> Branch profiling confirmed this is 100% mispredicted:
>
> correct incorrect % Function File Line
> ------- --------- - -------- ---- ----
> 0 2631904 100 devcgroup_inode_permission device_cgroup.h 24
>
> Remove likely() to avoid giving the wrong hint to the CPU.
>
> Fixes: 4ef4ac360101 ("device_cgroup: avoid access to ->i_rdev in the common case in devcgroup_inode_permission()")
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---
> include/linux/device_cgroup.h | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
> index 0864773a57e8..822085bc2d20 100644
> --- a/include/linux/device_cgroup.h
> +++ b/include/linux/device_cgroup.h
> @@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
> if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
> return 0;
>
> - if (likely(!inode->i_rdev))
> + if (!inode->i_rdev)
> return 0;
>
The branch was left there because I could not be bothered to analyze
whether it can be straight up eleminated with the new checks in place.
A quick look at init_special_inode suggests it is an invariant rdev is
there in this case.
So for the time being I would replace likely with WARN_ON_ONCE . Might
be even a good candidate for the pending release.
Hello Mateusz,
On Wed, Jan 07, 2026 at 03:17:40PM +0100, Mateusz Guzik wrote:
> On Wed, Jan 7, 2026 at 3:06 PM Breno Leitao <leitao@debian.org> wrote:
> > diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
> > index 0864773a57e8..822085bc2d20 100644
> > --- a/include/linux/device_cgroup.h
> > +++ b/include/linux/device_cgroup.h
> > @@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
> > if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
> > return 0;
> >
> > - if (likely(!inode->i_rdev))
> > + if (!inode->i_rdev)
> > return 0;
> >
>
> The branch was left there because I could not be bothered to analyze
> whether it can be straight up eleminated with the new checks in place.
>
> A quick look at init_special_inode suggests it is an invariant rdev is
> there in this case.
>
> So for the time being I would replace likely with WARN_ON_ONCE . Might
> be even a good candidate for the pending release.
Oh, in fact that was my first try, but, when I tested it, I found that
it warns in some cases.
[ 126.951410] WARNING: ./include/linux/device_cgroup.h:24 at inode_permission+0x181/0x190, CPU#4: networkd)/1212
[ 126.971659] Modules linked in: intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) skx_edac_common(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) iTCO_wdt(E) iTCO_vendor_support(E) irqbypass(E) acpi_cpufreq(E) i2c_i801(E) i2c_smbus(E) wmi(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) evdev(E) button(E) bpf_preload(E) sch_fq_codel(E) xhci_pci(E) xhci_hcd(E) vhost_net(E) tap(E) tun(E) vhost(E) vhost_iotlb(E) mpls_gso(E) mpls_iptunnel(E) mpls_router(E) fou(E) loop(E) efivarfs(E) autofs4(E)
[ 127.097485] Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
[ 127.109038] Hardware name: Quanta Delta Lake MP 29F0EMA01D0/Delta Lake-Class1, BIOS F0E_3A21 06/27/2024
[ 127.127889] RIP: 0010:inode_permission (./include/linux/device_cgroup.h:24 fs/namei.c:600)
[ 127.137542] Code: 14 81 e2 ff ff 0f 00 31 ff 3d 00 60 00 00 41 8d 0c 48 40 0f 95 c7 ff c7 e8 5c 68 35 00 85 c0 0f 85 6d ff ff ff e9 29 ff ff ff <0f> 0b e9 22 ff ff ff 0f 1f 84 00 00 00 00 00 41 56 53 49 89 f6 48
All code
========
0: 14 81 adc $0x81,%al
2: e2 ff loop 0x3
4: ff 0f decl (%rdi)
6: 00 31 add %dh,(%rcx)
8: ff (bad)
9: 3d 00 60 00 00 cmp $0x6000,%eax
e: 41 8d 0c 48 lea (%r8,%rcx,2),%ecx
12: 40 0f 95 c7 setne %dil
16: ff c7 inc %edi
18: e8 5c 68 35 00 call 0x356879
1d: 85 c0 test %eax,%eax
1f: 0f 85 6d ff ff ff jne 0xffffffffffffff92
25: e9 29 ff ff ff jmp 0xffffffffffffff53
2a:* 0f 0b ud2 <-- trapping instruction
2c: e9 22 ff ff ff jmp 0xffffffffffffff53
31: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
38: 00
39: 41 56 push %r14
3b: 53 push %rbx
3c: 49 89 f6 mov %rsi,%r14
3f: 48 rex.W
Code starting with the faulting instruction
===========================================
0: 0f 0b ud2
2: e9 22 ff ff ff jmp 0xffffffffffffff29
7: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
e: 00
f: 41 56 push %r14
11: 53 push %rbx
12: 49 89 f6 mov %rsi,%r14
15: 48 rex.W
[ 127.175161] RSP: 0018:ffffc9000438fe60 EFLAGS: 00010246
[ 127.185667] RAX: 0000000000002000 RBX: 0000000000000010 RCX: 0000000000006000
[ 127.199992] RDX: 0000000000000000 RSI: ffff88842a9b5418 RDI: ffffffff83e31178
[ 127.214320] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
[ 127.228667] R10: 0000000000000000 R11: ffff88842a9b54b0 R12: ffffffff83e31178
[ 127.243017] R13: ffff88842a9b5418 R14: ffff88842a9b5418 R15: ffff88842a9b5498
[ 127.257360] FS: 00007f86e4bc3900(0000) GS:ffff8890b23fa000(0000) knlGS:0000000000000000
[ 127.273624] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 127.285172] CR2: 000055990443e068 CR3: 00000004183b2001 CR4: 00000000007726f0
[ 127.299500] PKRU: 55555554
[ 127.304961] Call Trace:
[ 127.309885] <TASK>
[ 127.314121] do_faccessat (fs/open.c:508)
[ 127.321488] ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
[ 127.331973] __x64_sys_access (fs/open.c:549 fs/open.c:547 fs/open.c:547)
[ 127.339677] do_syscall_64 (arch/x86/entry/syscall_64.c:?)
[ 127.347038] ? rcu_is_watching (./include/linux/context_tracking.h:128 kernel/rcu/tree.c:751)
[ 127.354775] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
[ 127.364914] RIP: 0033:0x7f86e42ff21b
Lines against commit f0b9d8eb98df ("Merge tag 'nfsd-6.19-3' of
git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux"), plus something like:
diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
index 822085bc2d20..c2b933a318f8 100644
--- a/include/linux/device_cgroup.h
+++ b/include/linux/device_cgroup.h
@@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
return 0;
- if (likely(!inode->i_rdev))
+ if (WARN_ON_ONCE(!inode->i_rdev))
return 0;
if (S_ISBLK(inode->i_mode))
On Wed, Jan 7, 2026 at 4:17 PM Breno Leitao <leitao@debian.org> wrote:
>
> Hello Mateusz,
>
> On Wed, Jan 07, 2026 at 03:17:40PM +0100, Mateusz Guzik wrote:
> > On Wed, Jan 7, 2026 at 3:06 PM Breno Leitao <leitao@debian.org> wrote:
> > > diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
> > > index 0864773a57e8..822085bc2d20 100644
> > > --- a/include/linux/device_cgroup.h
> > > +++ b/include/linux/device_cgroup.h
> > > @@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
> > > if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
> > > return 0;
> > >
> > > - if (likely(!inode->i_rdev))
> > > + if (!inode->i_rdev)
> > > return 0;
> > >
> >
> > The branch was left there because I could not be bothered to analyze
> > whether it can be straight up eleminated with the new checks in place.
> >
> > A quick look at init_special_inode suggests it is an invariant rdev is
> > there in this case.
> >
> > So for the time being I would replace likely with WARN_ON_ONCE . Might
> > be even a good candidate for the pending release.
>
> Oh, in fact that was my first try, but, when I tested it, I found that
> it warns in some cases.
>
heh, bummer
in that case:
Reviewed-by: Mateusz Guzik <mjguzik@gmail.com>
thanks
> [ 126.951410] WARNING: ./include/linux/device_cgroup.h:24 at inode_permission+0x181/0x190, CPU#4: networkd)/1212
> [ 126.971659] Modules linked in: intel_uncore_frequency(E) intel_uncore_frequency_common(E) skx_edac(E) skx_edac_common(E) nfit(E) libnvdimm(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) coretemp(E) kvm_intel(E) kvm(E) iTCO_wdt(E) iTCO_vendor_support(E) irqbypass(E) acpi_cpufreq(E) i2c_i801(E) i2c_smbus(E) wmi(E) ipmi_si(E) ipmi_devintf(E) ipmi_msghandler(E) evdev(E) button(E) bpf_preload(E) sch_fq_codel(E) xhci_pci(E) xhci_hcd(E) vhost_net(E) tap(E) tun(E) vhost(E) vhost_iotlb(E) mpls_gso(E) mpls_iptunnel(E) mpls_router(E) fou(E) loop(E) efivarfs(E) autofs4(E)
> [ 127.097485] Tainted: [S]=CPU_OUT_OF_SPEC, [E]=UNSIGNED_MODULE
> [ 127.109038] Hardware name: Quanta Delta Lake MP 29F0EMA01D0/Delta Lake-Class1, BIOS F0E_3A21 06/27/2024
> [ 127.127889] RIP: 0010:inode_permission (./include/linux/device_cgroup.h:24 fs/namei.c:600)
> [ 127.137542] Code: 14 81 e2 ff ff 0f 00 31 ff 3d 00 60 00 00 41 8d 0c 48 40 0f 95 c7 ff c7 e8 5c 68 35 00 85 c0 0f 85 6d ff ff ff e9 29 ff ff ff <0f> 0b e9 22 ff ff ff 0f 1f 84 00 00 00 00 00 41 56 53 49 89 f6 48
> All code
> ========
> 0: 14 81 adc $0x81,%al
> 2: e2 ff loop 0x3
> 4: ff 0f decl (%rdi)
> 6: 00 31 add %dh,(%rcx)
> 8: ff (bad)
> 9: 3d 00 60 00 00 cmp $0x6000,%eax
> e: 41 8d 0c 48 lea (%r8,%rcx,2),%ecx
> 12: 40 0f 95 c7 setne %dil
> 16: ff c7 inc %edi
> 18: e8 5c 68 35 00 call 0x356879
> 1d: 85 c0 test %eax,%eax
> 1f: 0f 85 6d ff ff ff jne 0xffffffffffffff92
> 25: e9 29 ff ff ff jmp 0xffffffffffffff53
> 2a:* 0f 0b ud2 <-- trapping instruction
> 2c: e9 22 ff ff ff jmp 0xffffffffffffff53
> 31: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
> 38: 00
> 39: 41 56 push %r14
> 3b: 53 push %rbx
> 3c: 49 89 f6 mov %rsi,%r14
> 3f: 48 rex.W
>
> Code starting with the faulting instruction
> ===========================================
> 0: 0f 0b ud2
> 2: e9 22 ff ff ff jmp 0xffffffffffffff29
> 7: 0f 1f 84 00 00 00 00 nopl 0x0(%rax,%rax,1)
> e: 00
> f: 41 56 push %r14
> 11: 53 push %rbx
> 12: 49 89 f6 mov %rsi,%r14
> 15: 48 rex.W
> [ 127.175161] RSP: 0018:ffffc9000438fe60 EFLAGS: 00010246
> [ 127.185667] RAX: 0000000000002000 RBX: 0000000000000010 RCX: 0000000000006000
> [ 127.199992] RDX: 0000000000000000 RSI: ffff88842a9b5418 RDI: ffffffff83e31178
> [ 127.214320] RBP: 0000000000000001 R08: 0000000000000001 R09: 0000000000000000
> [ 127.228667] R10: 0000000000000000 R11: ffff88842a9b54b0 R12: ffffffff83e31178
> [ 127.243017] R13: ffff88842a9b5418 R14: ffff88842a9b5418 R15: ffff88842a9b5498
> [ 127.257360] FS: 00007f86e4bc3900(0000) GS:ffff8890b23fa000(0000) knlGS:0000000000000000
> [ 127.273624] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 127.285172] CR2: 000055990443e068 CR3: 00000004183b2001 CR4: 00000000007726f0
> [ 127.299500] PKRU: 55555554
> [ 127.304961] Call Trace:
> [ 127.309885] <TASK>
> [ 127.314121] do_faccessat (fs/open.c:508)
> [ 127.321488] ? entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
> [ 127.331973] __x64_sys_access (fs/open.c:549 fs/open.c:547 fs/open.c:547)
> [ 127.339677] do_syscall_64 (arch/x86/entry/syscall_64.c:?)
> [ 127.347038] ? rcu_is_watching (./include/linux/context_tracking.h:128 kernel/rcu/tree.c:751)
> [ 127.354775] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:131)
> [ 127.364914] RIP: 0033:0x7f86e42ff21b
>
>
> Lines against commit f0b9d8eb98df ("Merge tag 'nfsd-6.19-3' of
> git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux"), plus something like:
>
>
> diff --git a/include/linux/device_cgroup.h b/include/linux/device_cgroup.h
> index 822085bc2d20..c2b933a318f8 100644
> --- a/include/linux/device_cgroup.h
> +++ b/include/linux/device_cgroup.h
> @@ -21,7 +21,7 @@ static inline int devcgroup_inode_permission(struct inode *inode, int mask)
> if (likely(!S_ISBLK(inode->i_mode) && !S_ISCHR(inode->i_mode)))
> return 0;
>
> - if (likely(!inode->i_rdev))
> + if (WARN_ON_ONCE(!inode->i_rdev))
> return 0;
>
> if (S_ISBLK(inode->i_mode))
>
>
© 2016 - 2026 Red Hat, Inc.