[PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Eric Dumazet 2 months ago
clang needs __always_inline instead of inline, even for tiny helpers.

This saves some cycles in the system call fast path, and saves 195
bytes on an x86_64 build:

$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
34652814	22291961	5875180	62819955	3be8e73	vmlinux.before
34652619	22291961	5875180	62819760	3be8db0	vmlinux.after

Signed-off-by: Eric Dumazet <edumazet@google.com>
---
 include/linux/irq-entry-common.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
index 6ab913e57da0a8acde84a1002645a9dfa5e6303a..d26d1b1bcbfb9798885426fbb2b978f43fcfcdc1 100644
--- a/include/linux/irq-entry-common.h
+++ b/include/linux/irq-entry-common.h
@@ -110,7 +110,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
 static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
 
 #ifndef local_irq_enable_exit_to_user
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
+static __always_inline void local_irq_enable_exit_to_user(unsigned long ti_work)
 {
 	local_irq_enable();
 }
@@ -125,7 +125,7 @@ static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
 static inline void local_irq_disable_exit_to_user(void);
 
 #ifndef local_irq_disable_exit_to_user
-static inline void local_irq_disable_exit_to_user(void)
+static __always_inline void local_irq_disable_exit_to_user(void)
 {
 	local_irq_disable();
 }
-- 
2.52.0.177.g9f829587af-goog
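
For context on the first line of the changelog: plain "inline" is only a
hint that the compiler's cost model is free to ignore, while the kernel's
__always_inline (defined in include/linux/compiler_types.h as
inline __attribute__((__always_inline__))) makes inlining mandatory. A
minimal user-space sketch of the difference, with made-up helper names:

/* hinted() uses the plain keyword, so the compiler may still emit it
 * out of line; forced() uses the same attribute combination that the
 * kernel's __always_inline macro expands to. */
#define my_always_inline inline __attribute__((__always_inline__))

static inline int hinted(int x)
{
	return x + 1;
}

static my_always_inline int forced(int x)
{
	return x + 1;
}

int caller(int x)
{
	return hinted(x) + forced(x);
}

Compiling with "clang -O2 -S" and checking for call instructions shows
whether the hint was honored; for helpers this small it usually is, but
the patch above is a reminder that it cannot be relied on.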
Re: [PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Peter Zijlstra 2 months ago
On Thu, Dec 04, 2025 at 03:31:27PM +0000, Eric Dumazet wrote:
> clang needs __always_inline instead of inline, even for tiny helpers.
> 
> This saves some cycles in the system call fast path, and saves 195
> bytes on an x86_64 build:
> 
> $ size vmlinux.before vmlinux.after
>    text	   data	    bss	    dec	    hex	filename
> 34652814	22291961	5875180	62819955	3be8e73	vmlinux.before
> 34652619	22291961	5875180	62819760	3be8db0	vmlinux.after
> 
> Signed-off-by: Eric Dumazet <edumazet@google.com>

Yeah, sometimes these inline heuristics drive me mad. I've picked up
this and the rseq one. I'll do something with them after rc1.
Re: [PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Eric Dumazet 2 months ago
On Fri, Dec 5, 2025 at 2:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Thu, Dec 04, 2025 at 03:31:27PM +0000, Eric Dumazet wrote:
> > clang needs __always_inline instead of inline, even for tiny helpers.
> >
> > This saves some cycles in the system call fast path, and saves 195
> > bytes on an x86_64 build:
> >
> > $ size vmlinux.before vmlinux.after
> >     text      data      bss       dec      hex  filename
> > 34652814  22291961  5875180  62819955  3be8e73  vmlinux.before
> > 34652619  22291961  5875180  62819760  3be8db0  vmlinux.after
> >
> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>
> Yeah, sometimes these inline heuristics drive me mad. I've picked up
> this and the rseq one. I'll do something with them after rc1.

Thanks Peter.

I forgot to include perf numbers for this one, but apparently having a
local_irq_enable() in an out-of-line function in the syscall path was
adding a 5% penalty on some platforms.

Crazy...
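
Not the benchmark behind that 5% figure, which was not posted, but a
minimal sketch of how such a fast-path regression can be measured: time a
cheap raw syscall in a tight loop and compare kernels before and after.

/* Sketch: per-syscall cost via a getpid() loop.  Using raw syscall()
 * avoids any libc-level shortcuts, so every iteration goes through the
 * kernel entry/exit fast path discussed above. */
#include <stdio.h>
#include <time.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	const long iters = 10 * 1000 * 1000;
	struct timespec t0, t1;
	long i;

	clock_gettime(CLOCK_MONOTONIC, &t0);
	for (i = 0; i < iters; i++)
		syscall(SYS_getpid);
	clock_gettime(CLOCK_MONOTONIC, &t1);

	printf("%.1f ns per syscall\n",
	       ((t1.tv_sec - t0.tv_sec) * 1e9 +
		(t1.tv_nsec - t0.tv_nsec)) / iters);
	return 0;
}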
Re: [PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Peter Zijlstra 2 months ago
On Fri, Dec 05, 2025 at 02:54:26AM -0800, Eric Dumazet wrote:
> On Fri, Dec 5, 2025 at 2:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
> >
> > On Thu, Dec 04, 2025 at 03:31:27PM +0000, Eric Dumazet wrote:
> > > clang needs __always_inline instead of inline, even for tiny helpers.
> > >
> > > This saves some cycles in the system call fast path, and saves 195
> > > bytes on an x86_64 build:
> > >
> > > $ size vmlinux.before vmlinux.after
> > >     text      data      bss       dec      hex  filename
> > > 34652814  22291961  5875180  62819955  3be8e73  vmlinux.before
> > > 34652619  22291961  5875180  62819760  3be8db0  vmlinux.after
> > >
> > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> >
> > Yeah, sometimes these inline heuristics drive me mad. I've picked up
> > this and the rseq one. I'll do something with them after rc1.
> 
> Thanks Peter.
> 
> I forgot to include perf numbers for this one, but apparently having a
> local_irq_enable() in an out-of-line function in the syscall path was
> adding a 5% penalty on some platforms.
> 
> Crazy...

Earlier Zen with RET mitigation? ;-)
Re: [PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Eric Dumazet 2 months ago
On Fri, Dec 5, 2025 at 4:45 AM Peter Zijlstra <peterz@infradead.org> wrote:
>
> On Fri, Dec 05, 2025 at 02:54:26AM -0800, Eric Dumazet wrote:
> > On Fri, Dec 5, 2025 at 2:51 AM Peter Zijlstra <peterz@infradead.org> wrote:
> > >
> > > On Thu, Dec 04, 2025 at 03:31:27PM +0000, Eric Dumazet wrote:
> > > > clang needs __always_inline instead of inline, even for tiny helpers.
> > > >
> > > > This saves some cycles in the system call fast path, and saves 195
> > > > bytes on an x86_64 build:
> > > >
> > > > $ size vmlinux.before vmlinux.after
> > > >     text      data      bss       dec      hex  filename
> > > > 34652814  22291961  5875180  62819955  3be8e73  vmlinux.before
> > > > 34652619  22291961  5875180  62819760  3be8db0  vmlinux.after
> > > >
> > > > Signed-off-by: Eric Dumazet <edumazet@google.com>
> > >
> > > Yeah, sometimes these inline heuristics drive me mad. I've picked up
> > > this and the rseq one. I'll do something with them after rc1.
> >
> > Thanks Peter.
> >
> > I forgot to include perf numbers for this one, but apparently having a
> > local_irq_enable() in an out-of-line function in the syscall path was
> > adding a 5% penalty on some platforms.
> >
> > Crazy...
>
> Earlier Zen with RET mitigation? ;-)

This was AMD Rome "AMD EPYC 7B12 64-Core Processor",
but also AMD Turin "AMD EPYC 9B45 128-Core Processor" to a certain extent.

When you say RET mitigation, this is the five int3 after retq, right?
Re: [PATCH] entry: always inline local_irq_{enable,disable}_exit_to_user()
Posted by Peter Zijlstra 2 months ago
On Fri, Dec 05, 2025 at 05:03:33AM -0800, Eric Dumazet wrote:

> > Earlier Zen with RET mitigation? ;-)
> 
> This was AMD Rome "AMD EPYC 7B12 64-Core Processor",
> but also AMD Turin "AMD EPYC 9B45 128-Core Processor" to a certain extent.
> 
> When you say RET mitigation, this is the five int3 after retq, right?

Nope, that one is SLS. AMD has BTB type confusion on return prediction
(the AMD RetBleed) and patches all the RET sites with jumps to
retbleed_return_thunk(), or one of the srso*return_thunk() thingies. All
are somewhat expensive.

So while normally CALL+RET is well optimized and hardly noticeable, the
moment your uarch needs one of these return thunks, you're going to
notice them.
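
Whether a given machine is actually running with one of these return
thunks can be read back from sysfs; a small sketch (the SRSO entry only
exists on kernels that know about it):

/* Print the retbleed and SRSO mitigation state the kernel reports. */
#include <stdio.h>

static void show(const char *path)
{
	char buf[256];
	FILE *f = fopen(path, "r");

	if (!f)
		return;		/* entry absent on this kernel/arch */
	if (fgets(buf, sizeof(buf), f))
		printf("%s: %s", path, buf);
	fclose(f);
}

int main(void)
{
	show("/sys/devices/system/cpu/vulnerabilities/retbleed");
	show("/sys/devices/system/cpu/vulnerabilities/spec_rstack_overflow");
	return 0;
}

A report naming a return thunk (e.g. "untrained return thunk" for
retbleed, or "Safe RET" for SRSO) is exactly the case described above,
where every patched RET site gets measurably more expensive.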
[tip: core/urgent] entry: Always inline local_irq_{enable,disable}_exit_to_user()
Posted by tip-bot2 for Eric Dumazet 1 month, 3 weeks ago
The following commit has been merged into the core/urgent branch of tip:

Commit-ID:     4a824c3128998158a093eaadd776a79abe3a601a
Gitweb:        https://git.kernel.org/tip/4a824c3128998158a093eaadd776a79abe3a601a
Author:        Eric Dumazet <edumazet@google.com>
AuthorDate:    Thu, 04 Dec 2025 15:31:27 +00:00
Committer:     Peter Zijlstra <peterz@infradead.org>
CommitterDate: Thu, 18 Dec 2025 10:43:52 +01:00

entry: Always inline local_irq_{enable,disable}_exit_to_user()

clang needs __always_inline instead of inline, even for tiny helpers.

This saves some cycles in the system call fast path, and saves 195
bytes on an x86_64 build:

$ size vmlinux.before vmlinux.after
   text	   data	    bss	    dec	    hex	filename
34652814	22291961	5875180	62819955	3be8e73	vmlinux.before
34652619	22291961	5875180	62819760	3be8db0	vmlinux.after

Signed-off-by: Eric Dumazet <edumazet@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Link: https://patch.msgid.link/20251204153127.1321824-1-edumazet@google.com
---
 include/linux/irq-entry-common.h | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/include/linux/irq-entry-common.h b/include/linux/irq-entry-common.h
index 6ab913e..d26d1b1 100644
--- a/include/linux/irq-entry-common.h
+++ b/include/linux/irq-entry-common.h
@@ -110,7 +110,7 @@ static __always_inline void enter_from_user_mode(struct pt_regs *regs)
 static inline void local_irq_enable_exit_to_user(unsigned long ti_work);
 
 #ifndef local_irq_enable_exit_to_user
-static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
+static __always_inline void local_irq_enable_exit_to_user(unsigned long ti_work)
 {
 	local_irq_enable();
 }
@@ -125,7 +125,7 @@ static inline void local_irq_enable_exit_to_user(unsigned long ti_work)
 static inline void local_irq_disable_exit_to_user(void);
 
 #ifndef local_irq_disable_exit_to_user
-static inline void local_irq_disable_exit_to_user(void)
+static __always_inline void local_irq_disable_exit_to_user(void)
 {
 	local_irq_disable();
 }