popf is a horribly expensive instruction, while sti is an optimised fastpath.
Switching popf for a conditional branch and sti caused an 8% perf improvement
in various linux measurements.
Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com>
---
CC: Jan Beulich <JBeulich@suse.com>
CC: Roger Pau Monné <roger.pau@citrix.com>
CC: Wei Liu <wl@xen.org>
---
xen/include/asm-x86/system.h | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)
diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h
index 65e63de69a67..4be235472ecd 100644
--- a/xen/include/asm-x86/system.h
+++ b/xen/include/asm-x86/system.h
@@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index,
})
#define local_irq_restore(x) \
({ \
- BUILD_BUG_ON(sizeof(x) != sizeof(long)); \
- asm volatile ( "pushfq\n\t" \
- "andq %0, (%%rsp)\n\t" \
- "orq %1, (%%rsp)\n\t" \
- "popfq" \
- : : "i?r" ( ~X86_EFLAGS_IF ), \
- "ri" ( (x) & X86_EFLAGS_IF ) ); \
+ if ( (x) & X86_EFLAGS_IF ) \
+ local_irq_enable(); \
})
static inline int local_irq_is_enabled(void)
--
2.11.0
On 06/12/2021 13:38, Andrew Cooper wrote: > popf is a horribly expensive instruction, while sti is an optimised fastpath. > Switching popf for a conditional branch and sti caused an 8% perf improvement > in various linux measurements. > > Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> > --- > CC: Jan Beulich <JBeulich@suse.com> > CC: Roger Pau Monné <roger.pau@citrix.com> > CC: Wei Liu <wl@xen.org> > --- > xen/include/asm-x86/system.h | 9 ++------- > 1 file changed, 2 insertions(+), 7 deletions(-) > > diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h > index 65e63de69a67..4be235472ecd 100644 > --- a/xen/include/asm-x86/system.h > +++ b/xen/include/asm-x86/system.h > @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, > }) > #define local_irq_restore(x) \ > ({ \ > - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ > - asm volatile ( "pushfq\n\t" \ > - "andq %0, (%%rsp)\n\t" \ > - "orq %1, (%%rsp)\n\t" \ > - "popfq" \ > - : : "i?r" ( ~X86_EFLAGS_IF ), \ > - "ri" ( (x) & X86_EFLAGS_IF ) ); \ > + if ( (x) & X86_EFLAGS_IF ) \ > + local_irq_enable(); \ > }) Bah. There's still the one total abuse of local_irq_restore() to disable interrupts. I'll do a pre-requisite patch. ~Andrew
On 06.12.2021 14:55, Andrew Cooper wrote: > On 06/12/2021 13:38, Andrew Cooper wrote: >> popf is a horribly expensive instruction, while sti is an optimised fastpath. >> Switching popf for a conditional branch and sti caused an 8% perf improvement >> in various linux measurements. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >> --- >> CC: Jan Beulich <JBeulich@suse.com> >> CC: Roger Pau Monné <roger.pau@citrix.com> >> CC: Wei Liu <wl@xen.org> >> --- >> xen/include/asm-x86/system.h | 9 ++------- >> 1 file changed, 2 insertions(+), 7 deletions(-) >> >> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >> index 65e63de69a67..4be235472ecd 100644 >> --- a/xen/include/asm-x86/system.h >> +++ b/xen/include/asm-x86/system.h >> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, >> }) >> #define local_irq_restore(x) \ >> ({ \ >> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >> - asm volatile ( "pushfq\n\t" \ >> - "andq %0, (%%rsp)\n\t" \ >> - "orq %1, (%%rsp)\n\t" \ >> - "popfq" \ >> - : : "i?r" ( ~X86_EFLAGS_IF ), \ >> - "ri" ( (x) & X86_EFLAGS_IF ) ); \ >> + if ( (x) & X86_EFLAGS_IF ) \ >> + local_irq_enable(); \ >> }) > > Bah. There's still the one total abuse of local_irq_restore() to > disable interrupts. Question is whether that's really to be considered an abuse: To me "restore" doesn't mean only "maybe re-enable", but also "maybe re-disable". And a conditional STI-or-CLI is likely still be better than POPF. Jan
On 06/12/2021 14:07, Jan Beulich wrote: > On 06.12.2021 14:55, Andrew Cooper wrote: >> On 06/12/2021 13:38, Andrew Cooper wrote: >>> popf is a horribly expensive instruction, while sti is an optimised fastpath. >>> Switching popf for a conditional branch and sti caused an 8% perf improvement >>> in various linux measurements. >>> >>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>> --- >>> CC: Jan Beulich <JBeulich@suse.com> >>> CC: Roger Pau Monné <roger.pau@citrix.com> >>> CC: Wei Liu <wl@xen.org> >>> --- >>> xen/include/asm-x86/system.h | 9 ++------- >>> 1 file changed, 2 insertions(+), 7 deletions(-) >>> >>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >>> index 65e63de69a67..4be235472ecd 100644 >>> --- a/xen/include/asm-x86/system.h >>> +++ b/xen/include/asm-x86/system.h >>> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, >>> }) >>> #define local_irq_restore(x) \ >>> ({ \ >>> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>> - asm volatile ( "pushfq\n\t" \ >>> - "andq %0, (%%rsp)\n\t" \ >>> - "orq %1, (%%rsp)\n\t" \ >>> - "popfq" \ >>> - : : "i?r" ( ~X86_EFLAGS_IF ), \ >>> - "ri" ( (x) & X86_EFLAGS_IF ) ); \ >>> + if ( (x) & X86_EFLAGS_IF ) \ >>> + local_irq_enable(); \ >>> }) >> Bah. There's still the one total abuse of local_irq_restore() to >> disable interrupts. > Question is whether that's really to be considered an abuse: These are Linux's APIs, not ours, and they've spoken on the matter. Furthermore, I agree with this being an abuse of the mechanism. > To me > "restore" doesn't mean only "maybe re-enable", but also "maybe > re-disable". nor does "save" mean "save and disable", but that's what it does. The naming may not be completely ideal, but the expected usage is very much one way. > And a conditional STI-or-CLI is likely still be better > than POPF. It likely is better than popf, but for one single abuse which can be written in a better way anyway, it's really not worth it. ~Andrew
On 06.12.2021 16:10, Andrew Cooper wrote: > On 06/12/2021 14:07, Jan Beulich wrote: >> On 06.12.2021 14:55, Andrew Cooper wrote: >>> On 06/12/2021 13:38, Andrew Cooper wrote: >>>> popf is a horribly expensive instruction, while sti is an optimised fastpath. >>>> Switching popf for a conditional branch and sti caused an 8% perf improvement >>>> in various linux measurements. >>>> >>>> Signed-off-by: Andrew Cooper <andrew.cooper3@citrix.com> >>>> --- >>>> CC: Jan Beulich <JBeulich@suse.com> >>>> CC: Roger Pau Monné <roger.pau@citrix.com> >>>> CC: Wei Liu <wl@xen.org> >>>> --- >>>> xen/include/asm-x86/system.h | 9 ++------- >>>> 1 file changed, 2 insertions(+), 7 deletions(-) >>>> >>>> diff --git a/xen/include/asm-x86/system.h b/xen/include/asm-x86/system.h >>>> index 65e63de69a67..4be235472ecd 100644 >>>> --- a/xen/include/asm-x86/system.h >>>> +++ b/xen/include/asm-x86/system.h >>>> @@ -267,13 +267,8 @@ static inline unsigned long array_index_mask_nospec(unsigned long index, >>>> }) >>>> #define local_irq_restore(x) \ >>>> ({ \ >>>> - BUILD_BUG_ON(sizeof(x) != sizeof(long)); \ >>>> - asm volatile ( "pushfq\n\t" \ >>>> - "andq %0, (%%rsp)\n\t" \ >>>> - "orq %1, (%%rsp)\n\t" \ >>>> - "popfq" \ >>>> - : : "i?r" ( ~X86_EFLAGS_IF ), \ >>>> - "ri" ( (x) & X86_EFLAGS_IF ) ); \ >>>> + if ( (x) & X86_EFLAGS_IF ) \ >>>> + local_irq_enable(); \ >>>> }) >>> Bah. There's still the one total abuse of local_irq_restore() to >>> disable interrupts. >> Question is whether that's really to be considered an abuse: > > These are Linux's APIs, not ours, and they've spoken on the matter. > Furthermore, I agree with this being an abuse of the mechanism. > >> To me >> "restore" doesn't mean only "maybe re-enable", but also "maybe >> re-disable". > > nor does "save" mean "save and disable", but that's what it does. > > The naming may not be completely ideal, but the expected usage is very > much one way. > >> And a conditional STI-or-CLI is likely still be better >> than POPF. > > It likely is better than popf, but for one single abuse which can be > written in a better way anyway, it's really not worth it. Fine with me as long as we can be very certain that's it's really only one such case. Jan
© 2016 - 2024 Red Hat, Inc.