Hi List,

looking into the performance of syscall entry/exit after s390 switched
to generic entry showed that there's quite some overhead calling some
of the entry/exit work functions even when there's nothing to do.
This patchset moves the entry and exit function to entry-common.h, so
non inlined code gets only called when there is some work pending.

I wrote a small program that just issues invalid syscalls in a loop.
On an s390 machine, this results in the following numbers:

without this series:

# ./syscall 1000000000
runtime: 94.886581s / per-syscall 9.488658e-08s

with this series:

./syscall 1000000000
runtime: 84.732391s / per-syscall 8.473239e-08s

so the time required for one syscall dropped from 94.8ns to
84.7ns, which is a drop of about 11%.

Sven Schnelle (3):
  entry: move exit to usermode functions to header file
  move enter_from_user_mode() to header file
  entry: move syscall_enter_from_user_mode() to header file

 include/linux/entry-common.h | 137 ++++++++++++++++++++++++++++++++-
 kernel/entry/common.c        | 145 ++---------------------------------
 2 files changed, 138 insertions(+), 144 deletions(-)

-- 
2.40.1
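The test program itself is not part of the posting. For anyone who wants to take a comparable measurement, a minimal sketch of such a loop might look like the following. It assumes Linux, uses -1 as the invalid syscall number (expected to fail with ENOSYS), and times the loop with CLOCK_MONOTONIC; the function names are made up for illustration and are not taken from the original program.

```c
#define _GNU_SOURCE
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/*
 * Issue `count` invalid syscalls and return the elapsed wall time in
 * seconds. Syscall number -1 is an assumption: it is not a valid
 * syscall, so the kernel only runs the entry/exit path and returns
 * an error.
 */
static double time_invalid_syscalls(unsigned long count)
{
	struct timespec start, end;
	unsigned long i;

	clock_gettime(CLOCK_MONOTONIC, &start);
	for (i = 0; i < count; i++)
		syscall(-1);
	clock_gettime(CLOCK_MONOTONIC, &end);

	return (double)(end.tv_sec - start.tv_sec) +
	       (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Average cost of one syscall in seconds. */
static double per_syscall(double runtime, unsigned long count)
{
	return runtime / (double)count;
}

/* Print in the same shape as the numbers quoted in the cover letter. */
static void print_result(double runtime, unsigned long count)
{
	printf("runtime: %fs / per-syscall %es\n",
	       runtime, per_syscall(runtime, count));
}
```

A main() that parses the iteration count from argv[1], calls time_invalid_syscalls() and passes the result to print_result() would complete the program. Absolute numbers depend heavily on the machine and kernel configuration, so only before/after comparisons on the same system are meaningful.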
On Tue, Dec 05, 2023 at 02:30:12PM +0100, Sven Schnelle wrote:
> Hi List,
>
> looking into the performance of syscall entry/exit after s390 switched
> to generic entry showed that there's quite some overhead calling some
> of the entry/exit work functions even when there's nothing to do.
> This patchset moves the entry and exit function to entry-common.h, so
> non inlined code gets only called when there is some work pending.

So per that logic you wouldn't need to inline exit_to_user_mode_loop()
for example, that's only called when there is a EXIT_TO_USER_MODE_WORK
bit set.

That is, I'm just being pedantic here and pointing out that your
justification doesn't cover the extent of the changes.

> I wrote a small program that just issues invalid syscalls in a loop.
> On an s390 machine, this results in the following numbers:
>
> without this series:
>
> # ./syscall 1000000000
> runtime: 94.886581s / per-syscall 9.488658e-08s
>
> with this series:
>
> ./syscall 1000000000
> runtime: 84.732391s / per-syscall 8.473239e-08s
>
> so the time required for one syscall dropped from 94.8ns to
> 84.7ns, which is a drop of about 11%.

That is obviously very nice, and I don't immediately see anything wrong
with moving the lot to header based inlines.

Thomas?
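The pattern under discussion can be shown with a small user-space sketch: an inline wrapper tests the work flags, and the out-of-line function is reached only when a work bit is set. This is not the kernel code. The flag names, the mask and both function names below are hypothetical stand-ins chosen for illustration, and the noinline attribute and __builtin_expect assume GCC or Clang.

```c
#include <stdio.h>

/* Hypothetical work bits; the kernel's real flags differ. */
#define WORK_RESCHED	(1UL << 0)
#define WORK_SIGNAL	(1UL << 1)
#define WORK_MASK	(WORK_RESCHED | WORK_SIGNAL)

/* Counts how often the out-of-line slow path was actually entered. */
static unsigned long slow_path_calls;

/*
 * Slow path, deliberately kept out of line. It is only reached when a
 * work bit is set, so a function call here costs nothing in the
 * common no-work case.
 */
static __attribute__((noinline))
unsigned long handle_pending_work(unsigned long flags)
{
	slow_path_calls++;
	/* Pretend all pending work was handled: clear the work bits. */
	return flags & ~WORK_MASK;
}

/*
 * Fast path, meant to live in a header so callers get it inlined.
 * With no work pending this is a test and a branch, nothing more.
 */
static inline unsigned long exit_prepare(unsigned long flags)
{
	if (__builtin_expect(!!(flags & WORK_MASK), 0))
		flags = handle_pending_work(flags);
	return flags;
}
```

In this sketch only exit_prepare() benefits from being inlined. Whether handle_pending_work() is inline or not does not change the cost when no work is pending, which is the distinction the reply draws for exit_to_user_mode_loop().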
Peter Zijlstra <peterz@infradead.org> writes:

> On Tue, Dec 05, 2023 at 02:30:12PM +0100, Sven Schnelle wrote:
>> Hi List,
>>
>> looking into the performance of syscall entry/exit after s390 switched
>> to generic entry showed that there's quite some overhead calling some
>> of the entry/exit work functions even when there's nothing to do.
>> This patchset moves the entry and exit function to entry-common.h, so
>> non inlined code gets only called when there is some work pending.
>
> So per that logic you wouldn't need to inline exit_to_user_mode_loop()
> for example, that's only called when there is a EXIT_TO_USER_MODE_WORK
> bit set.
>
> That is, I'm just being pedantic here and pointing out that your
> justification doesn't cover the extent of the changes.
>
>> I wrote a small program that just issues invalid syscalls in a loop.
>> On an s390 machine, this results in the following numbers:
>>
>> without this series:
>>
>> # ./syscall 1000000000
>> runtime: 94.886581s / per-syscall 9.488658e-08s
>>
>> with this series:
>>
>> ./syscall 1000000000
>> runtime: 84.732391s / per-syscall 8.473239e-08s
>>
>> so the time required for one syscall dropped from 94.8ns to
>> 84.7ns, which is a drop of about 11%.
>
> That is obviously very nice, and I don't immediately see anything wrong
> with moving the lot to header based inlines.
>
> Thomas?

Thomas, any opinion on this change?
On Thu, Dec 14 2023 at 09:24, Sven Schnelle wrote:
> Peter Zijlstra <peterz@infradead.org> writes:
>>> so the time required for one syscall dropped from 94.8ns to
>>> 84.7ns, which is a drop of about 11%.
>>
>> That is obviously very nice, and I don't immediately see anything wrong
>> with moving the lot to header based inlines.
>>
>> Thomas?

No objections in principle. Let me look at the lot