From: Tiwei Bie <tiwei.btw@antgroup.com>

We are going to support SMP in UML, so we cannot hard-code the CPU and
NUMA node in __vdso_getcpu() anymore.

Signed-off-by: Tiwei Bie <tiwei.btw@antgroup.com>
---
 arch/x86/um/vdso/um_vdso.c | 18 ++++++++----------
 1 file changed, 8 insertions(+), 10 deletions(-)

diff --git a/arch/x86/um/vdso/um_vdso.c b/arch/x86/um/vdso/um_vdso.c
index cbae2584124f..ee40ac446c1c 100644
--- a/arch/x86/um/vdso/um_vdso.c
+++ b/arch/x86/um/vdso/um_vdso.c
@@ -17,7 +17,7 @@
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts);
 int __vdso_gettimeofday(struct __kernel_old_timeval *tv, struct timezone *tz);
 __kernel_old_time_t __vdso_time(__kernel_old_time_t *t);
-long __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused);
+long __vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *tcache);
 
 int __vdso_clock_gettime(clockid_t clock, struct __kernel_old_timespec *ts)
 {
@@ -60,18 +60,16 @@ __kernel_old_time_t __vdso_time(__kernel_old_time_t *t)
 __kernel_old_time_t time(__kernel_old_time_t *t) __attribute__((weak, alias("__vdso_time")));
 
 long
-__vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *unused)
+__vdso_getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *tcache)
 {
-	/*
-	 * UML does not support SMP, we can cheat here. :)
-	 */
+	long ret;
 
-	if (cpu)
-		*cpu = 0;
-	if (node)
-		*node = 0;
+	asm volatile("syscall"
+		: "=a" (ret)
+		: "0" (__NR_getcpu), "D" (cpu), "S" (node), "d" (tcache)
+		: "rcx", "r11", "memory");
 
-	return 0;
+	return ret;
 }
 
 long getcpu(unsigned int *cpu, unsigned int *node, struct getcpu_cache *tcache)
--
2.34.1
On Sun, 2025-08-10 at 13:51 +0800, Tiwei Bie wrote:
> From: Tiwei Bie <tiwei.btw@antgroup.com>
> 
> We are going to support SMP in UML, so we cannot hard-code the CPU and
> NUMA node in __vdso_getcpu() anymore.

Correct. But does that mean we actually have to implement it via syscall
in the VDSO? That seems a bit odd? ARM doesn't seem to have getcpu in
the VDSO at all, for example, so could we do the same and just remove
it?

johannes
On 2025-09-10 13:59:02+0200, Johannes Berg wrote:
> On Sun, 2025-08-10 at 13:51 +0800, Tiwei Bie wrote:
> > From: Tiwei Bie <tiwei.btw@antgroup.com>
> > 
> > We are going to support SMP in UML, so we cannot hard-code the CPU and
> > NUMA node in __vdso_getcpu() anymore.
> 
> Correct. But does that mean we actually have to implement it via syscall
> in the VDSO? That seems a bit odd? ARM doesn't seem to have getcpu in
> the VDSO at all, for example, so could we do the same and just remove
> it?

It is my understanding that the UM VDSO exists to cope with old versions
of glibc which would fall back to the old vsyscall mechanism if no VDSO
was present. That could fall through to the host kernel's vsyscalls.
See commit f1c2bb8b9964 ("um: implement a x86_64 vDSO").

If this is not necessary anymore, the whole VDSO on UM can probably go
away.

Thomas
On Sun, 21 Sep 2025 22:00:41 +0200, Thomas Weißschuh wrote:
> On 2025-09-10 13:59:02+0200, Johannes Berg wrote:
> > On Sun, 2025-08-10 at 13:51 +0800, Tiwei Bie wrote:
> > > From: Tiwei Bie <tiwei.btw@antgroup.com>
> > > 
> > > We are going to support SMP in UML, so we cannot hard-code the CPU and
> > > NUMA node in __vdso_getcpu() anymore.
> > 
> > Correct. But does that mean we actually have to implement it via syscall
> > in the VDSO? That seems a bit odd? ARM doesn't seem to have getcpu in
> > the VDSO at all, for example, so could we do the same and just remove
> > it?
> 
> It is my understanding that the UM VDSO exists to cope with old versions
> of glibc which would fall back to the old vsyscall mechanism if no VDSO
> was present. That could fall through to the host kernel's vsyscalls.
> See commit f1c2bb8b9964 ("um: implement a x86_64 vDSO").
> 
> If this is not necessary anymore, the whole VDSO on UM can probably go
> away.

The vsyscall usage was removed from glibc a decade ago:

https://sourceware.org/git/?p=glibc.git;a=commit;h=7cbeabac0fb28e24c99aaa5085e613ea543a2346

"This patch removes the vsyscall usage for x86_64 port. As indicated
by kernel code comments [1], vsyscalls are a legacy ABI and its concept
is problematic:

 - It interferes with ASLR.
 - It's awkward to write code that lives in kernel addresses but is
   callable by userspace at fixed addresses.
 - The whole concept is impossible for 32-bit compat userspace.
 - UML cannot easily virtualize a vsyscall.

......"

The original issue could now be considered resolved. So in v3, we no
longer turn __vdso_getcpu into a syscall wrapper; we simply removed it.
Perhaps we could remove the whole VDSO before we implement the "real"
VDSO. However, its implementation is clean, so keeping it wouldn't hurt
and it could serve as a useful starting point for the "real" VDSO.

Regards,
Tiwei
On 2025-09-22 12:50:20+0800, Tiwei Bie wrote:
> On Sun, 21 Sep 2025 22:00:41 +0200, Thomas Weißschuh wrote:
> > On 2025-09-10 13:59:02+0200, Johannes Berg wrote:
> > > On Sun, 2025-08-10 at 13:51 +0800, Tiwei Bie wrote:
> > > > From: Tiwei Bie <tiwei.btw@antgroup.com>
> > > > 
> > > > We are going to support SMP in UML, so we cannot hard-code the CPU and
> > > > NUMA node in __vdso_getcpu() anymore.
> > > 
> > > Correct. But does that mean we actually have to implement it via syscall
> > > in the VDSO? That seems a bit odd? ARM doesn't seem to have getcpu in
> > > the VDSO at all, for example, so could we do the same and just remove
> > > it?
> > 
> > It is my understanding that the UM VDSO exists to cope with old versions
> > of glibc which would fall back to the old vsyscall mechanism if no VDSO
> > was present. That could fall through to the host kernel's vsyscalls.
> > See commit f1c2bb8b9964 ("um: implement a x86_64 vDSO").
> > 
> > If this is not necessary anymore, the whole VDSO on UM can probably go
> > away.
> 
> The vsyscall usage was removed from glibc a decade ago:
> 
> https://sourceware.org/git/?p=glibc.git;a=commit;h=7cbeabac0fb28e24c99aaa5085e613ea543a2346
> 
> "This patch removes the vsyscall usage for x86_64 port. As indicated
> by kernel code comments [1], vsyscalls are a legacy ABI and its concept
> is problematic:
> 
>  - It interferes with ASLR.
>  - It's awkward to write code that lives in kernel addresses but is
>    callable by userspace at fixed addresses.
>  - The whole concept is impossible for 32-bit compat userspace.
>  - UML cannot easily virtualize a vsyscall.
> 
> ......"

Also modern kernels don't even implement the vsyscall page anymore. At
most it is implemented as a stub which will trigger the real syscall,
which then gets handled properly.

> The original issue could now be considered resolved. So in v3, we no
> longer turn __vdso_getcpu into a syscall wrapper; we simply removed it.
> Perhaps we could remove the whole VDSO before we implement the "real"
> VDSO. However, its implementation is clean, so keeping it wouldn't hurt
> and it could serve as a useful starting point for the "real" VDSO.

A "real" vDSO would require quite some more infrastructure.
And it is not even clear if such a vDSO will make a difference on UML. In my
opinion if __vdso_getcpu() gets removed, the whole vDSO should go with
it. The code can still be easily restored from git.

Also the functionality to map the host vDSO and vsyscall page into UML
userspace looks very weird and error-prone. Maybe it can also go away.

Thomas
On Mon, 2025-09-22 at 14:05 +0200, Thomas Weißschuh wrote:
> > The original issue could now be considered resolved. So in v3, we no
> > longer turn __vdso_getcpu into a syscall wrapper; we simply removed it.
> > Perhaps we could remove the whole VDSO before we implement the "real"
> > VDSO. However, its implementation is clean, so keeping it wouldn't hurt
> > and it could serve as a useful starting point for the "real" VDSO.
> 
> A "real" vDSO would require quite some more infrastructure.

What's not "real" about the vDSO now? Yes it just implements syscalls
after the getcpu removal, but ... it's still a vDSO? I _have_ played
with getting data into it for the time-travel case, at least.

> And it is not even clear if such a vDSO will make a difference on UML.

Syscall overhead is _huge_ in UML, if it does anything but syscalls it
will _certainly_ make a difference.

> In my
> opinion if __vdso_getcpu() gets removed, the whole vDSO should go with
> it. The code can still be easily restored from git.

I mean ... on the one hand, sure, it doesn't really do much after this,
but OTOH it lets userspace actually use that path? So might be useful.

> Also the functionality to map the host vDSO and vsyscall page into UML
> userspace looks very weird and error-prone. Maybe it can also go away.

Surely host vDSO etc. is never mapped into UML userspace and never is,
not sure what you're thinking of, but clearly that's wrong as written.

johannes
On 2025-09-22 14:12:52+0200, Johannes Berg wrote:
> On Mon, 2025-09-22 at 14:05 +0200, Thomas Weißschuh wrote:
> > > The original issue could now be considered resolved. So in v3, we no
> > > longer turn __vdso_getcpu into a syscall wrapper; we simply removed it.
> > > Perhaps we could remove the whole VDSO before we implement the "real"
> > > VDSO. However, its implementation is clean, so keeping it wouldn't hurt
> > > and it could serve as a useful starting point for the "real" VDSO.
> > 
> > A "real" vDSO would require quite some more infrastructure.
> 
> What's not "real" about the vDSO now? Yes it just implements syscalls
> after the getcpu removal, but ... it's still a vDSO? I _have_ played
> with getting data into it for the time-travel case, at least.

Right now it does not provide any advantage over a regular syscall.
Essentially it is just overhead. That said, if you do want to make a
real vDSO out of it, I'd be happy to help in that.
(I did most of the recent work on the generic vDSO infrastructure)

> > And it is not even clear if such a vDSO will make a difference on UML.
> 
> Syscall overhead is _huge_ in UML, if it does anything but syscalls it
> will _certainly_ make a difference.

Ack.

> > In my
> > opinion if __vdso_getcpu() gets removed, the whole vDSO should go with
> > it. The code can still be easily restored from git.
> 
> I mean ... on the one hand, sure, it doesn't really do much after this,
> but OTOH it lets userspace actually use that path? So might be useful.

What advantage does userspace have from it?

> > Also the functionality to map the host vDSO and vsyscall page into UML
> > userspace looks very weird and error-prone. Maybe it can also go away.
> 
> Surely host vDSO etc. is never mapped into UML userspace and never is,
> not sure what you're thinking of, but clearly that's wrong as written.

This is how I understand the 32bit implementation using
ARCH_REUSE_HOST_VSYSCALL_AREA and NEW_AUX_ENT(AT_SYSINFO_EHDR, vsyscall_ehdr)
where vsyscall_ehdr comes from the host's getauxval(AT_SYSINFO_EHDR).
But I didn't actually test this. I'll look at it again, but currently
I'm travelling.

Thomas
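[Editorial aside: the aux-vector lookup mentioned above can be observed
from userspace with a few lines of C. This is only an illustration of
getauxval(AT_SYSINFO_EHDR), not code from the UML implementation under
discussion.]

    #include <elf.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/auxv.h>

    int main(void)
    {
            /* AT_SYSINFO_EHDR holds the address of the vDSO ELF image that
             * the kernel mapped into this process, or 0 if there is none. */
            const unsigned char *ehdr =
                    (const unsigned char *)getauxval(AT_SYSINFO_EHDR);

            if (ehdr && !memcmp(ehdr, ELFMAG, SELFMAG))
                    printf("vDSO ELF header at %p\n", (const void *)ehdr);
            else
                    printf("no vDSO advertised in the aux vector\n");
            return 0;
    }

In the 32-bit path described above, the value handed to UML userspace in
this aux entry would, per the description, originate from the host's own
aux vector.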
On Mon, 2025-09-22 at 16:01 +0200, Thomas Weißschuh wrote:
> Right now it does not provide any advantage over a regular syscall.
> Essentially it is just overhead. That said, if you do want to make a
> real vDSO out of it, I'd be happy to help in that.

I don't know if I'd say "just overhead" - depends on which path is more
optimised in a typical libc implementation? I'd basically think it's
identical, no? You either link to the vDSO, or a __weak version of the
same function in the libc?

> > I mean ... on the one hand, sure, it doesn't really do much after this,
> > but OTOH it lets userspace actually use that path? So might be useful.
> 
> What advantage does userspace have from it?

Right now, none? But it's easier to play with if you have the
infrastructure, and I'm not convinced there's a _disadvantage_?

> > > Also the functionality to map the host vDSO and vsyscall page into UML
> > > userspace looks very weird and error-prone. Maybe it can also go away.
> > 
> > Surely host vDSO etc. is never mapped into UML userspace and never is,
> > not sure what you're thinking of, but clearly that's wrong as written.
> 
> This is how I understand the 32bit implementation using
> ARCH_REUSE_HOST_VSYSCALL_AREA and NEW_AUX_ENT(AT_SYSINFO_EHDR, vsyscall_ehdr)
> where vsyscall_ehdr comes from the host's getauxval(AT_SYSINFO_EHDR).

Huh, hm, yeah I forgot about that ... 32-bit. Yeah, agree we should just
kill that. I'm not even sure it works with the host kernel trapping
there? Oh well.

johannes
On 2025-09-22 17:14:18+0200, Johannes Berg wrote:
> On Mon, 2025-09-22 at 16:01 +0200, Thomas Weißschuh wrote:
> > Right now it does not provide any advantage over a regular syscall.
> > Essentially it is just overhead. That said, if you do want to make a
> > real vDSO out of it, I'd be happy to help in that.
> 
> I don't know if I'd say "just overhead" - depends on which path is more
> optimised in a typical libc implementation? I'd basically think it's
> identical, no? You either link to the vDSO, or a __weak version of the
> same function in the libc?

The code also needs to be built and maintained. AFAIK __weak is only for
the compile-time linker. The vDSO call will be an indirect call.

> > > I mean ... on the one hand, sure, it doesn't really do much after this,
> > > but OTOH it lets userspace actually use that path? So might be useful.
> > 
> > What advantage does userspace have from it?
> 
> Right now, none? But it's easier to play with if you have the
> infrastructure, and I'm not convinced there's a _disadvantage_?

So far that hasn't happened. The disadvantages are the ones from above,
nothing critical. But of course it is your subsystem and your call to make.

> > > > Also the functionality to map the host vDSO and vsyscall page into UML
> > > > userspace looks very weird and error-prone. Maybe it can also go away.
> > > 
> > > Surely host vDSO etc. is never mapped into UML userspace and never is,
> > > not sure what you're thinking of, but clearly that's wrong as written.
> > 
> > This is how I understand the 32bit implementation using
> > ARCH_REUSE_HOST_VSYSCALL_AREA and NEW_AUX_ENT(AT_SYSINFO_EHDR, vsyscall_ehdr)
> > where vsyscall_ehdr comes from the host's getauxval(AT_SYSINFO_EHDR).
> 
> Huh, hm, yeah I forgot about that ... 32-bit. Yeah, agree we should just
> kill that. I'm not even sure it works with the host kernel trapping
> there? Oh well.

Ack, do you want me to send a patch? This was my real gripe with the UM
vDSO. I want to enable time namespaces for all architectures but these
need to be handled in the vDSO properly. For the 64-bit stub vDSO it's
not a problem as the syscalls will work correctly.
But the interaction with the weird 32-bit logic on the other hand...

Thomas
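[Editorial aside: a tiny sketch of the distinction drawn here; all names
are made up for illustration and this is not glibc code. A __weak
definition is chosen or overridden when the binary is linked, while a
vDSO routine is reached through a function pointer filled in at runtime,
i.e. an indirect call.]

    #include <sys/syscall.h>
    #include <unistd.h>

    /* Resolved at link time: if another object file provides a strong
     * my_getcpu(), the linker silently drops this weak default. */
    __attribute__((weak))
    long my_getcpu(unsigned int *cpu, unsigned int *node)
    {
            return syscall(SYS_getcpu, cpu, node, NULL);
    }

    /* Resolved at run time: a libc stores whatever address it found for
     * the symbol in the vDSO and later calls go through this pointer. */
    long (*resolved_vdso_getcpu)(unsigned int *cpu, unsigned int *node,
                                 void *cache);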
On Mon, 2025-09-22 at 18:04 +0200, Thomas Weißschuh wrote:
> > I don't know if I'd say "just overhead" - depends on which path is more
> > optimised in a typical libc implementation? I'd basically think it's
> > identical, no? You either link to the vDSO, or a __weak version of the
> > same function in the libc?
> 
> The code also needs to be built and maintained.

Yeah, fair, though the vDSO doesn't really do anything, and if we do
want to use it then having a working version beats having to re-do it
from scratch, which

> AFAIK __weak is only for
> the compile-time linker. The vDSO call will be an indirect call.

yeah, I looked at the links you'd sent earlier only later ...

> > > > I mean ... on the one hand, sure, it doesn't really do much after this,
> > > > but OTOH it lets userspace actually use that path? So might be useful.
> > > 
> > > What advantage does userspace have from it?
> > 
> > Right now, none? But it's easier to play with if you have the
> > infrastructure, and I'm not convinced there's a _disadvantage_?
> 
> So far that hasn't happened. The disadvantages are the ones from above,
> nothing critical. But of course it is your subsystem and your call to make.

Yeah, kind of agree, though I'd like to actually use it - especially in
time-travel mode - but haven't really gotten time to add it. Having it
maintained in-tree is a bit nicer in case of global updates, but yeah,
ultimately it's not really all that important either way.

I guess we could get getrandom() pretty easily by taking the x86 one.

I actually have half a patch somewhere that rejiggers the UM vDSO to be
more like normal architectures, using lib/vdso/gettimeofday.c and making
the build more regular etc. Maybe I should dig that up and try to make
it work entirely - it was part of a previous attempt of adding the time-
travel thing I mentioned.

> > Huh, hm, yeah I forgot about that ... 32-bit. Yeah, agree we should just
> > kill that. I'm not even sure it works with the host kernel trapping
> > there? Oh well.
> 
> Ack, do you want me to send a patch? This was my real gripe with the UM
> vDSO. I want to enable time namespaces for all architectures but these
> need to be handled in the vDSO properly. For the 64-bit stub vDSO it's
> not a problem as the syscalls will work correctly.
> But the interaction with the weird 32-bit logic on the other hand...

I guess? But I'm confused by what you say about it being related to time
namespaces, the vsyscall stuff doesn't really _do_ anything, assuming it
works at all? It's not like the host actually could be doing anything
other than syscalls there, which are intercepted? If it were doing
anything else, it wouldn't work in UML in the first place?

johannes
On 2025-09-22 19:07:27+0200, Johannes Berg wrote:
> On Mon, 2025-09-22 at 18:04 +0200, Thomas Weißschuh wrote:

(...)

> > > > > I mean ... on the one hand, sure, it doesn't really do much after this,
> > > > > but OTOH it lets userspace actually use that path? So might be useful.
> > > > 
> > > > What advantage does userspace have from it?
> > > 
> > > Right now, none? But it's easier to play with if you have the
> > > infrastructure, and I'm not convinced there's a _disadvantage_?
> > 
> > So far that hasn't happened. The disadvantages are the ones from above,
> > nothing critical. But of course it is your subsystem and your call to make.
> 
> Yeah, kind of agree, though I'd like to actually use it - especially in
> time-travel mode - but haven't really gotten time to add it. Having it
> maintained in-tree is a bit nicer in case of global updates, but yeah,
> ultimately it's not really all that important either way.
> 
> I guess we could get getrandom() pretty easily by taking the x86 one.

Yeah, the only architecture-specific part there is the assembly chacha
implementation. And that will be the same one as used by regular x86.

> I actually have half a patch somewhere that rejiggers the UM vDSO to be
> more like normal architectures, using lib/vdso/gettimeofday.c and making
> the build more regular etc. Maybe I should dig that up and try to make
> it work entirely - it was part of a previous attempt of adding the time-
> travel thing I mentioned.

Sounds good. And let me know if you want me to look at it.
Using the generic vDSO library and datastore is mandatory nowadays for
"real" vDSOs.

> > > Huh, hm, yeah I forgot about that ... 32-bit. Yeah, agree we should just
> > > kill that. I'm not even sure it works with the host kernel trapping
> > > there? Oh well.
> > 
> > Ack, do you want me to send a patch? This was my real gripe with the UM
> > vDSO. I want to enable time namespaces for all architectures but these
> > need to be handled in the vDSO properly. For the 64-bit stub vDSO it's
> > not a problem as the syscalls will work correctly.
> > But the interaction with the weird 32-bit logic on the other hand...
> 
> I guess? But I'm confused by what you say about it being related to time
> namespaces, the vsyscall stuff doesn't really _do_ anything, assuming it
> works at all? It's not like the host actually could be doing anything
> other than syscalls there, which are intercepted? If it were doing
> anything else, it wouldn't work in UML in the first place?

In emulation mode the trapping kernel will not actually trigger a
syscall but calculate the time in kernel space and write the results to
the respective registers.
If I understand correctly the trap is handled by the host kernel, so
that would bypass UML completely.

My wording was a bit wonky. I stumbled upon this while looking for
potential time namespace compatibility issues. And with time namespaces
the chance for a clock mismatch between UML and the host is higher.

Thomas
On Wed, 10 Sep 2025 13:59:02 +0200, Johannes Berg wrote:
> On Sun, 2025-08-10 at 13:51 +0800, Tiwei Bie wrote:
> > From: Tiwei Bie <tiwei.btw@antgroup.com>
> > 
> > We are going to support SMP in UML, so we cannot hard-code the CPU and
> > NUMA node in __vdso_getcpu() anymore.
> 
> Correct. But does that mean we actually have to implement it via syscall
> in the VDSO? That seems a bit odd? ARM doesn't seem to have getcpu in
> the VDSO at all, for example, so could we do the same and just remove
> it?

Good idea. I checked the implementations in glibc and musl, and they
automatically fall back to the syscall when __vdso_getcpu is not
available:

https://sourceware.org/git/?p=glibc.git;a=blob;f=sysdeps/unix/sysv/linux/sysdep-vdso.h;h=5a33871872da9ccef36293c3ca5eba6503f956e6;hb=HEAD#l36
https://git.musl-libc.org/cgit/musl/tree/src/sched/sched_getcpu.c?h=v1.2.5#n32

I will just remove it in the next version.

Regards,
Tiwei
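[Editorial aside: the fallback pattern both libcs implement, reduced to
a sketch; the names below are illustrative and not glibc's or musl's
internals.]

    #include <sys/syscall.h>
    #include <unistd.h>

    /* Illustrative only: a libc resolves this pointer once at startup by
     * looking the symbol up in the vDSO it finds via AT_SYSINFO_EHDR; it
     * stays NULL when the vDSO does not export __vdso_getcpu. */
    static long (*vdso_getcpu)(unsigned int *, unsigned int *, void *);

    static long do_getcpu(unsigned int *cpu, unsigned int *node)
    {
            if (vdso_getcpu)        /* use the vDSO routine if present */
                    return vdso_getcpu(cpu, node, NULL);
            /* otherwise fall back to the plain getcpu() syscall */
            return syscall(SYS_getcpu, cpu, node, NULL);
    }

With that in mind, dropping __vdso_getcpu from the UML vDSO simply makes
callers take the syscall branch, which is why removing it rather than
wrapping the syscall is safe.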