PR_TIMER_CREATE_RESTORE_IDS_ON switches timer_create() into a mode
where the caller supplies explicit timer IDs, intended exclusively for
CRIU checkpoint/restore. The UAPI comment is explicit: "Don't use for
normal operations as the result might be undefined."

Despite that, the prctl handler has no capability check. Any unprivileged
process can enable the mode. Every comparable CRIU prctl in the kernel
gates on checkpoint_restore_ns_capable() - PR_SET_MM_MAP, PR_SET_MM_EXE_FILE,
etc. This one should too.

Add the check to the ON case only. OFF and GET are left unrestricted so a
process can always query or clear the flag without privilege.
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
kernel/time/posix-timers.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
index 413e2389f..b8582416b 100644
--- a/kernel/time/posix-timers.c
+++ b/kernel/time/posix-timers.c
@@ -9,6 +9,7 @@
  *
  * These are all the functions necessary to implement POSIX clocks & timers
  */
+#include <linux/capability.h>
 #include <linux/compat.h>
 #include <linux/compiler.h>
 #include <linux/init.h>
@@ -380,6 +381,8 @@ long posixtimer_create_prctl(unsigned long ctrl)
 		current->signal->timer_create_restore_ids = 0;
 		return 0;
 	case PR_TIMER_CREATE_RESTORE_IDS_ON:
+		if (!checkpoint_restore_ns_capable(current_user_ns()))
+			return -EPERM;
 		current->signal->timer_create_restore_ids = 1;
 		return 0;
 	case PR_TIMER_CREATE_RESTORE_IDS_GET:
--
2.34.1
On Wed, Apr 8, 2026 at 8:27 PM Ashutosh Desai <ashutoshdesai993@gmail.com> wrote:
>
> PR_TIMER_CREATE_RESTORE_IDS_ON switches timer_create() into a mode
> where the caller supplies explicit timer IDs, intended exclusively for
> CRIU checkpoint/restore. The UAPI comment is explicit: "Don't use for
> normal operations as the result might be undefined."
>
> Despite that, the prctl handler has no capability check. Any unprivileged
> process can enable the mode. Every comparable CRIU prctl in the kernel
> gates on checkpoint_restore_ns_capable() - PR_SET_MM_MAP, PR_SET_MM_EXE_FILE,
> etc. This one should too.

This statement is not entirely accurate. Capability checks are typically
required only when the operation has security implications or allows a
process to bypass standard resource isolation.

>
> Add the check to the ON case only. OFF and GET are left unrestricted so a
> process can always query or clear the flag without privilege.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Anna-Maria Behnsen <anna-maria@linutronix.de>
> Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
> ---
> kernel/time/posix-timers.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/kernel/time/posix-timers.c b/kernel/time/posix-timers.c
> index 413e2389f..b8582416b 100644
> --- a/kernel/time/posix-timers.c
> +++ b/kernel/time/posix-timers.c
> @@ -9,6 +9,7 @@
>   *
>   * These are all the functions necessary to implement POSIX clocks & timers
>   */
> +#include <linux/capability.h>
>  #include <linux/compat.h>
>  #include <linux/compiler.h>
>  #include <linux/init.h>
> @@ -380,6 +381,8 @@ long posixtimer_create_prctl(unsigned long ctrl)
>  		current->signal->timer_create_restore_ids = 0;
>  		return 0;
>  	case PR_TIMER_CREATE_RESTORE_IDS_ON:
> +		if (!checkpoint_restore_ns_capable(current_user_ns()))
> +			return -EPERM;
>  		current->signal->timer_create_restore_ids = 1;
>  		return 0;
>  	case PR_TIMER_CREATE_RESTORE_IDS_GET:
> --
> 2.34.1
>
>
Thanks Andrei.

Looking at the introducing commit (ec2d0c0), I see Thomas explicitly
considered the rogue userspace scenario and concluded it is harmless.
I will drop this patch.

Ashutosh
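
For readers following the thread, here is a minimal userspace sketch of
the prctl under discussion. It is an illustration only, not part of the
patch. The fallback constant values are an assumption based on
include/uapi/linux/prctl.h in recent kernels, and restore_ids_ctl(),
restore_ids_describe() and restore_ids_demo() are hypothetical helper
names. The -EPERM branch would only be reached on a kernel carrying the
proposed check, which was dropped.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/prctl.h>

/*
 * Fallback definitions for older UAPI headers. The values are an
 * assumption based on include/uapi/linux/prctl.h in recent kernels.
 */
#ifndef PR_TIMER_CREATE_RESTORE_IDS
#define PR_TIMER_CREATE_RESTORE_IDS		77
#define PR_TIMER_CREATE_RESTORE_IDS_OFF		0
#define PR_TIMER_CREATE_RESTORE_IDS_ON		1
#define PR_TIMER_CREATE_RESTORE_IDS_GET		2
#endif

/* Hypothetical helper: issue one control op, return the result or -errno. */
static long restore_ids_ctl(unsigned long op)
{
	int ret = prctl(PR_TIMER_CREATE_RESTORE_IDS, op, 0UL, 0UL, 0UL);

	return ret < 0 ? -errno : ret;
}

/* Hypothetical helper: map a restore_ids_ctl() result to a description. */
static const char *restore_ids_describe(long ret)
{
	if (ret >= 0)
		return "ok";
	if (ret == -EPERM)
		return "denied (what the proposed check would return)";
	if (ret == -EINVAL)
		return "prctl option not supported by this kernel";
	return strerror((int)-ret);
}

/* Hypothetical demo: query the flag, try to enable it, then clear it. */
static int restore_ids_demo(void)
{
	long mode = restore_ids_ctl(PR_TIMER_CREATE_RESTORE_IDS_GET);
	long on;

	/* GET yields 0 or 1 on success, a negative errno otherwise. */
	assert(mode <= 1);
	printf("GET: %ld (%s)\n", mode, restore_ids_describe(mode));

	on = restore_ids_ctl(PR_TIMER_CREATE_RESTORE_IDS_ON);
	printf("ON:  %s\n", restore_ids_describe(on));

	/* OFF needs no privilege with or without the proposed check. */
	restore_ids_ctl(PR_TIMER_CREATE_RESTORE_IDS_OFF);

	/* 1 if the mode could be enabled, 0 otherwise. */
	return on == 0;
}
```

Calling restore_ids_demo() from main() as an unprivileged user shows
which of the three outcomes the running kernel produces for the ON
request. The demo clears the flag again afterwards, since the UAPI
comment warns against leaving the mode enabled for normal operation.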