This series addresses two io_uring code paths that arm an ABS
hrtimer from a timestamp supplied by the caller. Both paths skip
the conversion from the submitter's time namespace view to host
view via timens_ktime_to_host(). The clock is CLOCK_MONOTONIC by
default, or optionally CLOCK_BOOTTIME.
All four other ABS timer interfaces already do this conversion:
timer_settime(TIMER_ABSTIME), clock_nanosleep(TIMER_ABSTIME),
alarm_timer_nsleep(TIMER_ABSTIME), and
timerfd_settime(TFD_TIMER_ABSTIME).
Patch 1/2 (io_uring/timeout) covers IORING_OP_TIMEOUT and
IORING_OP_LINK_TIMEOUT via io_parse_user_time(). It is essentially
the draft Pavel posted on the original thread. I rebased it on
io_uring-7.1 and verified end to end.
Patch 2/2 (io_uring/wait) covers the IORING_ENTER_ABS_TIMER path
in io_uring_enter(). That path parses ext_arg->ts inline rather
than going through io_parse_user_time(). Patch 1/2 therefore does
not cover it.
Per Pavel and Jens's discussion on the original thread, the two
sites use two direct timens_ktime_to_host() call sites rather
than a shared helper. Patch 1/2 also splits the existing
io_timeout_get_clock() into a flags-only io_flags_to_clock(), so
io_parse_user_time() can resolve the clock without a
struct io_timeout_data.
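
A flags-only clock lookup of the kind described above can be modelled
in userspace roughly as follows. This is a sketch, not the code from
patch 1/2: sketch_flags_to_clock() is a hypothetical name, and the
fallback flag values are assumed to mirror the uapi header.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <time.h>
#include <linux/io_uring.h>

/* Fallbacks for older uapi headers (assumed to match linux/io_uring.h). */
#ifndef IORING_TIMEOUT_BOOTTIME
#define IORING_TIMEOUT_BOOTTIME (1U << 2)
#endif
#ifndef IORING_TIMEOUT_REALTIME
#define IORING_TIMEOUT_REALTIME (1U << 3)
#endif
#ifndef IORING_TIMEOUT_CLOCK_MASK
#define IORING_TIMEOUT_CLOCK_MASK \
	(IORING_TIMEOUT_BOOTTIME | IORING_TIMEOUT_REALTIME)
#endif

/*
 * Hypothetical userspace model: resolve the clock from the timeout
 * flags alone, so no per-request timeout structure is needed.
 * Unlike the kernel, which vets the flags at prep time, this sketch
 * simply falls back to CLOCK_MONOTONIC when both clock bits are set.
 */
static clockid_t sketch_flags_to_clock(unsigned int flags)
{
	switch (flags & IORING_TIMEOUT_CLOCK_MASK) {
	case IORING_TIMEOUT_BOOTTIME:
		return CLOCK_BOOTTIME;
	case IORING_TIMEOUT_REALTIME:
		return CLOCK_REALTIME;
	default:
		return CLOCK_MONOTONIC;
	}
}
```

Because the lookup depends only on the flags word, a parser can pick
the clock before any timeout data has been allocated.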
SQPOLL is automatically covered. The SQPOLL kernel thread is
created via create_io_thread() with CLONE_THREAD and no CLONE_NEW*
flag. copy_namespaces() therefore shares the submitter's nsproxy
by reference. timens_ktime_to_host() through "current" sees the
submitter's time_ns when called from the SQPOLL kthread. PoCs for
both paths confirm this.
Reproducers (run inside unshare --user --time with a -10s
monotonic offset):

  IORING_TIMEOUT_ABS (patch 1/2):
    vanilla 7.1-rc: elapsed =    1 ms (bug, fires immediately)
    patched:        elapsed = 1000 ms (offset honoured)

  IORING_ENTER_ABS_TIMER (patch 2/2):
    vanilla 7.1-rc: elapsed =    1 ms (bug)
    patched:        elapsed =  999 ms (offset honoured)
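
The 1 ms versus ~1000 ms results follow from simple arithmetic, which
the userspace model below tries to illustrate. It assumes the
namespace clock reads host time plus the offset and that the
conversion subtracts the offset, treating deadlines earlier than the
offset as already expired. The model_* names are hypothetical and this
is not the kernel implementation.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/*
 * Model of the namespace-to-host conversion: subtract the namespace
 * offset. A deadline earlier than the offset is treated as already
 * expired (0); a result that would overflow is clamped to INT64_MAX.
 */
static int64_t model_ns_to_host(int64_t ns_deadline, int64_t ns_offset)
{
	if (ns_deadline < ns_offset)
		return 0;
	if (ns_offset < 0 && ns_deadline > INT64_MAX + ns_offset)
		return INT64_MAX;
	return ns_deadline - ns_offset;
}

/* Time until an absolute host-view deadline fires; past deadlines fire at once. */
static int64_t model_wait_ns(int64_t host_now, int64_t host_deadline)
{
	return host_deadline > host_now ? host_deadline - host_now : 0;
}
```

With host time at 100 s and a -10 s offset, the namespace reads 90 s,
so "now + 1s" is 91 s. Armed unconverted, 91 s is 9 s in the host's
past and the timer fires immediately. Converted, it becomes
91 - (-10) = 101 s, one second ahead of host time.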

Maoyi Xie (2):
  io_uring/timeout: honour caller's time namespace for
    IORING_TIMEOUT_ABS
  io_uring/wait: honour caller's time namespace for
    IORING_ENTER_ABS_TIMER

 io_uring/timeout.c | 35 ++++++++++++++++++++++-------------
 io_uring/wait.c    |  6 +++++-
 2 files changed, 27 insertions(+), 14 deletions(-)

base-commit: 04fe9aeb4f3c0999e6715385664c677469dfd8f4
--
2.34.1
On Mon, 04 May 2026 23:37:53 +0800, Maoyi Xie wrote:
> This series addresses two io_uring code paths that arm an ABS
> hrtimer from a timestamp supplied by the caller. Both paths skip
> the conversion from the submitter's time namespace view to host
> view via timens_ktime_to_host(). The clock is CLOCK_MONOTONIC by
> default, or optionally CLOCK_BOOTTIME.
>
> All four other ABS timer interfaces already do this conversion:
> timer_settime(TIMER_ABSTIME), clock_nanosleep(TIMER_ABSTIME),
> alarm_timer_nsleep(TIMER_ABSTIME), and
> timerfd_settime(TFD_TIMER_ABSTIME).
>
> [...]
Applied, thanks!
[1/2] io_uring/timeout: honour caller's time namespace for IORING_TIMEOUT_ABS
commit: 9cc6bac1bebf8310d2950d1411a91479e86d69a1
[2/2] io_uring/wait: honour caller's time namespace for IORING_ENTER_ABS_TIMER
commit: 45d2b37a37ab98484693533496395c610a2cab96
Best regards,
--
Jens Axboe

On 5/4/26 16:37, Maoyi Xie wrote:
> This series addresses two io_uring code paths that arm an ABS
> hrtimer from a timestamp supplied by the caller. Both paths skip
> the conversion from the submitter's time namespace view to host
> view via timens_ktime_to_host(). The clock is CLOCK_MONOTONIC by
> default, or optionally CLOCK_BOOTTIME.
>
> All four other ABS timer interfaces already do this conversion:
> timer_settime(TIMER_ABSTIME), clock_nanosleep(TIMER_ABSTIME),
> alarm_timer_nsleep(TIMER_ABSTIME), and
> timerfd_settime(TFD_TIMER_ABSTIME).
>
> Patch 1/2 (io_uring/timeout) covers IORING_OP_TIMEOUT and
> IORING_OP_LINK_TIMEOUT via io_parse_user_time(). It is essentially
> the draft Pavel posted on the original thread. I rebased it on
> io_uring-7.1 and verified end to end.
>
> Patch 2/2 (io_uring/wait) covers the IORING_ENTER_ABS_TIMER path
> in io_uring_enter(). That path parses ext_arg->ts inline rather
> than going through io_parse_user_time(). Patch 1/2 therefore does
> not cover it.
>
> Per Pavel and Jens's discussion on the original thread, the two
> sites use two direct timens_ktime_to_host() call sites rather
> than a shared helper. Patch 1/2 also splits the existing
> io_timeout_get_clock() into a flags only io_flags_to_clock(), so
> io_parse_user_time() can resolve the clock without a
> struct io_timeout_data.
>
> SQPOLL is automatically covered. The SQPOLL kernel thread is
> created via create_io_thread() with CLONE_THREAD and no CLONE_NEW*
> flag. copy_namespaces() therefore shares the submitter's nsproxy
> by reference. timens_ktime_to_host() through "current" sees the
> submitter's time_ns when called from the SQPOLL kthread. PoCs for
> both paths confirm this.

At a quick glance, both look good. I think you had an isolated
reproducer, are you sending it as a liburing test? Would be
greatly appreciated.

--
Pavel Begunkov

On 5/6/26 3:05 AM, Pavel Begunkov wrote:
> On 5/4/26 16:37, Maoyi Xie wrote:
>> This series addresses two io_uring code paths that arm an ABS
>> hrtimer from a timestamp supplied by the caller. Both paths skip
>> the conversion from the submitter's time namespace view to host
>> view via timens_ktime_to_host(). The clock is CLOCK_MONOTONIC by
>> default, or optionally CLOCK_BOOTTIME.
>>
>> All four other ABS timer interfaces already do this conversion:
>> timer_settime(TIMER_ABSTIME), clock_nanosleep(TIMER_ABSTIME),
>> alarm_timer_nsleep(TIMER_ABSTIME), and
>> timerfd_settime(TFD_TIMER_ABSTIME).
>>
>> Patch 1/2 (io_uring/timeout) covers IORING_OP_TIMEOUT and
>> IORING_OP_LINK_TIMEOUT via io_parse_user_time(). It is essentially
>> the draft Pavel posted on the original thread. I rebased it on
>> io_uring-7.1 and verified end to end.
>>
>> Patch 2/2 (io_uring/wait) covers the IORING_ENTER_ABS_TIMER path
>> in io_uring_enter(). That path parses ext_arg->ts inline rather
>> than going through io_parse_user_time(). Patch 1/2 therefore does
>> not cover it.
>>
>> Per Pavel and Jens's discussion on the original thread, the two
>> sites use two direct timens_ktime_to_host() call sites rather
>> than a shared helper. Patch 1/2 also splits the existing
>> io_timeout_get_clock() into a flags only io_flags_to_clock(), so
>> io_parse_user_time() can resolve the clock without a
>> struct io_timeout_data.
>>
>> SQPOLL is automatically covered. The SQPOLL kernel thread is
>> created via create_io_thread() with CLONE_THREAD and no CLONE_NEW*
>> flag. copy_namespaces() therefore shares the submitter's nsproxy
>> by reference. timens_ktime_to_host() through "current" sees the
>> submitter's time_ns when called from the SQPOLL kthread. PoCs for
>> both paths confirm this.
>
> At a quick glance, both look good. I think you had an isolated
> reproducer, are you sending it as a liburing test? Would be
> greatly appreciated.

+1 Yes please, test case for liburing would be great!

--
Jens Axboe

Hi Pavel,

Thanks for the look. We will turn the reproducers into a liburing
test and send it shortly.

The current shape is two minimal C programs. Each forks into a fresh
user namespace plus time namespace with a -10s monotonic offset. The
child submits either IORING_OP_TIMEOUT or io_uring_enter with
IORING_ENTER_ABS_TIMER and a deadline of now + 1s. The test asserts
the call returns after the expected ~1000ms rather than after <1ms.

We will reshape that into a single liburing test that exercises both
paths. The test will gate the unshare on CLONE_NEWUSER | CLONE_NEWTIME
availability so it skips gracefully on kernels without time namespace
support. It will use the standard t_* helpers.

Maoyi
Nanyang Technological University
https://maoyixie.com/
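
The availability gate mentioned in the reply above could look roughly
like the probe below. This is only a sketch of one possible approach,
not the planned liburing test: probe_user_time_ns() is a hypothetical
helper, and the CLONE_NEWTIME fallback value is assumed to match
linux/sched.h.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#ifndef CLONE_NEWTIME
#define CLONE_NEWTIME 0x00000080 /* assumed value, for older libc headers */
#endif

/*
 * Probe in a forked child so the caller's own namespaces stay
 * untouched. Returns 1 if unshare() succeeded, 0 if it did not
 * (the test should skip), and -1 if the probe itself could not run.
 */
static int probe_user_time_ns(void)
{
	int status;
	pid_t pid = fork();

	if (pid < 0)
		return -1;
	if (pid == 0)
		_exit(unshare(CLONE_NEWUSER | CLONE_NEWTIME) ? 1 : 0);

	/* Retry if a signal interrupts the wait. */
	while (waitpid(pid, &status, 0) < 0) {
		if (errno != EINTR)
			return -1;
	}
	return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

A test would presumably call the probe first and report a skip when it
returns 0. Note that after unshare(CLONE_NEWTIME) only children
created afterwards run inside the new time namespace, so the timed
submission has to happen in a further child.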