[PATCH 0/3] selftests/futex: fix the issue of abnormal test results caused by thread timing

Yuwen Chen posted 3 patches 5 days ago
There is a newer version of this series
[PATCH 0/3] selftests/futex: fix the issue of abnormal test results caused by thread timing
Posted by Yuwen Chen 5 days ago
On the Android arm32 platform, the futex_requeue test fails most of the
time. The cause is described in a commit [1] previously submitted by
Edward Liaw. However, that commit does not fully solve the problem: a
barrier only guarantees that the child thread has reached the barrier, not
that it is already waiting in futex_wait.

This series addresses the problem by checking whether the child thread is
in the sleeping state: once the child thread has gone to sleep, it is
waiting on the futex.

Link: https://lore.kernel.org/all/20240918231102.234253-1-edliaw@google.com/ [1]
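
The check described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the patches: thread_state(), wake_after_sleep(), futex_op(), futex_word and waiter_tid are hypothetical names, and the sketch assumes that /proc is mounted and that the waiter makes no other blocking call after publishing its TID.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/syscall.h>
#include <unistd.h>

static _Atomic uint32_t futex_word;	/* the waiter blocks while this is 0 */
static atomic_int waiter_tid;		/* published by the waiter thread */

static long futex_op(int op, uint32_t val)
{
	return syscall(SYS_futex, &futex_word, op, val, NULL, NULL, 0);
}

/* Read the "State:" letter of one of our own threads; returns 0 on error. */
static char thread_state(int tid)
{
	char path[64], line[256], state = 0;
	FILE *f;

	assert(tid > 0);
	snprintf(path, sizeof(path), "/proc/self/task/%d/status", tid);
	f = fopen(path, "r");
	if (!f)
		return 0;
	while (fgets(line, sizeof(line), f))
		if (sscanf(line, "State: %c", &state) == 1)
			break;
	fclose(f);
	return state;
}

static void *waiterfn(void *arg)
{
	(void)arg;
	atomic_store(&waiter_tid, (int)syscall(SYS_gettid));
	futex_op(FUTEX_WAIT_PRIVATE, 0);	/* sleeps while futex_word == 0 */
	return NULL;
}

/* Returns the number of threads woken by FUTEX_WAKE, or -1 on timeout. */
static long wake_after_sleep(void)
{
	pthread_t t;
	long woken = -1;
	int i, tid, ret;

	atomic_store(&futex_word, 0);
	atomic_store(&waiter_tid, 0);
	ret = pthread_create(&t, NULL, waiterfn, NULL);
	assert(ret == 0);

	for (i = 0; i < 5000; i++) {	/* give up after roughly 5 seconds */
		tid = atomic_load(&waiter_tid);
		if (tid && thread_state(tid) == 'S') {
			atomic_store(&futex_word, 1);
			woken = futex_op(FUTEX_WAKE_PRIVATE, 1);
			break;
		}
		usleep(1000);
	}
	/* On timeout, release the waiter so that the join cannot hang. */
	atomic_store(&futex_word, 1);
	futex_op(FUTEX_WAKE_PRIVATE, 1);
	pthread_join(t, NULL);
	return woken;
}
```

Because the wake is issued only after the waiter has been observed in state 'S', wake_after_sleep() should report one woken thread instead of depending on scheduling luck.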
Re: [PATCH 3/3] selftests/futex: fix the issue of abnormal test results caused by thread timing
Posted by Licay 4 days, 4 hours ago
From: Licay <licayy@outlook.com>

Hi Yuwen,

Thanks for the patch! I have a few suggestions to improve it:

1. Avoid Parsing /proc
   The current approach uses get_thread_state() to read /proc/$pid/status, which isn't very reliable.
   A better way would be to have waiterfn directly signal when it's ready using atomic operations.

2. Use Atomic Counting Instead of Polling Thread State
   Before entering futex_wait, waiterfn can atomically increment a counter. The parent thread then just waits for this counter to reach the expected value.
   This is much simpler and avoids the overhead of checking /proc repeatedly.

3. Use Standard Atomic Types
   Replace the custom READ_ONCE/WRITE_ONCE macros with standard <stdatomic.h> types like atomic_int.
   It's cleaner and more portable across different platforms.

Here's the basic idea:
- Add a global atomic_int ready_count variable
- In waiterfn: atomic_fetch_add(&ready_count, 1) right before futex_wait()
- Parent thread: spin-wait until atomic_load(&ready_count) reaches the expected value

This approach is much cleaner - no /proc dependency, simpler logic, and better performance.
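
A minimal sketch of the idea, for illustration only. NR_WAITERS, futex_word and run_waiters() are hypothetical names that do not come from the patch, and the sketch calls the futex system call directly where the selftests would use their own helper. Note that the counter records that a waiter is about to call futex_wait(), since the increment happens before the system call.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

#define NR_WAITERS 4	/* hypothetical number of waiter threads */

static atomic_int ready_count;
static _Atomic uint32_t futex_word;	/* waiters block while this is 0 */

static void *waiterfn(void *arg)
{
	(void)arg;
	/* Announce readiness right before the futex_wait call. */
	atomic_fetch_add(&ready_count, 1);
	syscall(SYS_futex, &futex_word, FUTEX_WAIT_PRIVATE, 0, NULL, NULL, 0);
	return NULL;
}

/* Starts the waiters, waits for the counter, then releases them again. */
static int run_waiters(void)
{
	pthread_t t[NR_WAITERS];
	int i, ret;

	atomic_store(&ready_count, 0);
	atomic_store(&futex_word, 0);
	for (i = 0; i < NR_WAITERS; i++) {
		ret = pthread_create(&t[i], NULL, waiterfn, NULL);
		assert(ret == 0);
	}

	/* Parent: spin until every waiter has checked in. */
	while (atomic_load(&ready_count) < NR_WAITERS)
		sched_yield();

	/* Release the waiters; one that is not asleep yet sees 1 and returns. */
	atomic_store(&futex_word, 1);
	syscall(SYS_futex, &futex_word, FUTEX_WAKE_PRIVATE, NR_WAITERS,
		NULL, NULL, 0);
	for (i = 0; i < NR_WAITERS; i++)
		pthread_join(t[i], NULL);
	return atomic_load(&ready_count);
}
```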

Best regards,
Licay
Signed-off-by: Licay <licayy@outlook.com>
Re: [PATCH 3/3] selftests/futex: fix the issue of abnormal test results caused by thread timing
Posted by Yuwen Chen 1 day, 1 hour ago
On Fri, 12 Dec 2025 12:30:20 +0800, Licay wrote:
> 1. Avoid Parsing /proc
>    The current approach uses get_thread_state() to read /proc/$pid/status, which isn't very reliable.
>    A better way would be to have waiterfn directly signal when it's ready using atomic operations.

For now, I haven't found a better way than the proc interface to check
whether a process has entered the sleep state. The kernel probably doesn't
provide a system call that reports whether a process is sleeping.

> 2. Use Atomic Counting Instead of Polling Thread State
>    Before entering futex_wait, waiterfn can atomically increment a counter. The parent thread then just waits for this counter to reach the expected value.
>    This is much simpler and avoids the overhead of checking /proc repeatedly.

Using atomic instructions does not guarantee that the process is waiting in futex_wait.

> 3. Use Standard Atomic Types
>    Replace the custom READ_ONCE/WRITE_ONCE macros with standard <stdatomic.h> types like atomic_int.
>    It's cleaner and more portable across different platforms.

Only two operations, atomic_set and atomic_read, are used here. Using
WRITE_ONCE and READ_ONCE can avoid introducing too many dependencies.
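
For reference, a minimal sketch of what such macros can look like, assuming the usual volatile-cast idiom. The definitions actually used in the patch may differ, and waiter_ready, set_ready() and is_ready() are hypothetical names used only for illustration.

```c
#include <assert.h>

/*
 * Assumed definitions based on the volatile-cast idiom. They force the
 * compiler to perform one real load or store per use, but they are not
 * memory barriers and not atomic read-modify-write operations.
 */
#define READ_ONCE(x)		(*(volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

static int waiter_ready;	/* hypothetical flag shared between threads */

/* The idiom is only meant for scalars that fit in a machine word. */
static_assert(sizeof(waiter_ready) <= sizeof(long), "flag must be word-sized");

static void set_ready(void)
{
	WRITE_ONCE(waiter_ready, 1);
}

static int is_ready(void)
{
	return READ_ONCE(waiter_ready);
}
```

A plain store followed by a plain load is all the test needs here, which is why the two macros are enough and no read-modify-write operation is required.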

> Here's the basic idea:
> - Add a global atomic_int ready_count variable
> - In waiterfn: atomic_fetch_add(&ready_count, 1) right before futex_wait()
> - Parent thread: spin-wait until atomic_load(&ready_count) reaches the expected value

Similarly, using atomic_inc before futex_wait does not guarantee that
the thread has reached the futex_wait execution point.

Thank you very much for your reply.