This test item has extremely high requirements for timing and can only
pass the test under specific conditions. The following situations will
lead to test failure:
MainThread Thread1
│
pthread_create-------------------┐
│ │
futex_cmp_requeue │
│ futex_wait
│ │
If the child thread is not waiting in the futex_wait function when the
main thread reaches the futex_cmp_requeue function, the test will fail.
This patch avoids this problem by checking whether the child thread is
in a sleeping state in the main thread.
Fixes: 7cb5dd8e2c8c ("selftests: futex: Add futex compare requeue test")
Signed-off-by: Yuwen Chen <ywen.chen@foxmail.com>
Co-developed-by: Edward Liaw <edliaw@google.com>
Signed-off-by: Edward Liaw <edliaw@google.com>
---
v1->v2:
1. Fix the issue of abnormal use of fscanf in the get_thread_state function
2. Add timeout logic
.../futex/functional/futex_requeue.c | 70 ++++++++++++++++---
1 file changed, 59 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/futex/functional/futex_requeue.c b/tools/testing/selftests/futex/functional/futex_requeue.c
index 35d4be23db5da..a9d96105134d0 100644
--- a/tools/testing/selftests/futex/functional/futex_requeue.c
+++ b/tools/testing/selftests/futex/functional/futex_requeue.c
@@ -7,21 +7,27 @@
#include <pthread.h>
#include <limits.h>
+#include <stdatomic.h>
#include "futextest.h"
#include "kselftest_harness.h"
-#define timeout_ns 30000000
-#define WAKE_WAIT_US 10000
+#define timeout_s 3 /* 3s */
+#define WAKE_WAIT_US (10000 * 100) /* 1s */
volatile futex_t *f1;
+static pthread_barrier_t barrier;
void *waiterfn(void *arg)
{
struct timespec to;
+ atomic_int *tid = (atomic_int *)arg;
- to.tv_sec = 0;
- to.tv_nsec = timeout_ns;
+ to.tv_sec = timeout_s;
+ to.tv_nsec = 0;
+
+ atomic_store(tid, gettid());
+ pthread_barrier_wait(&barrier);
if (futex_wait(f1, *f1, &to, 0))
printf("waiter failed errno %d\n", errno);
@@ -29,22 +35,52 @@ void *waiterfn(void *arg)
return NULL;
}
+static int get_thread_state(pid_t pid)
+{
+ FILE *fp;
+ char buf[80], tag[80];
+ char val = 0;
+
+ snprintf(buf, sizeof(buf), "/proc/%d/status", pid);
+ fp = fopen(buf, "r");
+ if (!fp)
+ return -1;
+
+ while (fgets(buf, sizeof(buf), fp))
+ if (sscanf(buf, "%s %c", tag, &val) == 2 && !strcmp(tag, "State:")) {
+ fclose(fp);
+ return val;
+ }
+
+ fclose(fp);
+ return -1;
+}
+
TEST(requeue_single)
{
volatile futex_t _f1 = 0;
volatile futex_t f2 = 0;
pthread_t waiter[10];
- int res;
+ atomic_int tid = 0;
+ int res, state, retry = 100;
f1 = &_f1;
+ pthread_barrier_init(&barrier, NULL, 2);
/*
* Requeue a waiter from f1 to f2, and wake f2.
*/
- if (pthread_create(&waiter[0], NULL, waiterfn, NULL))
+ if (pthread_create(&waiter[0], NULL, waiterfn, &tid))
ksft_exit_fail_msg("pthread_create failed\n");
- usleep(WAKE_WAIT_US);
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_destroy(&barrier);
+ while ((state = get_thread_state(atomic_load(&tid))) != 'S') {
+ usleep(WAKE_WAIT_US / 100);
+
+ if (state < 0 || retry-- <= 0)
+ break;
+ }
ksft_print_dbg_msg("Requeuing 1 futex from f1 to f2\n");
res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0);
@@ -69,7 +105,8 @@ TEST(requeue_multiple)
volatile futex_t _f1 = 0;
volatile futex_t f2 = 0;
pthread_t waiter[10];
- int res, i;
+ atomic_int tids[10] = {0};
+ int res, i, state, retry = 0;
f1 = &_f1;
@@ -78,11 +115,22 @@ TEST(requeue_multiple)
* At futex_wake, wake INT_MAX (should be exactly 7).
*/
for (i = 0; i < 10; i++) {
- if (pthread_create(&waiter[i], NULL, waiterfn, NULL))
+ pthread_barrier_init(&barrier, NULL, 2);
+
+ if (pthread_create(&waiter[i], NULL, waiterfn, &tids[i]))
ksft_exit_fail_msg("pthread_create failed\n");
- }
- usleep(WAKE_WAIT_US);
+ pthread_barrier_wait(&barrier);
+ pthread_barrier_destroy(&barrier);
+
+ retry += 10;
+ while ((state = get_thread_state(atomic_load(&tids[i]))) != 'S') {
+ usleep(WAKE_WAIT_US / 100);
+
+ if (state < 0 || retry-- <= 0)
+ break;
+ }
+ }
ksft_print_dbg_msg("Waking 3 futexes at f1 and requeuing 7 futexes from f1 to f2\n");
res = futex_cmp_requeue(f1, 0, &f2, 3, 7, 0);
--
2.34.1
On Mon, Jan 26 2026 at 17:33, Yuwen Chen wrote:
> This test item has extremely high requirements for timing and can only
Extremely high?
The main thread waits for 10000us aka. 10 seconds to allow the waiter
thread to reach futex_wait().
If anything is extreme then it's the 10 seconds wait, not the
requirements. Please write factual changelogs and not fairy tales.
> pass the test under specific conditions. The following situations will
> lead to test failure:
>
> MainThread Thread1
> │
> pthread_create-------------------┐
> │ │
> futex_cmp_requeue │
> │ futex_wait
> │ │
>
> If the child thread is not waiting in the futex_wait function when the
> main thread reaches the futex_cmp_requeue function, the test will
> fail.
That's a known issue for all futex selftests when the test system is
under extreme load. That's why there is a gratious 10 seconds timeout,
which is annoyingly long already.
Also why is this special for the requeue_single test case?
It's exactly the same issue for all futex selftests including the multi
waiter one in the very same file, no?
> This patch avoids this problem by checking whether the child thread is
> in a sleeping state in the main thread.
# git grep 'This patch' Documentation/process
> volatile futex_t *f1;
> +static pthread_barrier_t barrier;
>
> void *waiterfn(void *arg)
> {
> struct timespec to;
> + atomic_int *tid = (atomic_int *)arg;
https://www.kernel.org/doc/html/latest/process/maintainer-tip.html#coding-style-notes
All over the place.
> - to.tv_sec = 0;
> - to.tv_nsec = timeout_ns;
> + to.tv_sec = timeout_s;
> + to.tv_nsec = 0;
> +
> + atomic_store(tid, gettid());
Why do you need an atomic store here?
pthread_barrier_wait() is a full memory barrier already, no?
> + pthread_barrier_wait(&barrier);
>
> if (futex_wait(f1, *f1, &to, 0))
> printf("waiter failed errno %d\n", errno);
> @@ -29,22 +35,52 @@ void *waiterfn(void *arg)
> return NULL;
> }
>
> +static int get_thread_state(pid_t pid)
> +{
> + FILE *fp;
> + char buf[80], tag[80];
> + char val = 0;
> +
> + snprintf(buf, sizeof(buf), "/proc/%d/status", pid);
> + fp = fopen(buf, "r");
> + if (!fp)
> + return -1;
> +
> + while (fgets(buf, sizeof(buf), fp))
Lacks curly braces on the while...
> + if (sscanf(buf, "%s %c", tag, &val) == 2 && !strcmp(tag, "State:")) {
> + fclose(fp);
> + return val;
> + }
What's wrong with reading /proc/$PID/wchan ?
It's equally unreliable as /proc/$PID/stat because both can return the
desired state _before_ the thread reaches the inner workings of the test
related sys_futex(... WAIT).
> + fclose(fp);
> + return -1;
> +}
> +
> TEST(requeue_single)
> {
> volatile futex_t _f1 = 0;
> volatile futex_t f2 = 0;
> pthread_t waiter[10];
> - int res;
> + atomic_int tid = 0;
> + int res, state, retry = 100;
>
> f1 = &_f1;
> + pthread_barrier_init(&barrier, NULL, 2);
>
> /*
> * Requeue a waiter from f1 to f2, and wake f2.
> */
> - if (pthread_create(&waiter[0], NULL, waiterfn, NULL))
> + if (pthread_create(&waiter[0], NULL, waiterfn, &tid))
> ksft_exit_fail_msg("pthread_create failed\n");
>
> - usleep(WAKE_WAIT_US);
> + pthread_barrier_wait(&barrier);
> + pthread_barrier_destroy(&barrier);
> + while ((state = get_thread_state(atomic_load(&tid))) != 'S') {
> + usleep(WAKE_WAIT_US / 100);
> +
> + if (state < 0 || retry-- <= 0)
> + break;
> + }
That's a disgusting hack. Are you going to copy this stuff around into
_all_ futex selftests, which suffer from exactly the same problem?
Please grep for 'WAKE_WAIT_US' to see them all.
Something like the uncompiled below in a "library" C source which is
linked into every futex test case:
#define WAIT_THREAD_RETRIES 100
#define WAIT_THREAD_DELAY_US 100

static int wait_for_thread(FILE *fp)
{
        char buf[80];

        for (int i = 0; i < WAIT_THREAD_RETRIES; i++) {
                if (!fgets(buf, sizeof(buf), fp))
                        return -EIO;
                if (!strncmp(buf, "futex", 5))
                        return 0;
                usleep(WAIT_THREAD_DELAY_US);
                rewind(fp);
        }
        return -ETIMEDOUT;
}

int futex_wait_for_thread(pid_t tid)
{
        char fname[80];
        FILE *fp;
        int res;

        snprintf(fname, sizeof(fname), "/proc/%d/wchan", tid);
        fp = fopen(fname, "r");
        if (!fp)
                return -EIO;

        res = wait_for_thread(fp);
        fclose(fp);
        return res;
}
No?
While at it create a helper mechanism which avoids copying the whole
pthread_create()/barrier() muck around to every single test case:
struct thread_data {
        pthread_t thread;
        pthread_barrier_t barrier;
        pid_t tid;
        void (*threadfn)(void *);
        void *arg;
};

static void *futex_thread_fn(void *arg)
{
        struct thread_data *td = arg;

        td->tid = gettid();
        pthread_barrier_wait(&td->barrier);
        td->threadfn(td->arg);
        return NULL;
}

int futex_thread_create(struct thread_data *td, void (*threadfn)(void*), void *arg)
{
        int ret;

        pthread_barrier_init(&td->barrier, NULL, 2);
        td->tid = 0;
        td->threadfn = threadfn;
        td->arg = arg;

        ret = pthread_create(&td->thread, NULL, futex_thread_fn, td);
        if (ret)
                return ret;

        pthread_barrier_wait(&td->barrier);
        return futex_wait_for_thread(td->tid);
}
or something like that. That will at least fix all the futex muck and
w/o looking I'm sure that's something which other selftests might find
useful too.
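For illustration, an uncompiled sketch of how requeue_single could consume
such helpers, reusing f1/timeout_s/futex_wait() from the patch above and the
hypothetical struct thread_data/futex_thread_create() names from the sketch;
the adapted waiter function is made up for the example:

static void requeue_waiterfn(void *arg)
{
        struct timespec to = { .tv_sec = timeout_s, .tv_nsec = 0 };

        /* Block on f1 until requeued to f2 and woken there */
        if (futex_wait(f1, *f1, &to, 0))
                printf("waiter failed errno %d\n", errno);
}

TEST(requeue_single)
{
        volatile futex_t _f1 = 0;
        volatile futex_t f2 = 0;
        struct thread_data td;
        int res;

        f1 = &_f1;

        /*
         * futex_thread_create() only returns once the waiter's tid is known
         * and /proc/$tid/wchan reports a futex wait, so no usleep() is
         * needed before the requeue.
         */
        if (futex_thread_create(&td, requeue_waiterfn, NULL))
                ksft_exit_fail_msg("futex_thread_create failed\n");

        ksft_print_dbg_msg("Requeuing 1 futex from f1 to f2\n");
        res = futex_cmp_requeue(f1, 0, &f2, 0, 1, 0);
        /* remainder of the existing test (checks, wake of f2, join) unchanged */
}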
The upside of such a change is that the futex selftests runtime will be
significantly lower because the hideous 10 seconds wait can be avoided,
which is an actual improvement and not a made up extreme requirement...
Thanks,
tglx
On Tue, 27 Jan 2026 19:30:31 +0100, Thomas Gleixner wrote:
> Extremely high?
>
> The main thread waits for 10000us aka. 10 seconds to allow the waiter
> thread to reach futex_wait().
>
> If anything is extreme then it's the 10 seconds wait, not the
> requirements. Please write factual changelogs and not fairy tales.
10,000 us is equal to 10 ms. On a specific ARM64 platform, it's quite
common for this test case to fail when there is a 10-millisecond waiting
time.
> That's a known issue for all futex selftests when the test system is
> under extreme load. That's why there is a gratious 10 seconds timeout,
> which is annoyingly long already.
>
> Also why is this special for the requeue_single test case?
>
> It's exactly the same issue for all futex selftests including the multi
> waiter one in the very same file, no?
Yes, this is a common problem. However, for ease of illustration, only
the requeue_single case is shown here.
> Why do you need an atomic store here?
>
> pthread_barrier_wait() is a full memory barrier already, no?
Yes, there's no need to use atomics here. However, in the kernel WRITE_ONCE()
and READ_ONCE() would normally be used; since they are hard to use here,
atomics were adopted instead.
> What's wrong with reading /proc/$PID/wchan ?
>
> It's equally unreliable as /proc/$PID/stat because both can return the
> desired state _before_ the thread reaches the inner workings of the test
> related sys_futex(... WAIT).
Is it possible for the waiterfn to enter the sleep state between the
pthread_barrier_wait function and the futex_wait function? If so, would
checking the call stack be a solution?
Maybe using /proc/$PID/wchan is a better approach. Currently, I haven't found
any problems when using /proc/$PID/stat on our platform.
Thanks
Yuwen
On Wed, Jan 28 2026 at 11:29, Yuwen Chen wrote:
> On Tue, 27 Jan 2026 19:30:31 +0100, Thomas Gleixner wrote:
>> Extremely high?
>>
>> The main thread waits for 10000us aka. 10 seconds to allow the waiter
>> thread to reach futex_wait().
>>
>> If anything is extreme then it's the 10 seconds wait, not the
>> requirements. Please write factual changelogs and not fairy tales.
>
> 10,000 us is equal to 10 ms. On a specific ARM64 platform, it's quite
> common for this test case to fail when there is a 10-millisecond waiting
> time.
Sorry. Somehow my tired brain converted micro seconds to milliseconds.
But looking at it with brain awake again. Your change does not address
the underlying problem at all. It just papers over it to the extent that
it can't be observed anymore. Assume the following situation:
CPU0                                    CPU1
pthread_create()
                                        ... run new thread
                                        --> preemption
for (i = 0; i < 100; i++) {
        if (waiting_on_futex)
                break;
        usleep(100);
}
-> fail
As this still sleeps only 10ms in total this just works by chance for
you, but there is no guarantee that it actually works under a wide range
of scenarios. So this needs to increase the total wait time to let's say
1 second, which is fine as the wait check will terminate the loop once
the other thread reached the wait condition.
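For concreteness, an illustrative version of that loop with the total budget
stretched to roughly one second (waiting_on_futex and the 100us poll interval
are taken from the example above; the constants are only an example):

        /*
         * Poll every 100us, but allow up to ~1s in total (10000 iterations)
         * before giving up. The loop still exits as soon as the waiter is
         * observed blocking on the futex, so the common case stays fast and
         * the large budget only matters under heavy load.
         */
        for (int i = 0; i < 10000; i++) {
                if (waiting_on_futex)
                        break;
                usleep(100);
        }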
>> That's a known issue for all futex selftests when the test system is
>> under extreme load. That's why there is a gratious 10 seconds timeout,
>> which is annoyingly long already.
>>
>> Also why is this special for the requeue_single test case?
>>
>> It's exactly the same issue for all futex selftests including the multi
>> waiter one in the very same file, no?
>
> Yes, this is a common problem. However, for ease of illustration, only
> the requeue_single case is shown here.
Sure, but why are you then implementing it per case instead of making it
a general usable facility and fix up _all_ problematic cases which rely
on the sleep in one go?
>> Why do you need an atomic store here?
>>
>> pthread_barrier_wait() is a full memory barrier already, no?
>
> Yes, there's no need to use atomics here. However, in the kernel WRITE_ONCE()
> and READ_ONCE() would normally be used; since they are hard to use here,
> atomics were adopted instead.
You don't need READ/WRITE_ONCE() at all as there is no concurrency. The
waiter thread writes before invoking pthread_barrier_wait() so the
control thread _cannot_ read concurrently. Ergo there is no need for any
of this voodoo.
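A minimal sketch of that ordering with a plain variable instead of atomic_int
(the names and helper are illustrative, not part of the patch):

#define _GNU_SOURCE
#include <pthread.h>
#include <unistd.h>
#include <sys/types.h>

static pid_t waiter_tid;                /* plain variable, no atomics needed */
static pthread_barrier_t barrier;

static void *waiterfn(void *arg)
{
        waiter_tid = gettid();          /* store happens before the barrier */
        pthread_barrier_wait(&barrier); /* full memory barrier + sync point */
        /* ... go on to block in futex_wait() ... */
        return NULL;
}

static pid_t create_waiter(pthread_t *thread)
{
        pthread_barrier_init(&barrier, NULL, 2);
        pthread_create(thread, NULL, waiterfn, NULL);
        pthread_barrier_wait(&barrier);
        /*
         * The control thread cannot pass the barrier before the waiter has
         * stored its tid, so this plain read never races with the store.
         */
        return waiter_tid;
}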
>> What's wrong with reading /proc/$PID/wchan ?
>>
>> It's equally unreliable as /proc/$PID/stat because both can return the
>> desired state _before_ the thread reaches the inner workings of the test
>> related sys_futex(... WAIT).
>
> Is it possible for the waiterfn to enter the sleep state between the
> pthread_barrier_wait function and the futex_wait function?
No, but it can reach sleep state _before_ even reaching the thread
function. pthread_barrier_wait() itself can result in a futex_wait() too
if the control thread did not reach pthread_barrier_wait() before, but
that's harmless because then the control thread will wake the waiter
thread _before_ checking the state of the waiter.
> If so, would checking the call stack be a solution?
To make it even more complex and convoluted?
Thanks,
tglx