From nobody Thu Oct 9 10:42:21 2025
From: André Almeida
Date: Tue, 17 Jun 2025 15:34:18 -0300
Subject: [PATCH RESEND v4 1/7] selftests/futex: Add ASSERT_ macros
X-Mailing-List: linux-kernel@vger.kernel.org
Message-Id: <20250617-tonyk-robust_futex-v4-1-6586f5fb9d33@igalia.com>
References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>
In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>
To: Thomas Gleixner, Ingo Molnar, Peter Zijlstra, Darren Hart, Davidlohr Bueso, Shuah Khan, Arnd Bergmann, Sebastian Andrzej Siewior, Waiman Long
Cc: linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org, kernel-dev@igalia.com, André Almeida
X-Mailer: b4 0.14.2

Create ASSERT_{EQ, NE, TRUE, FALSE} macros to make test creation easier.

Signed-off-by: André Almeida
---
 tools/testing/selftests/futex/include/logging.h | 38 +++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/tools/testing/selftests/futex/include/logging.h b/tools/testing/selftests/futex/include/logging.h
index 874c69ce5cce9efa3a9d6de246f5972a75437dbf..a19755622a877932884570c8f58aaee7371d5f8f 100644
--- a/tools/testing/selftests/futex/include/logging.h
+++ b/tools/testing/selftests/futex/include/logging.h
@@ -23,6 +23,44 @@
 #include <linux/futex.h>
 #include "kselftest.h"
 
+#define ASSERT_EQ(var, value)						\
+do {									\
+	if ((var) != (value)) {						\
+		ksft_test_result_fail("%s: expected %ld, but %s has %ld\n", \
+				      __func__, (long) (value), #var,	\
+				      (long) (var));			\
+		return;							\
+	}								\
+} while (0)
+
+#define ASSERT_NE(var, value)						\
+do {									\
+	if ((var) == (value)) {						\
+		ksft_test_result_fail("%s: expected not %ld, but %s has %ld\n", \
+				      __func__, (long) (value), #var,	\
+				      (long) (var));			\
+		return;							\
+	}								\
+} while (0)
+
+#define ASSERT_TRUE(var)						\
+do {									\
+	if ((var) == 0) {						\
+		ksft_test_result_fail("%s: expected %s to be true\n",	\
+				      __func__, #var);			\
+		return;							\
+	}								\
+} while (0)
+
+#define ASSERT_FALSE(var)						\
+do {									\
+	if (var) {							\
+		ksft_test_result_fail("%s: expected %s to be false\n",	\
+				      __func__, #var);			\
+		return;							\
+	}								\
+} while (0)
+
 /*
  * Define PASS, ERROR, and FAIL strings with and without color escape
  * sequences, default to no color.
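
The early-return behaviour of these macros can be sketched outside the kselftest tree. The block below is only an illustration under stated assumptions: ksft_test_result_fail() is replaced by a local stand-in that prints and counts failures, and fail_count, steps_done, the two test functions and run_demo() are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for kselftest's ksft_test_result_fail(): it only
 * prints the message and counts failures, so the sketch builds on its own.
 */
static int fail_count;

static void ksft_test_result_fail(const char *fmt, ...)
{
	va_list ap;

	fail_count++;
	va_start(ap, fmt);
	vprintf(fmt, ap);
	va_end(ap);
}

/* Same shape as the macro in the patch, with the arguments parenthesized. */
#define ASSERT_EQ(var, value)						\
do {									\
	if ((var) != (value)) {						\
		ksft_test_result_fail("%s: expected %ld, but %s has %ld\n", \
				      __func__, (long) (value), #var,	\
				      (long) (var));			\
		return;							\
	}								\
} while (0)

/* Counts how many statements around the assertions were executed. */
static int steps_done;

/* ret differs from 0, so ASSERT_EQ() reports a failure and returns early. */
static void test_fails_early(void)
{
	int ret = -1;

	steps_done++;
	ASSERT_EQ(ret, 0);
	steps_done++;		/* never reached */
}

/* ret matches the expected value, so the function runs to completion. */
static void test_runs_to_end(void)
{
	int ret = 0;

	steps_done++;
	ASSERT_EQ(ret, 0);
	steps_done++;
}

/* Runs both illustrative tests and checks the bookkeeping. */
void run_demo(void)
{
	test_fails_early();
	test_runs_to_end();
	assert(fail_count == 1);
	assert(steps_done == 3);
}
```

Because the macros expand to a bare `return;`, they fit only in functions returning void, which is how the test functions in this series are declared.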
-- 
2.49.0

From nobody Thu Oct 9 10:42:21 2025
From: André Almeida
Date: Tue, 17 Jun 2025 15:34:19 -0300
Subject: [PATCH RESEND v4 2/7] selftests/futex: Create test for robust list
Message-Id: <20250617-tonyk-robust_futex-v4-2-6586f5fb9d33@igalia.com>
References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>
In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>

Create a test for the robust list mechanism.

Test the following uAPI operations:

- Creating a robust mutex where the lock waiter is woken by the kernel
  when the lock owner dies
- Setting a robust list for the current task
- Getting a robust list from the current task
- Getting a robust list from another task
- Using the list_op_pending field from the robust_list_head struct to test
  robustness when the lock owner dies before completing the locking
- Setting an invalid size for the syscall argument `len`
- Adding multiple elements to a robust list and waiting for each of them
- Creating a circular list and checking that the kernel does not get
  stuck in an infinite loop

This is the expected output:

TAP version 13
1..7
ok 1 test_robustness
ok 2 test_set_robust_list_invalid_size
ok 3 test_get_robust_list_self
ok 4 test_get_robust_list_child
ok 5 test_set_list_op_pending
ok 6 test_robust_list_multiple_elements
ok 7 test_circular_list
# Totals: pass:7 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: André Almeida
---
 .../testing/selftests/futex/functional/.gitignore |   1 +
 tools/testing/selftests/futex/functional/Makefile |   3 +-
 .../selftests/futex/functional/robust_list.c      | 554 +++++++++++++++++++++
 3 files changed, 557 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/futex/functional/.gitignore b/tools/testing/selftests/futex/functional/.gitignore
index 7b24ae89594a9db211d4b8469ebcef8d1f7012d8..7f447ebfbc62bbad9add0dc86a75abcdb8a4d9a7 100644
--- a/tools/testing/selftests/futex/functional/.gitignore
+++ b/tools/testing/selftests/futex/functional/.gitignore
@@ -11,3 +11,4 @@ futex_wait_timeout
 futex_wait_uninitialized_heap
 futex_wait_wouldblock
 futex_waitv
+robust_list
diff --git a/tools/testing/selftests/futex/functional/Makefile b/tools/testing/selftests/futex/functional/Makefile
index 8cfb87f7f7c5059c82f1e6290c076d3f13f5ea41..e6fa66e622dee4de74c31c8b9b486ca01de35737 100644
--- a/tools/testing/selftests/futex/functional/Makefile
+++ b/tools/testing/selftests/futex/functional/Makefile
@@ -20,7 +20,8 @@ TEST_GEN_PROGS := \
 	futex_priv_hash \
 	futex_numa_mpol \
 	futex_waitv \
-	futex_numa
+	futex_numa \
+	robust_list
 
 TEST_PROGS := run.sh
 
diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools/testing/selftests/futex/functional/robust_list.c
new file mode 100644
index 0000000000000000000000000000000000000000..42690b2440fd29a9b12c46f67f9645ccc93d1147
--- /dev/null
+++ b/tools/testing/selftests/futex/functional/robust_list.c
@@ -0,0 +1,554 @@
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2024 Igalia S.L.
+ *
+ * Robust list test by André Almeida
+ *
+ * The robust list uAPI allows userspace to create "robust" locks, in the sense
+ * that if the lock holder thread dies, the remaining threads that are waiting
+ * for the lock won't block forever, waiting for a lock that will never be
+ * released.
+ *
+ * This is achieved by userspace setting a list where a thread can enter all the
+ * locks (futexes) that it is holding. The robust list is a linked list, and
+ * userspace registers the start of the list with the syscall set_robust_list().
+ * If such thread eventually dies, the kernel will walk this list, waking up one
+ * thread waiting for each futex and marking the futex word with the flag
+ * FUTEX_OWNER_DIED.
+ *
+ * See also
+ *	man set_robust_list
+ *	Documentation/locking/robust-futex-ABI.rst
+ *	Documentation/locking/robust-futexes.rst
+ */
+
+#define _GNU_SOURCE
+
+#include "futextest.h"
+#include "logging.h"
+
+#include <errno.h>
+#include <pthread.h>
+#include <signal.h>
+#include <stdatomic.h>
+#include <stdbool.h>
+#include <stddef.h>
+#include <sys/mman.h>
+#include <sys/wait.h>
+
+#define STACK_SIZE (1024 * 1024)
+
+#define FUTEX_TIMEOUT 3
+
+static pthread_barrier_t barrier, barrier2;
+
+int set_robust_list(struct robust_list_head *head, size_t len)
+{
+	return syscall(SYS_set_robust_list, head, len);
+}
+
+int get_robust_list(int pid, struct robust_list_head **head, size_t *len_ptr)
+{
+	return syscall(SYS_get_robust_list, pid, head, len_ptr);
+}
+
+/*
+ * Basic lock struct, contains just the futex word and the robust list element
+ * Real implementations have also a *prev to easily walk in the list
+ */
+struct lock_struct {
+	_Atomic(unsigned int) futex;
+	struct robust_list list;
+};
+
+/*
+ * Helper function to spawn a child thread. Returns -1 on error, pid on success
+ */
+static int create_child(int (*fn)(void *arg), void *arg)
+{
+	char *stack;
+	pid_t pid;
+
+	stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
+		     MAP_PRIVATE | MAP_ANONYMOUS | MAP_STACK, -1, 0);
+	if (stack == MAP_FAILED)
+		return -1;
+
+	stack += STACK_SIZE;
+
+	pid = clone(fn, stack, CLONE_VM | SIGCHLD, arg);
+
+	if (pid == -1)
+		return -1;
+
+	return pid;
+}
+
+/*
+ * Helper function to prepare and register a robust list
+ */
+static int set_list(struct robust_list_head *head)
+{
+	int ret;
+
+	ret = set_robust_list(head, sizeof(struct robust_list_head));
+	if (ret)
+		return ret;
+
+	head->futex_offset = (size_t) offsetof(struct lock_struct, futex) -
+			     (size_t) offsetof(struct lock_struct, list);
+	head->list.next = &head->list;
+	head->list_op_pending = NULL;
+
+	return 0;
+}
+
+/*
+ * A basic (and incomplete) mutex lock function with robustness
+ */
+static int mutex_lock(struct lock_struct *lock, struct robust_list_head *head, bool error_inject)
+{
+	_Atomic(unsigned int) *futex = &lock->futex;
+	unsigned int zero = 0;
+	int ret = -1;
+	pid_t tid = gettid();
+
+	/*
+	 * Set list_op_pending before starting the lock, so the kernel can catch
+	 * the case where the thread died during the lock operation
+	 */
+	head->list_op_pending = &lock->list;
+
+	if (atomic_compare_exchange_strong(futex, &zero, tid)) {
+		/*
+		 * We took the lock, insert it in the robust list
+		 */
+		struct robust_list *list = &head->list;
+
+		/* Error injection to test list_op_pending */
+		if (error_inject)
+			return 0;
+
+		while (list->next != &head->list)
+			list = list->next;
+
+		list->next = &lock->list;
+		lock->list.next = &head->list;
+
+		ret = 0;
+	} else {
+		/*
+		 * We didn't take the lock, wait until the owner wakes (or dies)
+		 */
+		struct timespec to;
+
+		to.tv_sec = FUTEX_TIMEOUT;
+		to.tv_nsec = 0;
+
+		tid = atomic_load(futex);
+		/* Kernel ignores futexes without the waiters flag */
+		tid |= FUTEX_WAITERS;
+		atomic_store(futex, tid);
+
+		ret = futex_wait((futex_t *) futex, tid, &to, 0);
+
+		/*
+		 * A real mutex_lock() implementation would loop here to finally
+		 * take the lock. We don't care about that, so we stop here.
+		 */
+	}
+
+	head->list_op_pending = NULL;
+
+	return ret;
+}
+
+/*
+ * This child thread will succeed taking the lock, and then will exit holding it
+ */
+static int child_fn_lock(void *arg)
+{
+	struct lock_struct *lock = (struct lock_struct *) arg;
+	struct robust_list_head head;
+	int ret;
+
+	ret = set_list(&head);
+	if (ret)
+		ksft_test_result_fail("set_robust_list error\n");
+
+	ret = mutex_lock(lock, &head, false);
+	if (ret)
+		ksft_test_result_fail("mutex_lock error\n");
+
+	pthread_barrier_wait(&barrier);
+
+	/*
+	 * There's a race here: the parent thread needs to be inside
+	 * futex_wait() before the child thread dies, otherwise it will miss the
+	 * wakeup from handle_futex_death() that this child will emit. We wait a
+	 * little bit just to make sure that this happens.
+	 */
+	sleep(1);
+
+	return 0;
+}
+
+/*
+ * Spawns a child thread that will set a robust list, take the lock, register it
+ * in the robust list and die. The parent thread will wait on this futex, and
+ * should be woken up when the child exits.
+ */
+static void test_robustness(void)
+{
+	struct lock_struct lock = { .futex = 0 };
+	struct robust_list_head head;
+	_Atomic(unsigned int) *futex = &lock.futex;
+	int ret;
+
+	ret = set_list(&head);
+	ASSERT_EQ(ret, 0);
+
+	/*
+	 * Let's use a barrier to ensure that the child thread takes the lock
+	 * before the parent
+	 */
+	ret = pthread_barrier_init(&barrier, NULL, 2);
+	ASSERT_EQ(ret, 0);
+
+	ret = create_child(&child_fn_lock, &lock);
+	ASSERT_NE(ret, -1);
+
+	pthread_barrier_wait(&barrier);
+	ret = mutex_lock(&lock, &head, false);
+
+	/*
+	 * futex_wait() should return 0 and the futex word should be marked with
+	 * FUTEX_OWNER_DIED
+	 */
+	ASSERT_EQ(ret, 0);
+	if (ret != 0)
+		printf("futex wait returned %d", errno);
+
+	ASSERT_TRUE(*futex & FUTEX_OWNER_DIED);
+
+	wait(NULL);
+	pthread_barrier_destroy(&barrier);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * The only valid value for len is sizeof(*head)
+ */
+static void test_set_robust_list_invalid_size(void)
+{
+	struct robust_list_head head;
+	size_t head_size = sizeof(struct robust_list_head);
+	int ret;
+
+	ret = set_robust_list(&head, head_size);
+	ASSERT_EQ(ret, 0);
+
+	ret = set_robust_list(&head, head_size * 2);
+	ASSERT_EQ(ret, -1);
+	ASSERT_EQ(errno, EINVAL);
+
+	ret = set_robust_list(&head, head_size - 1);
+	ASSERT_EQ(ret, -1);
+	ASSERT_EQ(errno, EINVAL);
+
+	ret = set_robust_list(&head, 0);
+	ASSERT_EQ(ret, -1);
+	ASSERT_EQ(errno, EINVAL);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+/*
+ * Test get_robust_list with pid = 0, getting the list of the running thread
+ */
+static void test_get_robust_list_self(void)
+{
+	struct robust_list_head head, head2, *get_head;
+	size_t head_size = sizeof(struct robust_list_head), len_ptr;
+	int ret;
+
+	ret = set_robust_list(&head, head_size);
+	ASSERT_EQ(ret, 0);
+
+	ret = get_robust_list(0, &get_head, &len_ptr);
+	ASSERT_EQ(ret, 0);
+	ASSERT_EQ(get_head, &head);
+	ASSERT_EQ(head_size, len_ptr);
+
+	ret = set_robust_list(&head2, head_size);
+	ASSERT_EQ(ret, 0);
+
+	ret = get_robust_list(0, &get_head, &len_ptr);
+	ASSERT_EQ(ret, 0);
+	ASSERT_EQ(get_head, &head2);
+	ASSERT_EQ(head_size, len_ptr);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_list(void *arg)
+{
+	struct robust_list_head *head = (struct robust_list_head *) arg;
+	int ret;
+
+	ret = set_robust_list(head, sizeof(struct robust_list_head));
+	if (ret)
+		ksft_test_result_fail("set_robust_list error\n");
+
+	pthread_barrier_wait(&barrier);
+	pthread_barrier_wait(&barrier2);
+
+	return 0;
+}
+
+/*
+ * Test get_robust_list from another thread. We use two barriers here to ensure
+ * that:
+ *   1) the child thread sets the list before we try to get it from the
+ * parent
+ *   2) the child thread is still alive when we try to get the list from it
+ */
+static void test_get_robust_list_child(void)
+{
+	pid_t tid;
+	int ret;
+	struct robust_list_head head, *get_head;
+	size_t len_ptr;
+
+	ret = pthread_barrier_init(&barrier, NULL, 2);
+	ret = pthread_barrier_init(&barrier2, NULL, 2);
+	ASSERT_EQ(ret, 0);
+
+	tid = create_child(&child_list, &head);
+	ASSERT_NE(tid, -1);
+
+	pthread_barrier_wait(&barrier);
+
+	ret = get_robust_list(tid, &get_head, &len_ptr);
+	ASSERT_EQ(ret, 0);
+	ASSERT_EQ(&head, get_head);
+
+	pthread_barrier_wait(&barrier2);
+
+	wait(NULL);
+	pthread_barrier_destroy(&barrier);
+	pthread_barrier_destroy(&barrier2);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_fn_lock_with_error(void *arg)
+{
+	struct lock_struct *lock = (struct lock_struct *) arg;
+	struct robust_list_head head;
+	int ret;
+
+	ret = set_list(&head);
+	if (ret)
+		ksft_test_result_fail("set_robust_list error\n");
+
+	ret = mutex_lock(lock, &head, true);
+	if (ret)
+		ksft_test_result_fail("mutex_lock error\n");
+
+	pthread_barrier_wait(&barrier);
+
+	sleep(1);
+
+	return 0;
+}
+
+/*
+ * Same as robustness test, but inject an error where the mutex_lock() exits
+ * earlier, just after setting list_op_pending and taking the lock, to test the
+ * list_op_pending mechanism
+ */
+static void test_set_list_op_pending(void)
+{
+	struct lock_struct lock = { .futex = 0 };
+	struct robust_list_head head;
+	_Atomic(unsigned int) *futex = &lock.futex;
+	int ret;
+
+	ret = set_list(&head);
+	ASSERT_EQ(ret, 0);
+
+	ret = pthread_barrier_init(&barrier, NULL, 2);
+	ASSERT_EQ(ret, 0);
+
+	ret = create_child(&child_fn_lock_with_error, &lock);
+	ASSERT_NE(ret, -1);
+
+	pthread_barrier_wait(&barrier);
+	ret = mutex_lock(&lock, &head, false);
+
+	ASSERT_EQ(ret, 0);
+	if (ret != 0)
+		printf("futex wait returned %d", errno);
+
+	ASSERT_TRUE(*futex & FUTEX_OWNER_DIED);
+
+	wait(NULL);
+	pthread_barrier_destroy(&barrier);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+#define CHILD_NR 10
+
+static int child_lock_holder(void *arg)
+{
+	struct lock_struct *locks = (struct lock_struct *) arg;
+	struct robust_list_head head;
+	int i;
+
+	set_list(&head);
+
+	for (i = 0; i < CHILD_NR; i++) {
+		locks[i].futex = 0;
+		mutex_lock(&locks[i], &head, false);
+	}
+
+	pthread_barrier_wait(&barrier);
+	pthread_barrier_wait(&barrier2);
+
+	sleep(1);
+	return 0;
+}
+
+static int child_wait_lock(void *arg)
+{
+	struct lock_struct *lock = (struct lock_struct *) arg;
+	struct robust_list_head head;
+	int ret;
+
+	pthread_barrier_wait(&barrier2);
+	ret = mutex_lock(lock, &head, false);
+
+	if (ret)
+		ksft_test_result_fail("mutex_lock error\n");
+
+	if (!(lock->futex & FUTEX_OWNER_DIED))
+		ksft_test_result_fail("futex not marked with FUTEX_OWNER_DIED\n");
+
+	return 0;
+}
+
+/*
+ * Test a robust list of more than one element. All the waiters should wake when
+ * the holder dies
+ */
+static void test_robust_list_multiple_elements(void)
+{
+	struct lock_struct locks[CHILD_NR];
+	int i, ret;
+
+	ret = pthread_barrier_init(&barrier, NULL, 2);
+	ASSERT_EQ(ret, 0);
+	ret = pthread_barrier_init(&barrier2, NULL, CHILD_NR + 1);
+	ASSERT_EQ(ret, 0);
+
+	create_child(&child_lock_holder, &locks);
+
+	/* Wait until the locker thread takes the lock */
+	pthread_barrier_wait(&barrier);
+
+	for (i = 0; i < CHILD_NR; i++)
+		create_child(&child_wait_lock, &locks[i]);
+
+	/* Wait for all children to return */
+	while (wait(NULL) > 0);
+
+	pthread_barrier_destroy(&barrier);
+	pthread_barrier_destroy(&barrier2);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+static int child_circular_list(void *arg)
+{
+	static struct robust_list_head head;
+	struct lock_struct a, b, c;
+	int ret;
+
+	ret = set_list(&head);
+	if (ret)
+		ksft_test_result_fail("set_list error\n");
+
+	head.list.next = &a.list;
+
+	/*
+	 * The last element should point to head list, but we short circuit it
+	 */
+	a.list.next = &b.list;
+	b.list.next = &c.list;
+	c.list.next = &a.list;
+
+	return 0;
+}
+
+/*
+ * Create a circular robust list. The kernel should be able to destroy the list
+ * while processing it so it won't be trapped in an infinite loop while handling
+ * a process exit
+ */
+static void test_circular_list(void)
+{
+	create_child(child_circular_list, NULL);
+
+	wait(NULL);
+
+	ksft_test_result_pass("%s\n", __func__);
+}
+
+void usage(char *prog)
+{
+	printf("Usage: %s\n", prog);
+	printf(" -c Use color\n");
+	printf(" -h Display this help message\n");
+	printf(" -v L Verbosity level: %d=QUIET %d=CRITICAL %d=INFO\n",
+	       VQUIET, VCRITICAL, VINFO);
+}
+
+int main(int argc, char *argv[])
+{
+	int c;
+
+	while ((c = getopt(argc, argv, "cht:v:")) != -1) {
+		switch (c) {
+		case 'c':
+			log_color(1);
+			break;
+		case 'h':
+			usage(basename(argv[0]));
+			exit(0);
+		case 'v':
+			log_verbosity(atoi(optarg));
+			break;
+		default:
+			usage(basename(argv[0]));
+			exit(1);
+		}
+	}
+
+	ksft_print_header();
+	ksft_set_plan(7);
+
+	test_robustness();
+
+	test_set_robust_list_invalid_size();
+	test_get_robust_list_self();
+	test_get_robust_list_child();
+	test_set_list_op_pending();
+	test_robust_list_multiple_elements();
+	test_circular_list();
+
+	ksft_print_cnts();
+	return 0;
+}
-- 
2.49.0

From nobody Thu Oct 9 10:42:21 2025
From: André Almeida
Date: Tue, 17 Jun 2025 15:34:20 -0300
Subject: [PATCH RESEND v4 3/7] futex: Use explicit sizes for compat_exit_robust_list
Message-Id: <20250617-tonyk-robust_futex-v4-3-6586f5fb9d33@igalia.com>
References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>
In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com>

There are two functions for handling robust lists during task exit:
exit_robust_list() and compat_exit_robust_list(). The first one handles
either 64bit or 32bit lists, depending on whether the kernel is 64bit or
32bit. compat_exit_robust_list() only exists in 64bit kernels that
support 32bit syscalls, and handles 32bit lists.

For the new syscall set_robust_list2(), 64bit kernels need to be able to
handle 32bit lists whether or not they support 32bit syscalls, so make
compat_exit_robust_list() exist regardless of the compat_ config.

Also, use explicit sizes; otherwise, in a 32bit kernel both
exit_robust_list() and compat_exit_robust_list() would be exactly the
same function, with neither of them dealing with 64bit robust lists.

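
The effect of explicit sizing can be sketched in plain userspace C. This is only an illustration, not code from the series: list32 and list_head32 are hypothetical mirrors of the robust_list32 and robust_list_head32 structs this patch adds, written with <stdint.h> types instead of the kernel's u32/s32, and native_list_head, entry32_addr(), entry32_is_pi() and check_layouts() are made-up names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Native layout: pointer and long widths follow the build target. */
struct native_list {
	struct native_list *next;
};

struct native_list_head {
	struct native_list list;
	long futex_offset;
	struct native_list *list_op_pending;
};

/* Explicitly sized layout: identical on 32-bit and 64-bit builds. */
struct list32 {
	uint32_t next;
};

struct list_head32 {
	struct list32 list;
	int32_t futex_offset;
	uint32_t list_op_pending;
};

_Static_assert(sizeof(struct list_head32) == 12,
	       "explicit 32-bit head is 12 bytes on every build");
_Static_assert(offsetof(struct list_head32, futex_offset) == 4,
	       "futex_offset follows the 4-byte next field");
_Static_assert(offsetof(struct list_head32, list_op_pending) == 8,
	       "list_op_pending follows the 4-byte offset field");

/* Split a raw 32-bit entry: bit 0 flags a PI futex, the rest is the address. */
static inline uint32_t entry32_addr(uint32_t uentry)
{
	return uentry & ~UINT32_C(1);
}

static inline unsigned int entry32_is_pi(uint32_t uentry)
{
	return uentry & 1;
}

/* Runtime checks on the decoding helpers and on the relative struct sizes. */
void check_layouts(void)
{
	assert(entry32_addr(0x1001) == 0x1000);
	assert(entry32_is_pi(0x1001) == 1);
	assert(entry32_is_pi(0x1000) == 0);
	/* The native head is never smaller than the explicit 32-bit one. */
	assert(sizeof(struct native_list_head) >= sizeof(struct list_head32));
}
```

On a typical LP64 build the native head is 24 bytes, so a walker compiled for it cannot read a 12-byte list registered by a 32-bit task; the fixed-width struct keeps one layout for both kernels.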
Signed-off-by: André Almeida
---
 include/linux/compat.h  | 12 +-----------
 include/linux/futex.h   | 11 +++++++++++
 include/linux/sched.h   |  2 +-
 kernel/futex/core.c     | 44 ++++++++++++++++++++++++++++----------------
 kernel/futex/syscalls.c |  4 ++--
 5 files changed, 43 insertions(+), 30 deletions(-)

diff --git a/include/linux/compat.h b/include/linux/compat.h
index 56cebaff0c910fda853a0e2b3d6d0517e55f8b38..968a9135ff486cf9c8be2a18b80cd4c46e890236 100644
--- a/include/linux/compat.h
+++ b/include/linux/compat.h
@@ -385,16 +385,6 @@ struct compat_ifconf {
 	compat_caddr_t ifcbuf;
 };
 
-struct compat_robust_list {
-	compat_uptr_t next;
-};
-
-struct compat_robust_list_head {
-	struct compat_robust_list list;
-	compat_long_t futex_offset;
-	compat_uptr_t list_op_pending;
-};
-
 #ifdef CONFIG_COMPAT_OLD_SIGACTION
 struct compat_old_sigaction {
 	compat_uptr_t sa_handler;
@@ -672,7 +662,7 @@ asmlinkage long compat_sys_waitid(int, compat_pid_t,
 		struct compat_siginfo __user *, int,
 		struct compat_rusage __user *);
 asmlinkage long
-compat_sys_set_robust_list(struct compat_robust_list_head __user *head,
+compat_sys_set_robust_list(struct robust_list_head32 __user *head,
 			   compat_size_t len);
 asmlinkage long
 compat_sys_get_robust_list(int pid, compat_uptr_t __user *head_ptr,
diff --git a/include/linux/futex.h b/include/linux/futex.h
index 168ffd5996b4808491c05bdc7c8d0aeca1d37ee5..cd7c5d12c846566c56f3f3ea74b95e437a6e8193 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -56,6 +56,17 @@ union futex_key {
 #define FUTEX_KEY_INIT (union futex_key) { .both = { .ptr = 0ULL } }
 
 #ifdef CONFIG_FUTEX
+
+struct robust_list32 {
+	u32 next;
+};
+
+struct robust_list_head32 {
+	struct robust_list32 list;
+	s32 futex_offset;
+	u32 list_op_pending;
+};
+
 enum {
 	FUTEX_STATE_OK,
 	FUTEX_STATE_EXITING,
diff --git a/include/linux/sched.h b/include/linux/sched.h
index 45e5953b8f326c2ff5e19de469d6cba27cc4c17d..51e5d05a9fcd407dcd53b7b7cb8c59783660a826 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -1324,7 +1324,7 @@ struct task_struct {
 #ifdef CONFIG_FUTEX
 	struct robust_list_head __user *robust_list;
 #ifdef CONFIG_COMPAT
-	struct compat_robust_list_head __user *compat_robust_list;
+	struct robust_list_head32 __user *compat_robust_list;
 #endif
 	struct list_head pi_state_list;
 	struct futex_pi_state *pi_state_cache;
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index 19a2c65f3d373c0b60c864a6fe0604787221d342..8640770aadc611b7341a3abb41bdb740e6394479 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -1144,13 +1144,14 @@ static inline int fetch_robust_entry(struct robust_list __user **entry,
 	return 0;
 }
 
+#ifdef CONFIG_64BIT
 /*
  * Walk curr->robust_list (very carefully, it's a userspace list!)
  * and mark any locks found there dead, and notify any waiters.
  *
  * We silently return on any sign of list-walking problem.
  */
-static void exit_robust_list(struct task_struct *curr)
+static void exit_robust_list64(struct task_struct *curr)
 {
 	struct robust_list_head __user *head = curr->robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
@@ -1211,8 +1212,13 @@ static void exit_robust_list(struct task_struct *curr)
 				   curr, pip, HANDLE_DEATH_PENDING);
 	}
 }
+#else
+static void exit_robust_list64(struct task_struct *curr)
+{
+	pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT\n");
+}
+#endif
 
-#ifdef CONFIG_COMPAT
 static void __user *futex_uaddr(struct robust_list __user *entry,
 				compat_long_t futex_offset)
 {
@@ -1226,13 +1232,13 @@ static void __user *futex_uaddr(struct robust_list __user *entry,
  * Fetch a robust-list pointer. Bit 0 signals PI futexes:
  */
 static inline int
-compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
-		   compat_uptr_t __user *head, unsigned int *pi)
+fetch_robust_entry32(u32 *uentry, struct robust_list __user **entry,
+		     u32 __user *head, unsigned int *pi)
 {
 	if (get_user(*uentry, head))
 		return -EFAULT;
 
-	*entry = compat_ptr((*uentry) & ~1);
+	*entry = (void __user *)(unsigned long)((*uentry) & ~1);
 	*pi = (unsigned int)(*uentry) & 1;
 
 	return 0;
@@ -1244,21 +1250,21 @@ compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **ent
  *
  * We silently return on any sign of list-walking problem.
  */
-static void compat_exit_robust_list(struct task_struct *curr)
+static void exit_robust_list32(struct task_struct *curr)
 {
-	struct compat_robust_list_head __user *head = curr->compat_robust_list;
+	struct robust_list_head32 __user *head = curr->compat_robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
 	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
 	unsigned int next_pi;
-	compat_uptr_t uentry, next_uentry, upending;
-	compat_long_t futex_offset;
+	u32 uentry, next_uentry, upending;
+	s32 futex_offset;
 	int rc;
 
 	/*
 	 * Fetch the list head (which was registered earlier, via
 	 * sys_set_robust_list()):
 	 */
-	if (compat_fetch_robust_entry(&uentry, &entry, &head->list.next, &pi))
+	if (fetch_robust_entry32((u32 *)&uentry, &entry, (u32 __user *)&head->list.next, &pi))
 		return;
 	/*
 	 * Fetch the relative futex offset:
@@ -1269,7 +1275,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 	 * Fetch any possibly pending lock-add first, and handle it
 	 * if it exists:
 	 */
-	if (compat_fetch_robust_entry(&upending, &pending,
+	if (fetch_robust_entry32(&upending, &pending,
 			       &head->list_op_pending, &pip))
 		return;
 
@@ -1279,8 +1285,8 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		 * Fetch the next entry in the list before calling
 		 * handle_futex_death:
 		 */
-		rc = compat_fetch_robust_entry(&next_uentry, &next_entry,
-			(compat_uptr_t __user *)&entry->next, &next_pi);
+		rc = fetch_robust_entry32(&next_uentry, &next_entry,
+			(u32 __user *)&entry->next, &next_pi);
 		/*
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
@@ -1311,7 +1317,6 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
 	}
 }
-#endif
 
 #ifdef CONFIG_FUTEX_PI
 
@@ -1406,14 +1411,21 @@ static inline void exit_pi_state_list(struct task_struct *curr) { }
 
 static void futex_cleanup(struct task_struct *tsk)
 {
+#ifdef CONFIG_64BIT
 	if (unlikely(tsk->robust_list)) {
-		exit_robust_list(tsk);
+		exit_robust_list64(tsk);
 		tsk->robust_list = NULL;
 	}
+#else
+	if (unlikely(tsk->robust_list)) {
+		exit_robust_list32(tsk);
+		tsk->robust_list = NULL;
+	}
+#endif
 
 #ifdef CONFIG_COMPAT
 	if (unlikely(tsk->compat_robust_list)) {
-		compat_exit_robust_list(tsk);
+		exit_robust_list32(tsk);
 		tsk->compat_robust_list = NULL;
 	}
 #endif
diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c
index 4b6da9116aa6c33db9796e3055ce0c90b02d7b91..dba193dfd216cc929c8f4d979aa2bcd99237e2d8 100644
--- a/kernel/futex/syscalls.c
+++ b/kernel/futex/syscalls.c
@@ -440,7 +440,7 @@ SYSCALL_DEFINE4(futex_requeue,
 
 #ifdef CONFIG_COMPAT
 COMPAT_SYSCALL_DEFINE2(set_robust_list,
-		struct compat_robust_list_head __user *, head,
+		struct robust_list_head32 __user *, head,
 		compat_size_t, len)
 {
 	if (unlikely(len != sizeof(*head)))
@@ -455,7 +455,7 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid,
 			compat_uptr_t __user *, head_ptr,
 			compat_size_t __user *, len_ptr)
 {
-	struct compat_robust_list_head __user *head;
+	struct robust_list_head32 __user *head;
 	unsigned long ret;
 	struct task_struct *p;
 
-- 
2.49.0

From nobody Thu Oct 9 10:42:21 2025
client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8D1172EBB83; Tue, 17 Jun 2025 18:35:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185331; cv=none; b=WXzb5xHcVefKmCgHC5GFBLltwuRbPQNsN1OOcXMNGacRFvK6dV77Mc1EUCPe0llUmFSNjR6mrwN+ShU8r6aAyIuXF/4rwu/xSHq0d731bdK33KSvI3o9+sQCMXol8FsDq1HK3wqtsV2Jms/9icJi2p728NQayyozh1mJmX1ni3Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185331; c=relaxed/simple; bh=dw72dtOYLLv4J85mHuL0DlZqiN+kFZ4S4FKscNv+oOc=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=hNKEQeAvKtpb/QFupmVOjEAgFUU64CopQxvZlagxR+lc1Hvwk/rBCHR0425m7GCduyKW6cTG6bV3bEtjZYvZ5DReFKJSBrFxTRfdrB8CKKclrvSvC7fJu1WUn+aQ9HV3oSmO5UtIhTkYmBKIGxeVKmuflPwema458d6wdC4RKOs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=VYIt56Ld; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="VYIt56Ld" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=WkU+Q07cgIZULcLCcPXt/qEWJzmpEKKMSJ8hpdMsK7A=; b=VYIt56LdqWOhDxLnGjBc63oeCj 
6Lzpww4lECAsoNED7Zp3kQ0PMu048ANqnOb2h3Ev1sbP/lbwf1VrYaSdzzfXZBgjCL+YRQzGPZRrO G36iLRfq6laTKtfxvqC3C9wPueM0cXYgwytAaHfqjOyYNkAwwMzO/Ue91FM+QocT/C6AjlpFGNMky eKTK4jdRDuY4qeQZAqulEiDFqfmMjZ4CoqON7wID2h4BTy/KojTFpNeoqRVaqUMvfKx7dffAVrkhS 3Cwa+MzVkZx/JQ/Z/NF+/zJlvCHqL85cgw7EXQPoiBJRwTFXp/vC27QFgdN+8l3GF2hzW625VzeVi rYd/DJRQ==; Received: from [191.204.192.64] (helo=[192.168.15.100]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRb9M-004j89-Lc; Tue, 17 Jun 2025 20:35:12 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= Date: Tue, 17 Jun 2025 15:34:21 -0300 Subject: [PATCH RESEND v4 4/7] futex: Create set_robust_list2 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250617-tonyk-robust_futex-v4-4-6586f5fb9d33@igalia.com> References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> To: Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Darren Hart , Davidlohr Bueso , Shuah Khan , Arnd Bergmann , Sebastian Andrzej Siewior , Waiman Long Cc: linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org, kernel-dev@igalia.com, =?utf-8?q?Andr=C3=A9_Almeida?= X-Mailer: b4 0.14.2

Create a new syscall, set_robust_list2(). The current set_robust_list() syscall can't be expanded to cover the use cases below, so a new one is needed. The new syscall allows users to set multiple robust lists per process and to have either 32bit or 64bit pointers in the list.

* Interface

This is the proposed interface:

	long set_robust_list2(void *head, int index, unsigned int flags)

`head` is the head of the userspace struct robust_list_head, just as with the old set_robust_list(). It needs to be a void pointer since it can point to a normal robust_list_head or a compat_robust_list_head.
`flags` can be used for defining the list type:

	enum robust_list2_type {
		ROBUST_LIST_32BIT,
		ROBUST_LIST_64BIT,
	};

`index` is the index in the kernel's internal linked list of robust lists (the naming starts to get confusing, I reckon). If `index =3D=3D -1`, the user wants to set a new robust_list: the kernel appends it to the end of the list, assigns a new index and returns this index to the user. If `index >=3D 0`, the user wants to re-set `*head` of an already existing list (similar to what happens when you call set_robust_list() twice with different `*head`).

If `index` is out of range, or it points to a non-existing robust_list, or if the internal list is full, an error is returned. Unaligned `head` addresses are refused by the kernel with -EINVAL.

Users cannot remove lists.

* Implementation

The old set_robust_list() and get_robust_list() syscalls are converted to use the linked list as well. When using only the old syscalls, users shouldn't notice any difference, as the internal code handles the linked list insertion as usual. When mixing the old and new interfaces, users should be aware that one of the elements of the list was created by the other syscall, and they should take special care when handling this element's index.

On exit, the linked list is walked and all robust lists are handled, regardless of which interface was used to create them.
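
For illustration only (this is not part of the patch): the index rules described in this message can be modelled in plain userspace C. The names model_slot, model_slots, model_count and model_set_robust_list2() are hypothetical. Only ROBUST_LISTS_PER_TASK and the two enum values are taken from this series. The fixed array is an assumption standing in for the kernel's linked list, and the rejection of ROBUST_LIST_64BIT on 32bit kernels is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Limit and type values copied from the uapi header in this series. */
#define ROBUST_LISTS_PER_TASK 10
enum robust_list2_type { ROBUST_LIST_32BIT, ROBUST_LIST_64BIT };

/* Hypothetical stand-in for one entry of the kernel's per-task list. */
struct model_slot {
	void *head;
	unsigned int type;
};

static struct model_slot model_slots[ROBUST_LISTS_PER_TASK];
static int model_count;	/* number of indexes handed out so far */

/*
 * Userspace model of the index rules. Returns the index on success or a
 * negative errno value, like the proposed syscall.
 */
static long model_set_robust_list2(void *head, int index, unsigned int flags)
{
	unsigned int type_mask = ROBUST_LIST_32BIT | ROBUST_LIST_64BIT;

	if (index < -1 || index >= ROBUST_LISTS_PER_TASK)
		return -EINVAL;		/* index out of range */
	if (flags & ~type_mask)
		return -EINVAL;		/* unknown flag bits */
	if ((uintptr_t)head % sizeof(uint32_t))
		return -EINVAL;		/* unaligned head */

	if (index == -1) {
		if (model_count >= ROBUST_LISTS_PER_TASK)
			return -EINVAL;	/* internal list is full */
		index = model_count++;	/* append and assign a new index */
	} else if (index >= model_count) {
		return -ENOENT;		/* nothing registered at this index */
	}

	assert(index >= 0 && index < model_count);
	model_slots[index].head = head;
	model_slots[index].type = flags & type_mask;
	return index;
}
```

In this model, two calls with index -1 hand out indexes 0 and 1, a later call with index 0 only replaces the head stored there, and a call with an index that was never handed out fails with -ENOENT.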
Signed-off-by: Andr=C3=A9 Almeida --- include/linux/futex.h | 5 +- include/linux/sched.h | 5 +- include/uapi/asm-generic/unistd.h | 2 + include/uapi/linux/futex.h | 24 +++++++++ kernel/futex/core.c | 111 ++++++++++++++++++++++++++++++----= ---- kernel/futex/futex.h | 5 ++ kernel/futex/syscalls.c | 81 ++++++++++++++++++++++++++-- 7 files changed, 204 insertions(+), 29 deletions(-) diff --git a/include/linux/futex.h b/include/linux/futex.h index cd7c5d12c846566c56f3f3ea74b95e437a6e8193..7721629926535c775bd7b05b528= 3a3d0b51262d6 100644 --- a/include/linux/futex.h +++ b/include/linux/futex.h @@ -75,10 +75,11 @@ enum { =20 static inline void futex_init_task(struct task_struct *tsk) { - tsk->robust_list =3D NULL; + tsk->robust_list_index =3D -1; #ifdef CONFIG_COMPAT - tsk->compat_robust_list =3D NULL; + tsk->compat_robust_list_index =3D -1; #endif + INIT_LIST_HEAD(&tsk->robust_list2); INIT_LIST_HEAD(&tsk->pi_state_list); tsk->pi_state_cache =3D NULL; tsk->futex_state =3D FUTEX_STATE_OK; diff --git a/include/linux/sched.h b/include/linux/sched.h index 51e5d05a9fcd407dcd53b7b7cb8c59783660a826..a37c55cf0a4d942ec1fbedb8bcd= 4be5a3ebb20bb 100644 --- a/include/linux/sched.h +++ b/include/linux/sched.h @@ -1322,10 +1322,11 @@ struct task_struct { u32 rmid; #endif #ifdef CONFIG_FUTEX - struct robust_list_head __user *robust_list; + int robust_list_index; #ifdef CONFIG_COMPAT - struct robust_list_head32 __user *compat_robust_list; + int compat_robust_list_index; #endif + struct list_head robust_list2; struct list_head pi_state_list; struct futex_pi_state *pi_state_cache; struct mutex futex_exit_mutex; diff --git a/include/uapi/asm-generic/unistd.h b/include/uapi/asm-generic/u= nistd.h index 2892a45023af6d3eb941623d4fed04841ab07e02..ebe68c2c88eb5390dda184ce926= 8a8d3a606c9e5 100644 --- a/include/uapi/asm-generic/unistd.h +++ b/include/uapi/asm-generic/unistd.h @@ -852,6 +852,8 @@ __SYSCALL(__NR_removexattrat, sys_removexattrat) #define __NR_open_tree_attr 467 
__SYSCALL(__NR_open_tree_attr, sys_open_tree_attr) =20 +#define __NR_set_robust_list2 467 + #undef __NR_syscalls #define __NR_syscalls 468 =20 diff --git a/include/uapi/linux/futex.h b/include/uapi/linux/futex.h index 7e2744ec89336a260e89883e95222eda199eeb7f..cbd321eca03afb6bdcf47e95347= 61d82f9de7e43 100644 --- a/include/uapi/linux/futex.h +++ b/include/uapi/linux/futex.h @@ -153,6 +153,30 @@ struct robust_list_head { struct robust_list __user *list_op_pending; }; =20 +#define ROBUST_LISTS_PER_TASK 10 + +enum robust_list2_type { + ROBUST_LIST_32BIT, + ROBUST_LIST_64BIT, +}; + +#define ROBUST_LIST_TYPE_MASK (ROBUST_LIST_32BIT | ROBUST_LIST_64BIT) + +/* + * This is an entry of a linked list of robust lists. + * + * @head: can point to a 64bit list or a 32bit list + * @list_type: determine the size of the futex pointers in the list + * @index: the index of this entry in the list + * @list: linked list element + */ +struct robust_list2_entry { + void __user *head; + enum robust_list2_type list_type; + unsigned int index; + struct list_head list; +}; + /* * Are there any waiters for this robust futex: */ diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 8640770aadc611b7341a3abb41bdb740e6394479..49b3bc592948a811f995017027f= 33ad8f285531f 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -1151,9 +1151,9 @@ static inline int fetch_robust_entry(struct robust_li= st __user **entry, * * We silently return on any sign of list-walking problem. 
*/ -static void exit_robust_list64(struct task_struct *curr) +static void exit_robust_list64(struct task_struct *curr, + struct robust_list_head __user *head) { - struct robust_list_head __user *head =3D curr->robust_list; struct robust_list __user *entry, *next_entry, *pending; unsigned int limit =3D ROBUST_LIST_LIMIT, pi, pip; unsigned int next_pi; @@ -1213,7 +1213,8 @@ static void exit_robust_list64(struct task_struct *cu= rr) } } #else -static void exit_robust_list64(struct task_struct *curr) +static void exit_robust_list64(struct task_struct *curr, + struct robust_list_head __user *head) { pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT"); } @@ -1250,9 +1251,9 @@ fetch_robust_entry32(u32 *uentry, struct robust_list = __user **entry, * * We silently return on any sign of list-walking problem. */ -static void exit_robust_list32(struct task_struct *curr) +static void exit_robust_list32(struct task_struct *curr, + struct robust_list_head32 __user *head) { - struct robust_list_head32 __user *head =3D curr->compat_robust_list; struct robust_list __user *entry, *next_entry, *pending; unsigned int limit =3D ROBUST_LIST_LIMIT, pi, pip; unsigned int next_pi; @@ -1318,6 +1319,70 @@ static void exit_robust_list32(struct task_struct *c= urr) } } =20 +long do_set_robust_list2(struct robust_list_head __user *head, + int index, unsigned int type) +{ + struct list_head *list2 =3D &current->robust_list2; + struct robust_list2_entry *prev, *new =3D NULL; + + if (index =3D=3D -1) { + if (list_empty(list2)) { + index =3D 0; + } else { + prev =3D list_last_entry(list2, struct robust_list2_entry, list); + index =3D prev->index + 1; + } + + if (index >=3D ROBUST_LISTS_PER_TASK) + return -EINVAL; + + new =3D kmalloc(sizeof(struct robust_list2_entry), GFP_KERNEL); + if (!new) + return -ENOMEM; + + list_add_tail(&new->list, list2); + new->index =3D index; + + } else if (index >=3D 0) { + struct robust_list2_entry *curr; + + if (list_empty(list2)) + return -ENOENT; + +
list_for_each_entry(curr, list2, list) { + if (index =3D=3D curr->index) { + new =3D curr; + break; + } + } + + if (!new) + return -ENOENT; + } + + BUG_ON(!new); + new->head =3D head; + new->list_type =3D type; + + return index; +} + +struct robust_list_head __user *get_robust_list2(int index, struct task_st= ruct *task) +{ + struct list_head *list2 =3D &task->robust_list2; + struct robust_list2_entry *curr; + + if (list_empty(list2) || index =3D=3D -1) + return NULL; + + list_for_each_entry(curr, list2, list) { + if (index =3D=3D curr->index) + return curr->head; + } + + return NULL; +} + #ifdef CONFIG_FUTEX_PI =20 /* @@ -1411,24 +1476,28 @@ static inline void exit_pi_state_list(struct task_s= truct *curr) { } =20 static void futex_cleanup(struct task_struct *tsk) { -#ifdef CONFIG_64BIT - if (unlikely(tsk->robust_list)) { - exit_robust_list64(tsk); - tsk->robust_list =3D NULL; - } -#else - if (unlikely(tsk->robust_list)) { - exit_robust_list32(tsk); - tsk->robust_list =3D NULL; - } -#endif + struct robust_list2_entry *curr, *n; + struct list_head *list2 =3D &tsk->robust_list2; =20 -#ifdef CONFIG_COMPAT - if (unlikely(tsk->compat_robust_list)) { - exit_robust_list32(tsk); - tsk->compat_robust_list =3D NULL; + /* + * Walk through the linked list, parsing robust lists and freeing the + * allocated lists + */ + if (unlikely(!list_empty(list2))) { + list_for_each_entry_safe(curr, n, list2, list) { + if (curr->head !=3D NULL) { + if (curr->list_type =3D=3D ROBUST_LIST_64BIT) + exit_robust_list64(tsk, curr->head); + else if (curr->list_type =3D=3D ROBUST_LIST_32BIT) + exit_robust_list32(tsk, curr->head); + curr->head =3D NULL; + } + list_del_init(&curr->list); + kfree(curr); + } } -#endif + + tsk->robust_list_index =3D -1; =20 if (unlikely(!list_empty(&tsk->pi_state_list))) exit_pi_state_list(tsk); diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index fcd1617212eed0e3c2367d2b463a0e019eda6d13..67201e51fa1798a21ff68f60b1e= 35977b9bd267b 100644 --- 
a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -467,6 +467,11 @@ extern int __futex_wait(u32 __user *uaddr, unsigned in= t flags, u32 val, extern int futex_wait(u32 __user *uaddr, unsigned int flags, u32 val, ktime_t *abs_time, u32 bitset); =20 +extern long do_set_robust_list2(struct robust_list_head __user *head, + int index, unsigned int type); + +extern struct robust_list_head __user *get_robust_list2(int index, struct = task_struct *task); + /** * struct futex_vector - Auxiliary struct for futex_waitv() * @w: Userspace provided data diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c index dba193dfd216cc929c8f4d979aa2bcd99237e2d8..56ee1123cbd8ea26c8d22aa74e5= faed2974ec577 100644 --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -20,6 +20,18 @@ * the list. There can only be one such pending lock. */ =20 +#ifdef CONFIG_64BIT +static inline int robust_list_native_type(void) +{ + return ROBUST_LIST_64BIT; +} +#else +static inline int robust_list_native_type(void) +{ + return ROBUST_LIST_32BIT; +} +#endif + /** * sys_set_robust_list() - Set the robust-futex list head of a task * @head: pointer to the list-head @@ -28,17 +40,63 @@ SYSCALL_DEFINE2(set_robust_list, struct robust_list_head __user *, head, size_t, len) { + unsigned int type =3D robust_list_native_type(); + int ret; + /* * The kernel knows only one size for now: */ if (unlikely(len !=3D sizeof(*head))) return -EINVAL; =20 - current->robust_list =3D head; + ret =3D do_set_robust_list2(head, current->robust_list_index, type); + if (ret < 0) + return ret; + + current->robust_list_index =3D ret; =20 return 0; } =20 +#define ROBUST_LIST_FLAGS ROBUST_LIST_TYPE_MASK + +/* + * sys_set_robust_list2() + * + * When index =3D=3D -1, create a new list for user. When index >=3D 0, tr= y to find + * the corresponding list and re-set the head there. 
+ * + * Return values: + * >=3D 0: success, index of the robust list + * -EINVAL: invalid flags, invalid index, unaligned head + * -ENOENT: requested index nowhere to be found + * -ENOMEM: error allocating new list + * -EINVAL: too many allocated lists + */ +SYSCALL_DEFINE3(set_robust_list2, struct robust_list_head __user *, head, + int, index, unsigned int, flags) +{ + unsigned int type; + + type =3D flags & ROBUST_LIST_TYPE_MASK; + + if (index < -1 || index >=3D ROBUST_LISTS_PER_TASK) + return -EINVAL; + + if ((flags & ~ROBUST_LIST_FLAGS) !=3D 0) + return -EINVAL; + + if (((uintptr_t) head % sizeof(u32)) !=3D 0) + return -EINVAL; + +#ifndef CONFIG_64BIT + if (type =3D=3D ROBUST_LIST_64BIT) + return -EINVAL; +#endif + + return do_set_robust_list2(head, index, type); +} + /** * sys_get_robust_list() - Get the robust-futex list head of a task * @pid: pid of the process [zero for current task] @@ -52,6 +110,7 @@ SYSCALL_DEFINE3(get_robust_list, int, pid, struct robust_list_head __user *head; unsigned long ret; struct task_struct *p; + int index; =20 rcu_read_lock(); =20 @@ -68,9 +127,11 @@ SYSCALL_DEFINE3(get_robust_list, int, pid, if (!ptrace_may_access(p, PTRACE_MODE_READ_REALCREDS)) goto err_unlock; =20 - head =3D p->robust_list; + index =3D p->robust_list_index; rcu_read_unlock(); =20 + head =3D get_robust_list2(index, p); + if (put_user(sizeof(*head), len_ptr)) return -EFAULT; return put_user(head, head_ptr); @@ -443,10 +504,19 @@ COMPAT_SYSCALL_DEFINE2(set_robust_list, struct robust_list_head32 __user *, head, compat_size_t, len) { + unsigned int type =3D ROBUST_LIST_32BIT; + int ret; + if (unlikely(len !=3D sizeof(*head))) return -EINVAL; =20 - current->compat_robust_list =3D head; + ret =3D do_set_robust_list2((struct robust_list_head __user *) head, + current->robust_list_index, type); + if (ret < 0) + return ret; + + current->robust_list_index =3D ret; + =20 return 0; } @@ -458,6 +528,7 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid, struct robust_list_head32 __user
*head; unsigned long ret; struct task_struct *p; + int index; =20 rcu_read_lock(); =20 @@ -474,9 +545,11 @@ COMPAT_SYSCALL_DEFINE3(get_robust_list, int, pid, if (!ptrace_may_access(p, PTRACE_MODE_READ_REALCREDS)) goto err_unlock; =20 - head =3D p->compat_robust_list; + index =3D p->compat_robust_list_index; rcu_read_unlock(); =20 + head =3D (struct robust_list_head32 __user *) get_robust_list2(index, p); + if (put_user(sizeof(*head), len_ptr)) return -EFAULT; return put_user(ptr_to_compat(head), head_ptr); --=20 2.49.0 From nobody Thu Oct 9 10:42:21 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 03C492EAB80; Tue, 17 Jun 2025 18:35:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185333; cv=none; b=PtkN+6Yxrq1ikeOaniwXtzjUXoDVBdT51qojFXeVb9MM0AR51ZfdWr7ByJZ5G1iYHBLXAjy78aOZ9uTjki8cQ92uP18pOueM5trMJhaRpD3evF+o2IEnJZB2FeSwFJFEg/ulMx7sxrw63LTsaIv4uFi5RriG1DVibcVKC5h90oM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185333; c=relaxed/simple; bh=eE2PyitkntoNMoKoFD0RAgwDFQxOKSGVNOA1GV/xd28=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=VNTvg1aFUoGKRs0iPtORexDgb/nC1A0N90vKDbZ6l7v6kqNwUwh5L3X2l+jD58n9oTXvwriwVrrD6/0hpAGDE0owNVTTULC5EihS8qu4INsr/KHIBoU/mdKC7Hv0WM2vdw5+WFbHrZrDblMZEnYOv/I4wQ4J6bQo6AVXwmy+EJw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=nGajPdbm; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com 
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="nGajPdbm" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=l/grAn1V/PWLVqAuWAXwa7AZ2lhCZoBiGV75dOFYz/c=; b=nGajPdbm/o9UG8gkwPaJXL0ceP mgeCaBTnOtzCuiyULoun5LdH8KE8v3r3NrgnINYa57dbydR2NWU7GTktg02q/RQhNn6Jl9pY4YQC2 p1662CYyJOIae3l5Tgok/AGKakCcAwlSXqJymaucmTlx7t/aK+QM3JD7PLhlm/tOLVG5n3Ws2kX9g Z3RF/h3qoP9qDtwYHskJ6GqpUXfIT1FYsbWKoSDv6OWEienELIvbOaWcPBqX02TcFx6pB5LibtocH Ay84ewCgYbFh50+VbdXFIecL/2ILBLxRRCLPhBYNkzCq9bKJxNwUMWEbTN0y3tx0SRwWk/brz2Ntk 5/3pv3nw==; Received: from [191.204.192.64] (helo=[192.168.15.100]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRb9P-004j89-SC; Tue, 17 Jun 2025 20:35:16 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= Date: Tue, 17 Jun 2025 15:34:22 -0300 Subject: [PATCH RESEND v4 5/7] futex: Wire up set_robust_list2 syscall Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250617-tonyk-robust_futex-v4-5-6586f5fb9d33@igalia.com> References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> To: Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Darren Hart , Davidlohr Bueso , Shuah Khan , Arnd Bergmann , Sebastian Andrzej Siewior , Waiman Long Cc: 
linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org, kernel-dev@igalia.com, =?utf-8?q?Andr=C3=A9_Almeida?= X-Mailer: b4 0.14.2 Wire up the new set_robust_list2 syscall in all available architectures. Signed-off-by: Andr=C3=A9 Almeida --- arch/alpha/kernel/syscalls/syscall.tbl | 1 + arch/arm/tools/syscall.tbl | 1 + arch/m68k/kernel/syscalls/syscall.tbl | 1 + arch/microblaze/kernel/syscalls/syscall.tbl | 1 + arch/mips/kernel/syscalls/syscall_n32.tbl | 1 + arch/mips/kernel/syscalls/syscall_n64.tbl | 1 + arch/mips/kernel/syscalls/syscall_o32.tbl | 1 + arch/parisc/kernel/syscalls/syscall.tbl | 1 + arch/powerpc/kernel/syscalls/syscall.tbl | 1 + arch/s390/kernel/syscalls/syscall.tbl | 1 + arch/sh/kernel/syscalls/syscall.tbl | 1 + arch/sparc/kernel/syscalls/syscall.tbl | 1 + arch/x86/entry/syscalls/syscall_32.tbl | 1 + arch/x86/entry/syscalls/syscall_64.tbl | 1 + arch/xtensa/kernel/syscalls/syscall.tbl | 1 + kernel/sys_ni.c | 1 + scripts/syscall.tbl | 1 + 17 files changed, 17 insertions(+) diff --git a/arch/alpha/kernel/syscalls/syscall.tbl b/arch/alpha/kernel/sys= calls/syscall.tbl index 2dd6340de6b4efddc406f0c235701c15cf02f650..aecc167ac7706d25da73db8099f= 0813e268b820c 100644 --- a/arch/alpha/kernel/syscalls/syscall.tbl +++ b/arch/alpha/kernel/syscalls/syscall.tbl @@ -507,3 +507,4 @@ 575 common listxattrat sys_listxattrat 576 common removexattrat sys_removexattrat 577 common open_tree_attr sys_open_tree_attr +578 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/arm/tools/syscall.tbl b/arch/arm/tools/syscall.tbl index 27c1d5ebcd91c8c296dc6676307f66bfdf4ab78d..2e47ae5dc9a426d8e5e9dacf29c= aa54223cf2f5a 100644 --- a/arch/arm/tools/syscall.tbl +++ b/arch/arm/tools/syscall.tbl @@ -482,3 +482,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/m68k/kernel/syscalls/syscall.tbl
b/arch/m68k/kernel/sysca= lls/syscall.tbl index 9fe47112c586f152662af38a9a7f90957cb96cf8..7bcc8cc628c80a44fea2b53d5c6= 9ab5e5f10a1d2 100644 --- a/arch/m68k/kernel/syscalls/syscall.tbl +++ b/arch/m68k/kernel/syscalls/syscall.tbl @@ -467,3 +467,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/microblaze/kernel/syscalls/syscall.tbl b/arch/microblaze/= kernel/syscalls/syscall.tbl index 7b6e97828e552d4da90046ddfcd4a55723e522bb..cd23608afe7e7dadfbf8e21df04= 86b85bfcb99ce 100644 --- a/arch/microblaze/kernel/syscalls/syscall.tbl +++ b/arch/microblaze/kernel/syscalls/syscall.tbl @@ -473,3 +473,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/mips/kernel/syscalls/syscall_n32.tbl b/arch/mips/kernel/s= yscalls/syscall_n32.tbl index aa70e371bb54ab5d9c8dd8923b6ecf9693ee914d..0a31452ef6ed8fee8f1e2ead5d4= 4acfbbe275fe9 100644 --- a/arch/mips/kernel/syscalls/syscall_n32.tbl +++ b/arch/mips/kernel/syscalls/syscall_n32.tbl @@ -406,3 +406,4 @@ 465 n32 listxattrat sys_listxattrat 466 n32 removexattrat sys_removexattrat 467 n32 open_tree_attr sys_open_tree_attr +468 n32 set_robust_list2 sys_set_robust_list2 diff --git a/arch/mips/kernel/syscalls/syscall_n64.tbl b/arch/mips/kernel/s= yscalls/syscall_n64.tbl index 1e8c44c7b61492eabf00c777831e457a7a6e579c..4cb5a72256338f6fb407f940f18= 83d523113d609 100644 --- a/arch/mips/kernel/syscalls/syscall_n64.tbl +++ b/arch/mips/kernel/syscalls/syscall_n64.tbl @@ -382,3 +382,4 @@ 465 n64 listxattrat sys_listxattrat 466 n64 removexattrat sys_removexattrat 467 n64 open_tree_attr sys_open_tree_attr +468 n64 set_robust_list2 sys_set_robust_list2 diff --git a/arch/mips/kernel/syscalls/syscall_o32.tbl b/arch/mips/kernel/s= yscalls/syscall_o32.tbl index 
114a5a1a62302e32dd74d1679ff423a2d57c3c6b..c46238e9edd00d2861edcfa87c5= ce7a62bfdc3d4 100644 --- a/arch/mips/kernel/syscalls/syscall_o32.tbl +++ b/arch/mips/kernel/syscalls/syscall_o32.tbl @@ -455,3 +455,4 @@ 465 o32 listxattrat sys_listxattrat 466 o32 removexattrat sys_removexattrat 467 o32 open_tree_attr sys_open_tree_attr +468 o32 set_robust_list2 sys_set_robust_list2 diff --git a/arch/parisc/kernel/syscalls/syscall.tbl b/arch/parisc/kernel/s= yscalls/syscall.tbl index 94df3cb957e9d547d192e8732c0cf23ef2b5ce5d..71071489a18375013bbfbe26578= a634283c1e07b 100644 --- a/arch/parisc/kernel/syscalls/syscall.tbl +++ b/arch/parisc/kernel/syscalls/syscall.tbl @@ -466,3 +466,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel= /syscalls/syscall.tbl index 9a084bdb892694bc562f514b55212d167cbac12f..edc4d0bef3f1c7ab826ea8180e7= f5ceba4774c07 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -558,3 +558,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/s390/kernel/syscalls/syscall.tbl b/arch/s390/kernel/sysca= lls/syscall.tbl index a4569b96ef06c54ce7aa795d039541c90a38284f..ff8c594073ec8c3486cc61544d1= 4a338d3f3a906 100644 --- a/arch/s390/kernel/syscalls/syscall.tbl +++ b/arch/s390/kernel/syscalls/syscall.tbl @@ -470,3 +470,4 @@ 465 common listxattrat sys_listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 sys_set_robust_list2 diff --git a/arch/sh/kernel/syscalls/syscall.tbl b/arch/sh/kernel/syscalls/= syscall.tbl index 
52a7652fcff6394b96ace1f3b0ed72250ee5e669..507789194570a9e7b492b210be3= 0bb41021be289 100644 --- a/arch/sh/kernel/syscalls/syscall.tbl +++ b/arch/sh/kernel/syscalls/syscall.tbl @@ -471,3 +471,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/sparc/kernel/syscalls/syscall.tbl b/arch/sparc/kernel/sys= calls/syscall.tbl index 83e45eb6c095a36baaf749927628e6052fe900e6..8d1122c2235b8d5082a11392e68= 787efe55f58be 100644 --- a/arch/sparc/kernel/syscalls/syscall.tbl +++ b/arch/sparc/kernel/syscalls/syscall.tbl @@ -513,3 +513,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/arch/x86/entry/syscalls/syscall_32.tbl b/arch/x86/entry/syscal= ls/syscall_32.tbl index ac007ea00979dc28b0ef7c002a0615ce86dd3101..cbc0c469e66ecf7b8a61e82c38b= 07ecc63f6fe23 100644 --- a/arch/x86/entry/syscalls/syscall_32.tbl +++ b/arch/x86/entry/syscalls/syscall_32.tbl @@ -473,3 +473,4 @@ 465 i386 listxattrat sys_listxattrat 466 i386 removexattrat sys_removexattrat 467 i386 open_tree_attr sys_open_tree_attr +468 i386 set_robust_list2 sys_set_robust_list2 diff --git a/arch/x86/entry/syscalls/syscall_64.tbl b/arch/x86/entry/syscal= ls/syscall_64.tbl index cfb5ca41e30de1a4e073750096f5b51a2ec137d2..b420217c72fc50ad90f29181297= 2019606c5ff69 100644 --- a/arch/x86/entry/syscalls/syscall_64.tbl +++ b/arch/x86/entry/syscalls/syscall_64.tbl @@ -391,6 +391,7 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 =20 # # Due to a historical design error, certain syscalls are numbered differen= tly diff --git a/arch/xtensa/kernel/syscalls/syscall.tbl b/arch/xtensa/kernel/s= yscalls/syscall.tbl index 
f657a77314f8667fa019a01e10c84ea270024adc..6b852ee8a1621c7dd24f6cd37fd= 990f5ff8d8527 100644 --- a/arch/xtensa/kernel/syscalls/syscall.tbl +++ b/arch/xtensa/kernel/syscalls/syscall.tbl @@ -438,3 +438,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 diff --git a/kernel/sys_ni.c b/kernel/sys_ni.c index c00a86931f8c6cb30d35a9d56cbcc5994add90e1..71fbac6176c8886f4fa8dd437b0= aedd5f14e9f74 100644 --- a/kernel/sys_ni.c +++ b/kernel/sys_ni.c @@ -195,6 +195,7 @@ COND_SYSCALL(move_pages); COND_SYSCALL(set_mempolicy_home_node); COND_SYSCALL(cachestat); COND_SYSCALL(mseal); +COND_SYSCALL(set_robust_list2); =20 COND_SYSCALL(perf_event_open); COND_SYSCALL(accept4); diff --git a/scripts/syscall.tbl b/scripts/syscall.tbl index 580b4e246aecd5f07d542943ba68fc4ed5961660..07d7e776d0329659e70a9a55fff= f7ac18eb3ff87 100644 --- a/scripts/syscall.tbl +++ b/scripts/syscall.tbl @@ -408,3 +408,4 @@ 465 common listxattrat sys_listxattrat 466 common removexattrat sys_removexattrat 467 common open_tree_attr sys_open_tree_attr +468 common set_robust_list2 sys_set_robust_list2 --=20 2.49.0 From nobody Thu Oct 9 10:42:21 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3E7A723AE84; Tue, 17 Jun 2025 18:35:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185329; cv=none; b=Hz651GsAaiMuMB/KSwOYZS27u+2u/G8GBpC3CkDBpHP+v+2d8FqpdjASlzy2e4o8PybSas6Crxlu1MZIEN+6DjuIyoViOyvrv+RgWPtnabJ0HWkfTDIAmPKgXpASw1gW36YISgdGoiHjq6R/LSB0DQSxLMZsCNAvFFttiqLdv9A= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185329; c=relaxed/simple; 
bh=1Te3gUCb3wx4Xfv+yeVdknw42AFZu1cU2lHKKErDv+k=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=CK8VbvGWPukcsPU8r1tDg15NKeQp39CIBBaQ0pPisLuSoPWZbfch+a+JDEP4AjhPtR3oU3zoXJIGtrQUF56OeMksNXrK6n5GuRXRs9jllNm/alHpHlNZJ//q378uWd70vxKl8SzBdT6zlQIMUcFaoSwyyoh1jkEgNL6Ol/Co3lc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=RuGGDr/5; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="RuGGDr/5" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=mf6hTPlKjBS7l4gbTn3lLHJV8+IbbJlPwtkPqOBsY2A=; b=RuGGDr/56zGcsKtYz20WEhQQ3k 31ODKV9V5vwm9OvH6X8CxEnnoj5r1bBLjmotARxCvkVQlsNbzC2eZMQsRI/xgUqv73jAG5CX21nDo yUvoFdS6Jn4wO9hsx7PvZjz3yaI5ts+amFUWufqmYK4ZW0TYOerWedVnkDum7hj0smC2stZFJ23IE 9rp2OCtzE/oTnFO2pneOMwGZouqxtok/j8+9eMrn3XF9T2Mw/957ckfgmiIH4EDJ1vBPxENDOjRDk Kdr6ubfNNvhSzyTCKwXzIyhsU31Uj0fFXJUKe599mHdxsLn5GsKQgKa0DDz2YmNjHsv/2tmNQ6ILN OBn9Sg1A==; Received: from [191.204.192.64] (helo=[192.168.15.100]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRb9T-004j89-2J; Tue, 17 Jun 2025 20:35:19 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= Date: Tue, 17 Jun 2025 
15:34:23 -0300 Subject: [PATCH RESEND v4 6/7] futex: Remove the limit of elements for sys_set_robust_list2 lists Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250617-tonyk-robust_futex-v4-6-6586f5fb9d33@igalia.com> References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> To: Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Darren Hart , Davidlohr Bueso , Shuah Khan , Arnd Bergmann , Sebastian Andrzej Siewior , Waiman Long Cc: linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org, kernel-dev@igalia.com, =?utf-8?q?Andr=C3=A9_Almeida?= X-Mailer: b4 0.14.2 Remove the ROBUST_LIST_LIMIT cap on the number of elements a robust list can have, for lists created with the new interface. This is done by overwriting each entry's next pointer with the list head as the list is processed, so that a circular list cannot be walked forever. For the old interface, we keep the limited behavior to avoid changing the API. Signed-off-by: Andr=C3=A9 Almeida --- kernel/futex/core.c | 50 ++++++++++++++++++++++++++++++++++++-------------- 1 file changed, 36 insertions(+), 14 deletions(-) diff --git a/kernel/futex/core.c b/kernel/futex/core.c index 49b3bc592948a811f995017027f33ad8f285531f..61f0b48a2bcd8ab926754980ab3= 454b9ec13a344 100644 --- a/kernel/futex/core.c +++ b/kernel/futex/core.c @@ -1152,7 +1152,8 @@ static inline int fetch_robust_entry(struct robust_li= st __user **entry, * We silently return on any sign of list-walking problem.
*/ static void exit_robust_list64(struct task_struct *curr, - struct robust_list_head __user *head) + struct robust_list_head __user *head, + bool destroyable) { struct robust_list __user *entry, *next_entry, *pending; unsigned int limit =3D ROBUST_LIST_LIMIT, pi, pip; @@ -1196,13 +1197,17 @@ static void exit_robust_list64(struct task_struct *= curr, } if (rc) return; - entry =3D next_entry; - pi =3D next_pi; + /* * Avoid excessively long or circular lists: */ - if (!--limit) + if (!destroyable && !--limit) break; + else + put_user(&head->list, &entry->next); + + entry =3D next_entry; + pi =3D next_pi; =20 cond_resched(); } @@ -1214,7 +1219,8 @@ static void exit_robust_list64(struct task_struct *cu= rr, } #else static void exit_robust_list64(struct task_struct *curr, - struct robust_list_head __user *head) + struct robust_list_head __user *head, + bool destroyable) { pr_warn("32bit kernel should not allow ROBUST_LIST_64BIT"); } @@ -1252,7 +1258,8 @@ fetch_robust_entry32(u32 *uentry, struct robust_list = __user **entry, * We silently return on any sign of list-walking problem. 
*/ static void exit_robust_list32(struct task_struct *curr, - struct robust_list_head32 __user *head) + struct robust_list_head32 __user *head, + bool destroyable) { struct robust_list __user *entry, *next_entry, *pending; unsigned int limit =3D ROBUST_LIST_LIMIT, pi, pip; @@ -1301,14 +1308,17 @@ static void exit_robust_list32(struct task_struct *= curr, } if (rc) return; - uentry =3D next_uentry; - entry =3D next_entry; - pi =3D next_pi; /* * Avoid excessively long or circular lists: */ - if (!--limit) + if (!destroyable && !--limit) break; + else + put_user((struct robust_list __user *) &head->list, &entry->next); + + uentry =3D next_uentry; + entry =3D next_entry; + pi =3D next_pi; =20 cond_resched(); } @@ -1474,26 +1484,38 @@ static void exit_pi_state_list(struct task_struct *= curr) static inline void exit_pi_state_list(struct task_struct *curr) { } #endif =20 +/* + * futex_cleanup - After the task exits, process the robust lists + * + * Walk through the linked list, parsing robust lists and freeing the + * allocated lists. Lists created with the set_robust_list2 don't have a l= imit + * for sizing and can be destroyed while we walk on it to avoid circular l= ist.
+ */ static void futex_cleanup(struct task_struct *tsk) { struct robust_list2_entry *curr, *n; struct list_head *list2 =3D &tsk->robust_list2; + bool destroyable =3D true; + int i =3D 0; =20 /* - * Walk through the linked list, parsing robust lists and freeing the - * allocated lists */ if (unlikely(!list_empty(list2))) { list_for_each_entry_safe(curr, n, list2, list) { + destroyable =3D true; + if (tsk->robust_list_index =3D=3D i) + destroyable =3D false; + if (curr->head !=3D NULL) { if (curr->list_type =3D=3D ROBUST_LIST_64BIT) - exit_robust_list64(tsk, curr->head); + exit_robust_list64(tsk, curr->head, destroyable); else if (curr->list_type =3D=3D ROBUST_LIST_32BIT) - exit_robust_list32(tsk, curr->head); + exit_robust_list32(tsk, curr->head, destroyable); curr->head =3D NULL; } list_del_init(&curr->list); kfree(curr); + i++; } } =20 --=20 2.49.0 From nobody Thu Oct 9 10:42:21 2025 Received: from fanzine2.igalia.com (fanzine2.igalia.com [213.97.179.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8397C2F94A1; Tue, 17 Jun 2025 18:35:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=213.97.179.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185334; cv=none; b=sfD30NOzFY2RT6fUIAXqf37G3DAH5xQbpXPJLKTLDCPo3uavv1M6nePXyGfLaFJdfoYz2sTfdzhzVpremR9nh+xmw/5v/LdRe9VU/RzvyuDUyA/RlbX5TO5PO09yT5ddA/3fAaqW5ptYWcFcbzqujrNwTUE0q8btGSH11y2DXuc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1750185334; c=relaxed/simple; bh=PecueBhb3qx1RHFBN98VuO4lhdtWIZM9jToHkTc/HLE=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=CCWPivZ5t3xoloCACOa/veBi9ObqRZBcQbdlVotdBPN8ABlo5qhXxqy9FQSWsNCHRvDUbNldCPBDWOJnE6jHZAiWzgek9fPke7FrOs5Ua6SKeVI+NE/SRu8nuXHafqcaY0Cl0lfgjDrZnCdypU2dG6vS0x54zJod6mnf7LSlUmU= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com; spf=pass smtp.mailfrom=igalia.com; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b=WiU+lJis; arc=none smtp.client-ip=213.97.179.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=igalia.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=igalia.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=igalia.com header.i=@igalia.com header.b="WiU+lJis" DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=igalia.com; s=20170329; h=Cc:To:In-Reply-To:References:Message-Id: Content-Transfer-Encoding:Content-Type:MIME-Version:Subject:Date:From:Sender: Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From:Resent-Sender :Resent-To:Resent-Cc:Resent-Message-ID:List-Id:List-Help:List-Unsubscribe: List-Subscribe:List-Post:List-Owner:List-Archive; bh=N5VA/SwJ6w5fExHAtI2s26hUsQkblQRmCfj6Ybce2s4=; b=WiU+lJis5rhkm3wXROIn8wWZN6 1I6y8gGWwWryS5WSvWNumDuNqyc+S+r+S3/hpmGqdolYgxzetUifU6aYcfNdW496COWhFdhGtfelK drv4oKxqI90UzEs+pV068X9ZRlvP4VBuLWG6yWEB6TZW3sJOswkDY/b9sNpvRZHiRU5Su/uVMUbaz i3DRjLxUpV3LVuX6qv79iye5WOwPC+LOrBHAy/8QLNPB2EZ8eL8fUstEp+gkdqA62FBGPu+ts9UI2 tAOEai3yqYV8uG+8nIotb5bbTyZjE/oWM6BQiXpJI7A7UobWIXrtHJJB25pRqAwA3MGqNzRsN7Ypp 1vjWUvvg==; Received: from [191.204.192.64] (helo=[192.168.15.100]) by fanzine2.igalia.com with esmtpsa (Cipher TLS1.3:ECDHE_X25519__RSA_PSS_RSAE_SHA256__AES_256_GCM:256) (Exim) id 1uRb9W-004j89-8N; Tue, 17 Jun 2025 20:35:22 +0200 From: =?utf-8?q?Andr=C3=A9_Almeida?= Date: Tue, 17 Jun 2025 15:34:24 -0300 Subject: [PATCH RESEND v4 7/7] selftests: futex: Expand robust list test for the new interface Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: 
<20250617-tonyk-robust_futex-v4-7-6586f5fb9d33@igalia.com> References: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> In-Reply-To: <20250617-tonyk-robust_futex-v4-0-6586f5fb9d33@igalia.com> To: Thomas Gleixner , Ingo Molnar , Peter Zijlstra , Darren Hart , Davidlohr Bueso , Shuah Khan , Arnd Bergmann , Sebastian Andrzej Siewior , Waiman Long Cc: linux-kernel@vger.kernel.org, linux-kselftest@vger.kernel.org, linux-api@vger.kernel.org, kernel-dev@igalia.com, =?utf-8?q?Andr=C3=A9_Almeida?= X-Mailer: b4 0.14.2 Expand the current robust list test for the new set_robust_list2 syscall. Add an option to run the same tests using the new syscall, and add two new tests: one for long lists (bigger than ROBUST_LIST_LIMIT) and one for unaligned addresses. Signed-off-by: Andr=C3=A9 Almeida --- .../selftests/futex/functional/robust_list.c | 160 +++++++++++++++++= +++- 1 file changed, 156 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/futex/functional/robust_list.c b/tools= /testing/selftests/futex/functional/robust_list.c index 42690b2440fd29a9b12c46f67f9645ccc93d1147..004ad79ff6171c411fd47e699e3= c38889544218e 100644 --- a/tools/testing/selftests/futex/functional/robust_list.c +++ b/tools/testing/selftests/futex/functional/robust_list.c @@ -35,16 +35,45 @@ #include #include #include +#include =20 #define STACK_SIZE (1024 * 1024) =20 #define FUTEX_TIMEOUT 3 =20 +#define SYS_set_robust_list2 468 + +enum robust_list2_type { + ROBUST_LIST_32BIT, + ROBUST_LIST_64BIT, +}; + static pthread_barrier_t barrier, barrier2; =20 +bool robust2 =3D false; + int set_robust_list(struct robust_list_head *head, size_t len) { - return syscall(SYS_set_robust_list, head, len); + int ret, flags; + + if (!robust2) { + return syscall(SYS_set_robust_list, head, len); + } + + if (sizeof(head) =3D=3D 8) + flags =3D ROBUST_LIST_64BIT; + else + flags =3D ROBUST_LIST_32BIT; + + /* + * We act as if we have just one list here.
We try to use the first slot, + * but if it hasn't been allocated yet we allocate it. + */ + ret =3D syscall(SYS_set_robust_list2, head, 0, flags); + if (ret =3D=3D -1 && errno =3D=3D ENOENT) + ret =3D syscall(SYS_set_robust_list2, head, -1, flags); + + return ret; } =20 int get_robust_list(int pid, struct robust_list_head **head, size_t *len_p= tr) @@ -246,6 +275,11 @@ static void test_set_robust_list_invalid_size(void) size_t head_size =3D sizeof(struct robust_list_head); int ret; =20 + if (robust2) { + ksft_test_result_skip("This test is only for old robust interface\n"); + return; + } + ret =3D set_robust_list(&head, head_size); ASSERT_EQ(ret, 0); =20 @@ -321,6 +355,11 @@ static void test_get_robust_list_child(void) struct robust_list_head head, *get_head; size_t len_ptr; =20 + if (robust2) { + ksft_test_result_skip("Not implemented in the new robust interface\n"); + return; + } + ret =3D pthread_barrier_init(&barrier, NULL, 2); ret =3D pthread_barrier_init(&barrier2, NULL, 2); ASSERT_EQ(ret, 0); @@ -332,7 +371,7 @@ static void test_get_robust_list_child(void) =20 ret =3D get_robust_list(tid, &get_head, &len_ptr); ASSERT_EQ(ret, 0); - ASSERT_EQ(&head, get_head); + ASSERT_EQ(get_head, &head); =20 pthread_barrier_wait(&barrier2); =20 @@ -507,11 +546,119 @@ static void test_circular_list(void) ksft_test_result_pass("%s\n", __func__); } =20 +#define ROBUST_LIST_LIMIT 2048 +#define CHILD_LIST_LIMIT (ROBUST_LIST_LIMIT + 10) + +static int child_robust_list_limit(void *arg) +{ + struct lock_struct *locks; + struct robust_list *list; + struct robust_list_head head; + int ret, i; + + locks =3D (struct lock_struct *) arg; + + ret =3D set_list(&head); + if (ret) + ksft_test_result_fail("set_list error\n"); + + /* + * Create a very long list of locks + */ + head.list.next =3D &locks[0].list; + + list =3D head.list.next; + for (i =3D 0; i < CHILD_LIST_LIMIT - 1; i++) { + list->next =3D &locks[i+1].list; + list =3D list->next; + } + list->next =3D &head.list; + + /* + * Grab
the lock in the last one, and die without releasing it + */ + mutex_lock(&locks[CHILD_LIST_LIMIT], &head, false); + pthread_barrier_wait(&barrier); + + sleep(1); + + return 0; +} + +/* + * The old robust list used to have a limit of 2048 items from the kernel = side. + * After this limit the kernel stops walking the list and ignores the other + * futexes, causing deadlocks. + * + * For the new interface, test if we can wait for a list of more than 2048 + * elements. + */ +static void test_robust_list_limit(void) +{ + struct lock_struct locks[CHILD_LIST_LIMIT + 1]; + _Atomic(unsigned int) *futex =3D &locks[CHILD_LIST_LIMIT].futex; + struct robust_list_head head; + int ret; + + if (!robust2) { + ksft_test_result_skip("This test is only for new robust interface\n"); + return; + } + + *futex =3D 0; + + ret =3D set_list(&head); + ASSERT_EQ(ret, 0); + + ret =3D pthread_barrier_init(&barrier, NULL, 2); + ASSERT_EQ(ret, 0); + + create_child(child_robust_list_limit, locks); + + /* + * After the child thread creates the very long list of locks, wait on + * the last one.
+ */ + pthread_barrier_wait(&barrier); + ret =3D mutex_lock(&locks[CHILD_LIST_LIMIT], &head, false); + + if (ret !=3D 0) + printf("futex wait returned %d\n", errno); + ASSERT_EQ(ret, 0); + + ASSERT_TRUE(*futex & FUTEX_OWNER_DIED); + + wait(NULL); + pthread_barrier_destroy(&barrier); + + ksft_test_result_pass("%s\n", __func__); +} + +/* + * The kernel should refuse an unaligned head pointer + */ +static void test_unaligned_address(void) +{ + struct robust_list_head head, *h; + int ret; + + if (!robust2) { + ksft_test_result_skip("This test is only for new robust interface\n"); + return; + } + + h =3D (struct robust_list_head *) ((uintptr_t) &head + 1); + ret =3D set_list(h); + ASSERT_EQ(ret, -1); + ASSERT_EQ(errno, EINVAL); +} + void usage(char *prog) { printf("Usage: %s\n", prog); printf(" -c Use color\n"); printf(" -h Display this help message\n"); + printf(" -n Use robust2 syscall\n"); printf(" -v L Verbosity level: %d=3DQUIET %d=3DCRITICAL %d=3DINFO\n", VQUIET, VCRITICAL, VINFO); } @@ -520,7 +667,7 @@ int main(int argc, char *argv[]) { int c; =20 - while ((c =3D getopt(argc, argv, "cht:v:")) !=3D -1) { + while ((c =3D getopt(argc, argv, "chnt:v:")) !=3D -1) { switch (c) { case 'c': log_color(1); @@ -531,6 +678,9 @@ int main(int argc, char *argv[]) case 'v': log_verbosity(atoi(optarg)); break; + case 'n': + robust2 =3D true; + break; default: usage(basename(argv[0])); exit(1); @@ -538,7 +688,7 @@ int main(int argc, char *argv[]) } =20 ksft_print_header(); - ksft_set_plan(7); + ksft_set_plan(8); =20 test_robustness(); =20 @@ -548,6 +698,8 @@ int main(int argc, char *argv[]) test_set_list_op_pending(); test_robust_list_multiple_elements(); test_circular_list(); + test_robust_list_limit(); + test_unaligned_address(); =20 ksft_print_cnts(); return 0; --=20 2.49.0