From: Mathieu Desnoyers
To: André Almeida
Cc: linux-kernel@vger.kernel.org, Mathieu Desnoyers, Carlos O'Donell,
 Sebastian Andrzej Siewior, Peter Zijlstra, Florian Weimer, Rich Felker,
 Torvald Riegel, Darren Hart, Thomas Gleixner, Ingo Molnar,
 Davidlohr Bueso, Arnd Bergmann, "Liam R. Howlett"
Subject: [RFC PATCH] futex: Introduce __vdso_robust_futex_unlock
Date: Wed, 11 Mar 2026 14:54:09 -0400
Message-Id: <20260311185409.1988269-1-mathieu.desnoyers@efficios.com>

This vDSO unlocks the robust futex by exchanging the content of *uaddr
with 0 with store-release semantics. If the futex has waiters, it sets
bit 1 of *op_pending_addr; otherwise it clears *op_pending_addr. Those
operations are within a code region known to the kernel, making them
safe with respect to asynchronous program termination, whether from
thread context or from a nested signal handler.
Expected use of this vDSO:

  if ((__vdso_robust_futex_unlock((u32 *) &mutex->__data.__lock,
                                  &pd->robust_head.list_op_pending)
       & FUTEX_WAITERS) != 0)
          futex_wake((u32 *) &mutex->__data.__lock, 1, private);
  WRITE_ONCE(pd->robust_head.list_op_pending, 0);

This fixes a long-standing data corruption race condition with robust
futexes, as pointed out here:

  "File corruption race condition in robust mutex unlocking"
  https://sourceware.org/bugzilla/show_bug.cgi?id=14485

Known limitation: this only takes care of non-PI futexes.

The approach taken by this vDSO is to extend the x86 vDSO exception
table to track the relevant ip range. The two kernel execution paths
impacted by this change are:

1) Process exit
2) Signal delivery

[ This patch is only lightly compile-tested, and is submitted for
  feedback. ]

Link: https://lore.kernel.org/lkml/20260220202620.139584-1-andrealmeid@igalia.com/
Signed-off-by: Mathieu Desnoyers
Cc: "André Almeida"
Cc: Carlos O'Donell
Cc: Sebastian Andrzej Siewior
Cc: Peter Zijlstra
Cc: Florian Weimer
Cc: Rich Felker
Cc: Torvald Riegel
Cc: Darren Hart
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Davidlohr Bueso
Cc: Arnd Bergmann
Cc: "Liam R. Howlett"
---
 arch/x86/entry/vdso/common/vfutex.c |  51 ++++++++++
 arch/x86/entry/vdso/extable.c       |  54 +++++++++-
 arch/x86/entry/vdso/extable.h       |  26 +++--
 arch/x86/entry/vdso/vdso64/Makefile |   1 +
 arch/x86/entry/vdso/vdso64/vfutex.c |   1 +
 arch/x86/entry/vdso/vdso64/vsgx.S   |   2 +-
 arch/x86/include/asm/vdso.h         |   3 +
 arch/x86/kernel/signal.c            |   4 +
 include/linux/futex.h               |   1 +
 include/vdso/futex.h                |  35 +++++++
 kernel/futex/core.c                 | 151 ++++++++++++++++++++++++----
 11 files changed, 296 insertions(+), 33 deletions(-)
 create mode 100644 arch/x86/entry/vdso/common/vfutex.c
 create mode 100644 arch/x86/entry/vdso/vdso64/vfutex.c
 create mode 100644 include/vdso/futex.h

diff --git a/arch/x86/entry/vdso/common/vfutex.c b/arch/x86/entry/vdso/common/vfutex.c
new file mode 100644
index 000000000000..fe730e0d3dfa
--- /dev/null
+++ b/arch/x86/entry/vdso/common/vfutex.c
@@ -0,0 +1,51 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2026 Mathieu Desnoyers
+ */
+#include
+#include
+#include "extable.h"
+
+#ifdef CONFIG_X86_64
+# define ASM_PTR_BIT_SET "btsq "
+# define ASM_PTR_SET "movq "
+#else
+# define ASM_PTR_BIT_SET "btsl "
+# define ASM_PTR_SET "movl "
+#endif
+
+u32 __vdso_robust_futex_unlock(u32 *uaddr, uintptr_t *op_pending_addr)
+{
+	u32 val = 0;
+
+	/*
+	 * Within the ip range identified by the futex exception table,
+	 * the register "eax" contains the value loaded by xchg. This is
+	 * expected by futex_vdso_exception() to check whether waiters
+	 * need to be woken up. This register state is transferred to
+	 * bit 1 (NEED_WAKEUP) of *op_pending_addr before the ip range
+	 * ends.
+	 */
+	asm volatile ( _ASM_VDSO_EXTABLE_FUTEX_HANDLE(1f, 3f)
+		/* Exchange uaddr (store-release). */
+		"xchg %[uaddr], %[val]\n\t"
+		"1:\n\t"
+		/* Test if FUTEX_WAITERS (0x80000000) is set. */
+		"test %[val], %[val]\n\t"
+		"js 2f\n\t"
+		/* Clear *op_pending_addr if there are no waiters. */
+		ASM_PTR_SET "$0, %[op_pending_addr]\n\t"
+		"jmp 3f\n\t"
+		"2:\n\t"
+		/* Set bit 1 (NEED_WAKEUP) in *op_pending_addr. */
+		ASM_PTR_BIT_SET "$1, %[op_pending_addr]\n\t"
+		"3:\n\t"
+		: [val] "+a" (val),
+		  [uaddr] "+m" (*uaddr)
+		: [op_pending_addr] "m" (*op_pending_addr)
+		: "memory");
+	return val;
+}
+
+u32 robust_futex_unlock(u32 *, uintptr_t *)
+	__attribute__((weak, alias("__vdso_robust_futex_unlock")));
diff --git a/arch/x86/entry/vdso/extable.c b/arch/x86/entry/vdso/extable.c
index afcf5b65beef..a668fc2c93dd 100644
--- a/arch/x86/entry/vdso/extable.c
+++ b/arch/x86/entry/vdso/extable.c
@@ -1,12 +1,26 @@
 // SPDX-License-Identifier: GPL-2.0
 #include
 #include
+#include
 #include
 #include
 #include
 
+enum vdso_extable_entry_type {
+	VDSO_EXTABLE_ENTRY_FIXUP = 0,
+	VDSO_EXTABLE_ENTRY_FUTEX = 1,
+};
+
 struct vdso_exception_table_entry {
-	int insn, fixup;
+	int type;	/* enum vdso_extable_entry_type */
+	union {
+		struct {
+			int insn, fixup_insn;
+		} fixup;
+		struct {
+			int start, end;
+		} futex;
+	};
 };
 
 bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
@@ -33,8 +47,10 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 	extable = image->extable;
 
 	for (i = 0; i < nr_entries; i++) {
-		if (regs->ip == base + extable[i].insn) {
-			regs->ip = base + extable[i].fixup;
+		if (extable[i].type != VDSO_EXTABLE_ENTRY_FIXUP)
+			continue;
+		if (regs->ip == base + extable[i].fixup.insn) {
+			regs->ip = base + extable[i].fixup.fixup_insn;
 			regs->di = trapnr;
 			regs->si = error_code;
 			regs->dx = fault_addr;
@@ -44,3 +60,35 @@ bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
 
 	return false;
 }
+
+void futex_vdso_exception(struct pt_regs *regs,
+			  bool *_in_futex_vdso,
+			  bool *_need_wakeup)
+{
+	const struct vdso_image *image = current->mm->context.vdso_image;
+	const struct vdso_exception_table_entry *extable;
+	bool in_futex_vdso = false, need_wakeup = false;
+	unsigned int nr_entries, i;
+	unsigned long base;
+
+	if (!current->mm->context.vdso)
+		goto end;
+
+	base = (unsigned long)current->mm->context.vdso + image->extable_base;
+	nr_entries = image->extable_len / (sizeof(*extable));
+	extable = image->extable;
+
+	for (i = 0; i < nr_entries; i++) {
+		if (extable[i].type != VDSO_EXTABLE_ENTRY_FUTEX)
+			continue;
+		if (regs->ip >= base + extable[i].futex.start &&
+		    regs->ip < base + extable[i].futex.end) {
+			in_futex_vdso = true;
+			if (regs->ax & FUTEX_WAITERS)
+				need_wakeup = true;
+		}
+	}
+end:
+	*_in_futex_vdso = in_futex_vdso;
+	*_need_wakeup = need_wakeup;
+}
diff --git a/arch/x86/entry/vdso/extable.h b/arch/x86/entry/vdso/extable.h
index baba612b832c..7251467ad210 100644
--- a/arch/x86/entry/vdso/extable.h
+++ b/arch/x86/entry/vdso/extable.h
@@ -8,20 +8,32 @@
  * exception table, not each individual entry.
  */
 #ifdef __ASSEMBLER__
-#define _ASM_VDSO_EXTABLE_HANDLE(from, to)	\
-	ASM_VDSO_EXTABLE_HANDLE from to
+#define _ASM_VDSO_EXTABLE_FIXUP_HANDLE(from, to)	\
+	ASM_VDSO_EXTABLE_FIXUP_HANDLE from to
 
-.macro ASM_VDSO_EXTABLE_HANDLE from:req to:req
+.macro ASM_VDSO_EXTABLE_FIXUP_HANDLE from:req to:req
 	.pushsection __ex_table, "a"
+	.long 0	/* type: fixup */
 	.long (\from) - __ex_table
 	.long (\to) - __ex_table
 	.popsection
 .endm
 #else
-#define _ASM_VDSO_EXTABLE_HANDLE(from, to)	\
-	".pushsection __ex_table, \"a\"\n"	\
-	".long (" #from ") - __ex_table\n"	\
-	".long (" #to ") - __ex_table\n"	\
+#define _ASM_VDSO_EXTABLE_FIXUP_HANDLE(from, to)	\
+	".pushsection __ex_table, \"a\"\n"	\
+	".long 0\n" /* type: fixup */	\
+	".long (" #from ") - __ex_table\n"	\
+	".long (" #to ") - __ex_table\n"	\
+	".popsection\n"
+
+/*
+ * Identify robust futex unlock critical section.
+ */
+#define _ASM_VDSO_EXTABLE_FUTEX_HANDLE(start, end)	\
+	".pushsection __ex_table, \"a\"\n"	\
+	".long 1\n" /* type: futex */	\
+	".long (" #start ") - __ex_table\n"	\
+	".long (" #end ") - __ex_table\n"	\
 	".popsection\n"
 #endif
 
diff --git a/arch/x86/entry/vdso/vdso64/Makefile b/arch/x86/entry/vdso/vdso64/Makefile
index bfffaf1aeecc..df53c2d0037d 100644
--- a/arch/x86/entry/vdso/vdso64/Makefile
+++ b/arch/x86/entry/vdso/vdso64/Makefile
@@ -10,6 +10,7 @@ vdsos-$(CONFIG_X86_X32_ABI) += x32
 # Files to link into the vDSO:
 vobjs-y := note.o vclock_gettime.o vgetcpu.o
 vobjs-y += vgetrandom.o vgetrandom-chacha.o
+vobjs-y += vfutex.o
 vobjs-$(CONFIG_X86_SGX) += vsgx.o
 
 # Compilation flags
diff --git a/arch/x86/entry/vdso/vdso64/vfutex.c b/arch/x86/entry/vdso/vdso64/vfutex.c
new file mode 100644
index 000000000000..940a6ee30026
--- /dev/null
+++ b/arch/x86/entry/vdso/vdso64/vfutex.c
@@ -0,0 +1 @@
+#include "common/vfutex.c"
diff --git a/arch/x86/entry/vdso/vdso64/vsgx.S b/arch/x86/entry/vdso/vdso64/vsgx.S
index 37a3d4c02366..0ea5a1ebd455 100644
--- a/arch/x86/entry/vdso/vdso64/vsgx.S
+++ b/arch/x86/entry/vdso/vdso64/vsgx.S
@@ -145,6 +145,6 @@ SYM_FUNC_START(__vdso_sgx_enter_enclave)
 
 .cfi_endproc
 
-_ASM_VDSO_EXTABLE_HANDLE(.Lenclu_eenter_eresume, .Lhandle_exception)
+_ASM_VDSO_EXTABLE_FIXUP_HANDLE(.Lenclu_eenter_eresume, .Lhandle_exception)
 
 SYM_FUNC_END(__vdso_sgx_enter_enclave)
diff --git a/arch/x86/include/asm/vdso.h b/arch/x86/include/asm/vdso.h
index e8afbe9faa5b..77e465fb373c 100644
--- a/arch/x86/include/asm/vdso.h
+++ b/arch/x86/include/asm/vdso.h
@@ -38,6 +38,9 @@ extern int map_vdso_once(const struct vdso_image *image, unsigned long addr);
 extern bool fixup_vdso_exception(struct pt_regs *regs, int trapnr,
				 unsigned long error_code,
				 unsigned long fault_addr);
+extern void futex_vdso_exception(struct pt_regs *regs,
+				 bool *in_futex_vdso,
+				 bool *need_wakeup);
 #endif /* __ASSEMBLER__ */
 
 #endif /* _ASM_X86_VDSO_H */
diff --git a/arch/x86/kernel/signal.c b/arch/x86/kernel/signal.c
index 2404233336ab..c2e4db89f16d 100644
--- a/arch/x86/kernel/signal.c
+++ b/arch/x86/kernel/signal.c
@@ -28,6 +28,7 @@
 #include
 #include
 #include
+#include
 
 #include
 #include
@@ -235,6 +236,9 @@ unsigned long get_sigframe_size(void)
 static int
 setup_rt_frame(struct ksignal *ksig, struct pt_regs *regs)
 {
+	/* Handle futex robust list fixup. */
+	futex_signal_deliver(ksig, regs);
+
 	/* Perform fixup for the pre-signal frame. */
 	rseq_signal_deliver(ksig, regs);
 
diff --git a/include/linux/futex.h b/include/linux/futex.h
index 9e9750f04980..6c274c79e176 100644
--- a/include/linux/futex.h
+++ b/include/linux/futex.h
@@ -81,6 +81,7 @@ void futex_exec_release(struct task_struct *tsk);
 long do_futex(u32 __user *uaddr, int op, u32 val, ktime_t *timeout,
	      u32 __user *uaddr2, u32 val2, u32 val3);
 int futex_hash_prctl(unsigned long arg2, unsigned long arg3, unsigned long arg4);
+void futex_signal_deliver(struct ksignal *ksig, struct pt_regs *regs);
 
 #ifdef CONFIG_FUTEX_PRIVATE_HASH
 int futex_hash_allocate_default(void);
diff --git a/include/vdso/futex.h b/include/vdso/futex.h
new file mode 100644
index 000000000000..1e949ac1ed85
--- /dev/null
+++ b/include/vdso/futex.h
@@ -0,0 +1,35 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Copyright (C) 2026 Mathieu Desnoyers
+ */
+
+#ifndef _VDSO_FUTEX_H
+#define _VDSO_FUTEX_H
+
+#include
+
+/**
+ * __vdso_robust_futex_unlock - Architecture-specific vDSO implementation of robust futex unlock.
+ * @uaddr: Lock address (points to a 32-bit unsigned integer type).
+ * @op_pending_addr: Robust list operation pending address (points to a pointer type).
+ *
+ * This vDSO unlocks the robust futex by exchanging the content of
+ * *uaddr with 0 with a store-release semantic. If the futex has
+ * waiters, it sets bit 1 of *op_pending_addr, else it clears
+ * *op_pending_addr. Those operations are within a code region
+ * known by the kernel, making them safe with respect to asynchronous
+ * program termination either from thread context or from a nested
+ * signal handler.
+ *
+ * Expected use of this vDSO:
+ *
+ *   if ((__vdso_robust_futex_unlock((u32 *) &mutex->__data.__lock, &pd->robust_head.list_op_pending)
+ *        & FUTEX_WAITERS) != 0)
+ *           futex_wake((u32 *) &mutex->__data.__lock, 1, private);
+ *   WRITE_ONCE(pd->robust_head.list_op_pending, 0);
+ *
+ * Returns: The old value present at *uaddr.
+ */
+extern u32 __vdso_robust_futex_unlock(u32 *uaddr, uintptr_t *op_pending_addr);
+
+#endif /* _VDSO_FUTEX_H */
diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index cf7e610eac42..92c0f94c8077 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -48,6 +48,10 @@
 #include "futex.h"
 #include "../locking/rtmutex_common.h"
 
+#define FUTEX_UADDR_PI		(1UL << 0)
+#define FUTEX_UADDR_NEED_WAKEUP	(1UL << 1)
+#define FUTEX_UADDR_MASK	(~(FUTEX_UADDR_PI | FUTEX_UADDR_NEED_WAKEUP))
+
 /*
  * The base of the bucket array and its size are always used together
  * (after initialization only in futex_hash()), so ensure that they
@@ -1004,6 +1008,77 @@ void futex_unqueue_pi(struct futex_q *q)
 	q->pi_state = NULL;
 }
 
+/*
+ * Transfer the need wakeup state from vDSO stack to the
+ * FUTEX_UADDR_NEED_WAKEUP list_op_pending bit so it's observed if the
+ * program is terminated while executing the signal handler.
+ */
+static void signal_delivery_fixup_robust_list(struct task_struct *curr, struct pt_regs *regs)
+{
+	struct robust_list_head __user *head = curr->robust_list;
+	bool in_futex_vdso, need_wakeup;
+	unsigned long pending;
+
+	if (!head)
+		return;
+	futex_vdso_exception(regs, &in_futex_vdso, &need_wakeup);
+	if (!in_futex_vdso)
+		return;
+	if (need_wakeup) {
+		if (get_user(pending, (unsigned long __user *)&head->list_op_pending))
+			goto fault;
+		pending |= FUTEX_UADDR_NEED_WAKEUP;
+		if (put_user(pending, (unsigned long __user *)&head->list_op_pending))
+			goto fault;
+	} else {
+		if (put_user(0UL, (unsigned long __user *)&head->list_op_pending))
+			goto fault;
+	}
+	return;
+fault:
+	force_sig(SIGSEGV);
+}
+
+#ifdef CONFIG_COMPAT
+static void compat_signal_delivery_fixup_robust_list(struct task_struct *curr, struct pt_regs *regs)
+{
+	struct compat_robust_list_head __user *head = curr->compat_robust_list;
+	bool in_futex_vdso, need_wakeup;
+	unsigned int pending;
+
+	if (!head)
+		return;
+	futex_vdso_exception(regs, &in_futex_vdso, &need_wakeup);
+	if (!in_futex_vdso)
+		return;
+	if (need_wakeup) {
+		if (get_user(pending, (compat_uptr_t __user *)&head->list_op_pending))
+			goto fault;
+		pending |= FUTEX_UADDR_NEED_WAKEUP;
+		if (put_user(pending, (compat_uptr_t __user *)&head->list_op_pending))
+			goto fault;
+	} else {
+		if (put_user(0U, (compat_uptr_t __user *)&head->list_op_pending))
+			goto fault;
+	}
+	return;
+fault:
+	force_sig(SIGSEGV);
+}
+#endif
+
+void futex_signal_deliver(struct ksignal *ksig, struct pt_regs *regs)
+{
+	struct task_struct *tsk = current;
+
+	if (unlikely(tsk->robust_list))
+		signal_delivery_fixup_robust_list(tsk, regs);
+#ifdef CONFIG_COMPAT
+	if (unlikely(tsk->compat_robust_list))
+		compat_signal_delivery_fixup_robust_list(tsk, regs);
+#endif
+}
+
 /* Constants for the pending_op argument of handle_futex_death */
 #define HANDLE_DEATH_PENDING	true
 #define HANDLE_DEATH_LIST	false
@@ -1013,12 +1088,31 @@ void futex_unqueue_pi(struct futex_q *q)
  * dying task, and do notification if so:
  */
 static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
-			      bool pi, bool pending_op)
+			      bool pi, bool pending_op, bool need_wakeup)
 {
+	bool unlock_store_done = false;
 	u32 uval, nval, mval;
 	pid_t owner;
 	int err;
 
+	/*
+	 * Process dies after the store unlocking futex, before clearing
+	 * the pending ops. Wake up one waiter if needed. Prevent
+	 * storing to the futex after it was unlocked. Only handle
+	 * non-PI futex.
+	 */
+	if (pending_op && !pi) {
+		bool in_futex_vdso, vdso_need_wakeup;
+
+		futex_vdso_exception(task_pt_regs(curr), &in_futex_vdso, &vdso_need_wakeup);
+		if (need_wakeup || vdso_need_wakeup) {
+			futex_wake(uaddr, FLAGS_SIZE_32 | FLAGS_SHARED, 1,
				   FUTEX_BITSET_MATCH_ANY);
+		}
+		if (need_wakeup || in_futex_vdso)
+			return 0;
+	}
+
 	/* Futex address must be 32bit aligned */
 	if ((((unsigned long)uaddr) % sizeof(*uaddr)) != 0)
 		return -1;
@@ -1071,6 +1165,13 @@ static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
 		return 0;
 	}
 
+	/*
+	 * Terminated after the unlock store is done. Wake up waiters,
+	 * but do not change the lock state.
+	 */
+	if (unlock_store_done)
+		return 0;
+
 	if (owner != task_pid_vnr(curr))
 		return 0;
 
@@ -1128,19 +1229,23 @@ static int handle_futex_death(u32 __user *uaddr, struct task_struct *curr,
 }
 
 /*
- * Fetch a robust-list pointer. Bit 0 signals PI futexes:
+ * Fetch a robust-list pointer. Bit 0 signals PI futexes, bit 1 signals
+ * need wakeup:
  */
 static inline int fetch_robust_entry(struct robust_list __user **entry,
				     struct robust_list __user * __user *head,
-				     unsigned int *pi)
+				     unsigned int *pi,
+				     unsigned int *need_wakeup)
 {
 	unsigned long uentry;
 
 	if (get_user(uentry, (unsigned long __user *)head))
 		return -EFAULT;
 
-	*entry = (void __user *)(uentry & ~1UL);
-	*pi = uentry & 1;
+	*entry = (void __user *)(uentry & FUTEX_UADDR_MASK);
+	*pi = uentry & FUTEX_UADDR_PI;
+	if (need_wakeup)
+		*need_wakeup = uentry & FUTEX_UADDR_NEED_WAKEUP;
 
 	return 0;
 }
@@ -1155,7 +1260,7 @@ static void exit_robust_list(struct task_struct *curr)
 {
 	struct robust_list_head __user *head = curr->robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
-	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
+	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip, need_wakeup;
 	unsigned int next_pi;
 	unsigned long futex_offset;
 	int rc;
@@ -1164,7 +1269,7 @@ static void exit_robust_list(struct task_struct *curr)
 	 * Fetch the list head (which was registered earlier, via
 	 * sys_set_robust_list()):
 	 */
-	if (fetch_robust_entry(&entry, &head->list.next, &pi))
+	if (fetch_robust_entry(&entry, &head->list.next, &pi, NULL))
 		return;
 	/*
 	 * Fetch the relative futex offset:
@@ -1175,7 +1280,7 @@ static void exit_robust_list(struct task_struct *curr)
 	 * Fetch any possibly pending lock-add first, and handle it
 	 * if it exists:
 	 */
-	if (fetch_robust_entry(&pending, &head->list_op_pending, &pip))
+	if (fetch_robust_entry(&pending, &head->list_op_pending, &pip, &need_wakeup))
 		return;
 
 	next_entry = NULL;	/* avoid warning with gcc */
@@ -1184,14 +1289,14 @@ static void exit_robust_list(struct task_struct *curr)
 		 * Fetch the next entry in the list before calling
 		 * handle_futex_death:
 		 */
-		rc = fetch_robust_entry(&next_entry, &entry->next, &next_pi);
+		rc = fetch_robust_entry(&next_entry, &entry->next, &next_pi, NULL);
 		/*
 		 * A pending lock might already be on the list, so
 		 * don't process it twice:
 		 */
 		if (entry != pending) {
 			if (handle_futex_death((void __user *)entry + futex_offset,
-					       curr, pi, HANDLE_DEATH_LIST))
+					       curr, pi, HANDLE_DEATH_LIST, false))
 				return;
 		}
 		if (rc)
@@ -1209,7 +1314,7 @@ static void exit_robust_list(struct task_struct *curr)
 
 	if (pending) {
 		handle_futex_death((void __user *)pending + futex_offset,
-				   curr, pip, HANDLE_DEATH_PENDING);
+				   curr, pip, HANDLE_DEATH_PENDING, need_wakeup);
 	}
 }
 
@@ -1224,17 +1329,20 @@ static void __user *futex_uaddr(struct robust_list __user *entry,
 }
 
 /*
- * Fetch a robust-list pointer. Bit 0 signals PI futexes:
+ * Fetch a robust-list pointer. Bit 0 signals PI futexes, bit 1 signals
+ * need wakeup:
  */
 static inline int
 compat_fetch_robust_entry(compat_uptr_t *uentry, struct robust_list __user **entry,
-			  compat_uptr_t __user *head, unsigned int *pi)
+			  compat_uptr_t __user *head, unsigned int *pi, unsigned int *need_wakeup)
 {
 	if (get_user(*uentry, head))
 		return -EFAULT;
 
-	*entry = compat_ptr((*uentry) & ~1);
-	*pi = (unsigned int)(*uentry) & 1;
+	*entry = compat_ptr((*uentry) & FUTEX_UADDR_MASK);
+	*pi = (unsigned int)(*uentry) & FUTEX_UADDR_PI;
+	if (need_wakeup)
+		*need_wakeup = (unsigned int)(*uentry) & FUTEX_UADDR_NEED_WAKEUP;
 
 	return 0;
 }
@@ -1249,7 +1357,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 {
 	struct compat_robust_list_head __user *head = curr->compat_robust_list;
 	struct robust_list __user *entry, *next_entry, *pending;
-	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip;
+	unsigned int limit = ROBUST_LIST_LIMIT, pi, pip, need_wakeup;
 	unsigned int next_pi;
 	compat_uptr_t uentry, next_uentry, upending;
 	compat_long_t futex_offset;
@@ -1259,7 +1367,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 	 * Fetch the list head (which was registered earlier, via
 	 * sys_set_robust_list()):
 	 */
-	if (compat_fetch_robust_entry(&uentry, &entry, &head->list.next, &pi))
+	if (compat_fetch_robust_entry(&uentry, &entry, &head->list.next, &pi, NULL))
 		return;
 	/*
 	 * Fetch the relative futex offset:
@@ -1271,7 +1379,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 	 * if it exists:
 	 */
 	if (compat_fetch_robust_entry(&upending, &pending,
-				      &head->list_op_pending, &pip))
+				      &head->list_op_pending, &pip, &need_wakeup))
 		return;
 
 	next_entry = NULL;	/* avoid warning with gcc */
@@ -1281,7 +1389,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		 * handle_futex_death:
 		 */
 		rc = compat_fetch_robust_entry(&next_uentry, &next_entry,
-			(compat_uptr_t __user *)&entry->next, &next_pi);
+			(compat_uptr_t __user *)&entry->next, &next_pi, NULL);
 		/*
 		 * A pending lock might already be on the list, so
 		 * dont process it twice:
@@ -1289,8 +1397,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 		 */
 		if (entry != pending) {
 			void __user *uaddr = futex_uaddr(entry, futex_offset);
 
-			if (handle_futex_death(uaddr, curr, pi,
-					       HANDLE_DEATH_LIST))
+			if (handle_futex_death(uaddr, curr, pi, HANDLE_DEATH_LIST, false))
 				return;
 		}
 		if (rc)
@@ -1309,7 +1416,7 @@ static void compat_exit_robust_list(struct task_struct *curr)
 	if (pending) {
 		void __user *uaddr = futex_uaddr(pending, futex_offset);
 
-		handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING);
+		handle_futex_death(uaddr, curr, pip, HANDLE_DEATH_PENDING, need_wakeup);
 	}
 }
 #endif
-- 
2.39.5