From: Chengfeng Ye
To: tglx@kernel.org, mingo@redhat.com, peterz@infradead.org, linux-kernel@vger.kernel.org
Cc: dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com, bigeasy@linutronix.de, security@kernel.org, Chengfeng Ye
Subject: [PATCH] futex: fix NUMA node publication race causing missed wakeups
Date: Tue, 3 Mar 2026 03:01:00 +0000
Message-Id: <20260303030100.819744-1-dg573847474@gmail.com>

get_futex_key() publishes the FUTEX2_NUMA node side word in userspace.
The publication path used a non-atomic read/compute/write sequence, so
concurrent callers could overwrite each other during initialization.

This race can make concurrent operations on the same futex derive
different node values while the NUMA hint is being initialized,
resulting in inconsistent futex keying between wait and wake sides.
In practice this can lead to missed wakeups; at user level, missed
wakeups can manifest as threads waiting indefinitely (application-level
deadlock/hang).

PoC description (see Link below):
- two threads repeatedly exercising FUTEX2_NUMA wait/wake on the same
  futex,
- waiter and waker pinned to CPUs from different NUMA nodes,
- waker continuously issuing wake calls while waiter performs 10-second
  timed waits.

PoC output on unpatched kernel (wake signal missed and waiter timed out):
- observed on Linux v7.0-rc2 running in qemu-system-x86_64 with 4 vCPUs

  Using CPU 0 (waiter) and CPU 2 (waker) from different NUMA nodes
  [TRIGGER EVENT #1] iter=38 timed out (futex.node=1)
  [TRIGGER EVENT #2] iter=85 timed out (futex.node=1)
  [TRIGGER EVENT #3] iter=95 timed out (futex.node=1)

Fix by making node-hint publication publish-once via atomic cmpxchg on
naddr (FUTEX_NO_NODE -> computed node), retrying transient -EAGAIN, and
adopting/validating the winner value on contention.

Fixes: c042c505210d ("futex: Implement FUTEX2_MPOL")
Link: https://gist.github.com/Ychame/d4a5e95401a471f4211a751734b5d164
Signed-off-by: Chengfeng Ye
---
 kernel/futex/core.c | 20 ++++++++++++++++++--
 1 file changed, 18 insertions(+), 2 deletions(-)

diff --git a/kernel/futex/core.c b/kernel/futex/core.c
index cf7e610eac42..d45612b36e30 100644
--- a/kernel/futex/core.c
+++ b/kernel/futex/core.c
@@ -596,13 +596,29 @@ int get_futex_key(u32 __user *uaddr, unsigned int flags, union futex_key *key,
 
 	if (flags & FLAGS_NUMA) {
 		u32 __user *naddr = (void *)uaddr + size / 2;
+		u32 old_node;
 
 		if (node == FUTEX_NO_NODE) {
 			node = numa_node_id();
 			node_updated = true;
 		}
-		if (node_updated && put_user_inline(node, naddr))
-			return -EFAULT;
+		if (node_updated) {
+retry_numa_node:
+			err = futex_cmpxchg_value_locked(&old_node, naddr,
+							 FUTEX_NO_NODE, (u32)node);
+			if (err == -EAGAIN) {
+				cond_resched();
+				goto retry_numa_node;
+			}
+			if (err)
+				return err;
+			if (old_node != FUTEX_NO_NODE) {
+				node = old_node;
+				if ((unsigned int)node >= MAX_NUMNODES ||
+				    !node_possible(node))
+					return -EINVAL;
+			}
+		}
 	}
 
 	key->both.node = node;
-- 
2.25.1
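
For readers who want the publish-once idea outside the kernel context, here is a minimal userspace sketch using C11 atomics. It is only a model of the pattern described in the changelog, not the kernel code: NO_NODE is a stand-in for FUTEX_NO_NODE, publish_once() is a hypothetical helper, and the -EAGAIN retry and the node validation done in the patch are left out.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Stand-in for FUTEX_NO_NODE; chosen for this illustration only. */
#define NO_NODE UINT32_MAX

/*
 * Hypothetical helper: store `mine` only if the slot still holds
 * NO_NODE. If another caller published first, leave the slot alone
 * and adopt the value that is already there. Either way, every
 * caller returns the same agreed value.
 */
static uint32_t publish_once(_Atomic uint32_t *slot, uint32_t mine)
{
	uint32_t expected = NO_NODE;

	if (atomic_compare_exchange_strong(slot, &expected, mine))
		return mine;      /* we won: our value is now published */

	return expected;          /* we lost: expected holds the winner */
}
```

With a plain read-then-store, two callers that both observe NO_NODE can each write their own node and walk away with different values. With the compare-exchange, the second caller sees the first caller's value and uses it, so both sides end up with the same node.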