From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A3B8C001DE for ; Wed, 12 Jul 2023 00:47:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231213AbjGLArR (ORCPT ); Tue, 11 Jul 2023 20:47:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231154AbjGLArO (ORCPT ); Tue, 11 Jul 2023 20:47:14 -0400 Received: from mail-pl1-x632.google.com (mail-pl1-x632.google.com [IPv6:2607:f8b0:4864:20::632]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 47D6A1720 for ; Tue, 11 Jul 2023 17:47:13 -0700 (PDT) Received: by mail-pl1-x632.google.com with SMTP id d9443c01a7336-1b8c364ad3bso11362185ad.1 for ; Tue, 11 Jul 2023 17:47:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122833; x=1691714833; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=8G3IrOMNGlFczpaPrC8wvkVroG8Z2Lzb2MR2ORlnT9Y=; b=Vf1GAj6GvM+KaDdgiHg64eKqsxyAT6AMQg5TB9LHR+eDiSwMVsx4LaD8iOrRhentya AudG97oDq9deM0vTzfBmWPseEjk4/5TUQ1gnTjSicsRSl99fsfZD3IJ+2ywyp/9otCQI 0wUpyNtBeoA3N6UTxwqAs7t+eFcb3OAEnMD1jmgRn9M3HYGG2LBLgP1ncJd2kLhsgEBj c+R4XxjMub1ZM3aMdj1IyVy19gNw6KZT05pa7YJImPzgNAy7YqVq6hKNmtEr917lGs5k TPaozL/JMy0nsJ8sr4d3GOgfr9I1JinC+n1z0zswWwr552vv7lVBXwaDCqlVkzcssXzs 2vkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122833; x=1691714833; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; 
bh=8G3IrOMNGlFczpaPrC8wvkVroG8Z2Lzb2MR2ORlnT9Y=; b=knlpz6cAR+cZbQqEbsuZAzfhIwIbz5cvOsgAA7+xXQvc4o0/7QtoQyVfESzEZbkUZ2 oJY8umS7L5SPf4PSfA0LUhTEzAF2/to+IaGdJdZLG1sDBjoHBEmzARVA4t9LF517Pzp7 9g2wKRoSAm7i0+Yfgl7k3I2tyt9Ihr1UNB5jarznVJgdoeA/sLlzB9LRMyH2xLHACP04 L5bZ/EIOKKIGKwWxVuaR4iCw/xETkB1LdnxuTAcJbsVQrtqIM3wvdXJuxKcpeZloPZtj vNXCNI16NZD8ZVKNpAXbEBfWk7pnweycpN5Hm8ZwhV7/QPDUwJkt6+BG9cnLeLiw1k2N 8sLw== X-Gm-Message-State: ABy/qLZp7ddfmwnK/oYAErdZFlCXbyqYuyOZJ2XnrDBv3Tuly66A2AmH dbB9a5618O6nItNSh7u/iFv2Rg== X-Google-Smtp-Source: APBJJlG8unC4NI3vyoRqEhxk2e87ftX0VFPToHU5Br3bIxKcqYJe9ZSi3mf0hLaBfoR5QF9T8q+oog== X-Received: by 2002:a17:902:f683:b0:1b1:9272:55e2 with SMTP id l3-20020a170902f68300b001b1927255e2mr21727777plg.3.1689122832734; Tue, 11 Jul 2023 17:47:12 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:11 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 1/7] futex: abstract out futex_op_to_flags() helper Date: Tue, 11 Jul 2023 18:46:59 -0600 Message-Id: <20230712004705.316157-2-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Rather than needing to duplicate this for the io_uring hook of futexes, abstract out a helper. No functional changes intended in this patch. 
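For readers who want to try the helper's logic outside the kernel, the following is a minimal user-space sketch, not the kernel code itself. The FUTEX_* values match the UAPI header linux/futex.h; the numeric values of FLAGS_SHARED and FLAGS_CLOCKRT, and the names op_to_flags_sketch() and op_to_flags_demo(), are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in the Linux UAPI header <linux/futex.h>. */
#define FUTEX_WAIT             0
#define FUTEX_WAIT_BITSET      9
#define FUTEX_WAIT_REQUEUE_PI  11
#define FUTEX_LOCK_PI2         13
#define FUTEX_PRIVATE_FLAG     128
#define FUTEX_CLOCK_REALTIME   256
#define FUTEX_CMD_MASK         (~(FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME))

/* Kernel-internal flag bits; the numeric values here are assumptions. */
#define FLAGS_SHARED   0x01u
#define FLAGS_CLOCKRT  0x02u

/* User-space copy of the helper's logic (hypothetical name). */
bool op_to_flags_sketch(int op, int cmd, unsigned int *flags)
{
	/* Without the private flag, the futex is treated as shared. */
	if (!(op & FUTEX_PRIVATE_FLAG))
		*flags |= FLAGS_SHARED;

	/* CLOCK_REALTIME is only accepted for a few wait-style commands. */
	if (op & FUTEX_CLOCK_REALTIME) {
		*flags |= FLAGS_CLOCKRT;
		if (cmd != FUTEX_WAIT_BITSET && cmd != FUTEX_WAIT_REQUEUE_PI &&
		    cmd != FUTEX_LOCK_PI2)
			return false;
	}
	return true;
}

/* Splits op into cmd and flags the same way do_futex() does. */
void op_to_flags_demo(void)
{
	unsigned int flags = 0;
	int op = FUTEX_WAIT_BITSET | FUTEX_CLOCK_REALTIME;

	/* Shared futex, realtime clock, command that allows it: accepted. */
	assert(op_to_flags_sketch(op, op & FUTEX_CMD_MASK, &flags));
	assert(flags == (FLAGS_SHARED | FLAGS_CLOCKRT));

	/* Plain FUTEX_WAIT does not accept CLOCK_REALTIME: rejected. */
	flags = 0;
	op = FUTEX_WAIT | FUTEX_PRIVATE_FLAG | FUTEX_CLOCK_REALTIME;
	assert(!op_to_flags_sketch(op, op & FUTEX_CMD_MASK, &flags));
}
```

Returning a bool and leaving the errno choice to the caller is what lets a second caller reuse the check; do_futex() maps a false return to -ENOSYS.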
Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 15 +++++++++++++++ kernel/futex/syscalls.c | 11 ++--------- 2 files changed, 17 insertions(+), 9 deletions(-) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index b5379c0e6d6d..d2949fca37d1 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -291,4 +291,19 @@ extern int futex_unlock_pi(u32 __user *uaddr, unsigned= int flags); =20 extern int futex_lock_pi(u32 __user *uaddr, unsigned int flags, ktime_t *t= ime, int trylock); =20 +static inline bool futex_op_to_flags(int op, int cmd, unsigned int *flags) +{ + if (!(op & FUTEX_PRIVATE_FLAG)) + *flags |=3D FLAGS_SHARED; + + if (op & FUTEX_CLOCK_REALTIME) { + *flags |=3D FLAGS_CLOCKRT; + if (cmd !=3D FUTEX_WAIT_BITSET && cmd !=3D FUTEX_WAIT_REQUEUE_PI && + cmd !=3D FUTEX_LOCK_PI2) + return false; + } + + return true; +} + #endif /* _FUTEX_H */ diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c index a8074079b09e..75ca8c41cc94 100644 --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -88,15 +88,8 @@ long do_futex(u32 __user *uaddr, int op, u32 val, ktime_= t *timeout, int cmd =3D op & FUTEX_CMD_MASK; unsigned int flags =3D 0; =20 - if (!(op & FUTEX_PRIVATE_FLAG)) - flags |=3D FLAGS_SHARED; - - if (op & FUTEX_CLOCK_REALTIME) { - flags |=3D FLAGS_CLOCKRT; - if (cmd !=3D FUTEX_WAIT_BITSET && cmd !=3D FUTEX_WAIT_REQUEUE_PI && - cmd !=3D FUTEX_LOCK_PI2) - return -ENOSYS; - } + if (!futex_op_to_flags(op, cmd, &flags)) + return -ENOSYS; =20 switch (cmd) { case FUTEX_WAIT: --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 875D0EB64DD for ; Wed, 12 Jul 2023 00:47:21 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229531AbjGLArU (ORCPT ); Tue, 11 Jul 2023 20:47:20 -0400 
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229512AbjGLArP (ORCPT ); Tue, 11 Jul 2023 20:47:15 -0400 Received: from mail-pl1-x62b.google.com (mail-pl1-x62b.google.com [IPv6:2607:f8b0:4864:20::62b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B4CFB10D4 for ; Tue, 11 Jul 2023 17:47:14 -0700 (PDT) Received: by mail-pl1-x62b.google.com with SMTP id d9443c01a7336-1b898cfa6a1so9140965ad.1 for ; Tue, 11 Jul 2023 17:47:14 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122834; x=1689727634; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=hJHEVsPXGWHqsJoWMs2SXGP/dgciG32UfjRi0VGCdeE=; b=PZeJ886W/oAqgibJDnI4kMYditvxSoDVVQKUvsrNa6hJC2RciZA4/XDfJbR/GFiJcN ziVXQ6rMuYZcDOBUm+002+A8xu5/3nUOR+nkF0SskrybnMXQlzcoOyf9QgVa6gOo7Hr5 vH+/r53KV/aDbsyM/bey7jiPFwKUvp3X7j2A/TcHXzjYybSjN2nSQBAutTZ4J0md1epR d2tTzmRdCc6X06ovP75BVgVzDlBQNm9eHwaMb2fa7UeeHFHzmu3gXWREQuOW1djjmt25 VongB2IiDsFM//BzOqjvdT6HPbjxrnHtUf35tzLaB3HKf9zQZEZBiU1Ulvty3mlv22fa ei8A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122834; x=1689727634; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=hJHEVsPXGWHqsJoWMs2SXGP/dgciG32UfjRi0VGCdeE=; b=JVTRtnpS96RC1V9GS1gSUAQHUcKvU4WCqf7i/pSZ0chfBBNfQxLEDqzDjt8rFCDSos LmgBTTXJYedhfua8Mb6qcy5GI5RjPAaiN3ResfPijasLM9T9iYQVLQ1tWaEMEjYl+dxm Vh+sFvFAR4KwwNysI+ubj8eigTex0WYEIWFVGgS0L1dtn/CicDEnc7wkXrxld+LlULQM hfaBhhvvRwGocLNkpsL3A8YrTS3t0Oicotqz2QsjQY1Av09P0BXEizVmj73XX6rMmusT Dw1jvr9dxxQFQ0q/ElCoZN8Tp8LOP+k198TuC6ZDlA6I8EFXsqsAjROFZzYnRgjIwDUV Tnhg== X-Gm-Message-State: ABy/qLabme9Nd4j8MEkEEO3QknKvXBLxQ7jtRt4nJU4JugLw5ue+HPs0 
vQAKrYCIJT2tyLakkbKp2x9FIw== X-Google-Smtp-Source: APBJJlHkC3A+zq7LnnwLbWkAjzljLxsmHgWHDTPfiPM2JKusBK2JcHM6qTyQrQZ+Ys67Lt9qbLklAA== X-Received: by 2002:a17:902:f68c:b0:1b8:17e8:547e with SMTP id l12-20020a170902f68c00b001b817e8547emr21541502plg.1.1689122834238; Tue, 11 Jul 2023 17:47:14 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:13 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 2/7] futex: factor out the futex wake handling Date: Tue, 11 Jul 2023 18:47:00 -0600 Message-Id: <20230712004705.316157-3-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" In preparation for having another waker that isn't futex_wake_mark(), add a wake handler in futex_q. No extra data is associated with the handler outside of struct futex_q itself. futex_wake_mark() is defined as the standard wakeup helper, now set through futex_q_init like other defaults. 
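The per-entry wake handler described above can be illustrated with a small user-space sketch. It only models the dispatch pattern: struct waiter stands in for struct futex_q, waiter_init for futex_q_init, and default_wake() for futex_wake_mark(). All of these names are hypothetical, and none of the kernel's locking or wake_q handling is reproduced.

```c
#include <assert.h>
#include <stddef.h>

struct waiter;
/* Same shape of idea as futex_wake_fn: a handler stored in the entry. */
typedef void (wake_fn)(struct waiter *w);

/* Hypothetical stand-in for struct futex_q. */
struct waiter {
	wake_fn *wake;
	int default_wakes;
	int custom_wakes;
};

/* Plays the role of the standard wakeup helper. */
static void default_wake(struct waiter *w)
{
	w->default_wakes++;
}

/* Plays the role of a second waker, e.g. one that posts a completion. */
static void custom_wake(struct waiter *w)
{
	w->custom_wakes++;
}

/* Template with defaults; entries copy it, then override what they need. */
static const struct waiter waiter_init = { .wake = default_wake };

/* The wake path calls through the pointer instead of a fixed function. */
int wake_up_to(struct waiter **ws, size_t n, int nr_wake)
{
	int woken = 0;

	for (size_t i = 0; i < n && woken < nr_wake; i++) {
		ws[i]->wake(ws[i]);
		woken++;
	}
	return woken;
}

void wake_handler_demo(void)
{
	struct waiter a = waiter_init;
	struct waiter b = waiter_init;
	struct waiter *ws[] = { &a, &b };

	b.wake = custom_wake;	/* override the default for one entry */

	assert(wake_up_to(ws, 2, 2) == 2);
	assert(a.default_wakes == 1 && a.custom_wakes == 0);
	assert(b.default_wakes == 0 && b.custom_wakes == 1);
}
```

Putting the default in the init template means existing waiters need no changes; only a new kind of waiter has to assign its own handler.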
Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 4 ++++ kernel/futex/requeue.c | 3 ++- kernel/futex/waitwake.c | 6 +++--- 3 files changed, 9 insertions(+), 4 deletions(-) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index d2949fca37d1..8eaf1a5ce967 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -69,6 +69,9 @@ struct futex_pi_state { union futex_key key; } __randomize_layout; =20 +struct futex_q; +typedef void (futex_wake_fn)(struct wake_q_head *wake_q, struct futex_q *q= ); + /** * struct futex_q - The hashed futex queue entry, one per waiting task * @list: priority-sorted list of tasks waiting on this futex @@ -98,6 +101,7 @@ struct futex_q { =20 struct task_struct *task; spinlock_t *lock_ptr; + futex_wake_fn *wake; union futex_key key; struct futex_pi_state *pi_state; struct rt_mutex_waiter *rt_waiter; diff --git a/kernel/futex/requeue.c b/kernel/futex/requeue.c index cba8b1a6a4cc..e892bc6c41d8 100644 --- a/kernel/futex/requeue.c +++ b/kernel/futex/requeue.c @@ -58,6 +58,7 @@ enum { =20 const struct futex_q futex_q_init =3D { /* list gets initialized in futex_queue()*/ + .wake =3D futex_wake_mark, .key =3D FUTEX_KEY_INIT, .bitset =3D FUTEX_BITSET_MATCH_ANY, .requeue_state =3D ATOMIC_INIT(Q_REQUEUE_PI_NONE), @@ -591,7 +592,7 @@ int futex_requeue(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, /* Plain futexes just wake or requeue and are done */ if (!requeue_pi) { if (++task_count <=3D nr_wake) - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); else requeue_futex(this, hb1, hb2, &key2); continue; diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index ba01b9408203..3471af87cb7d 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -174,7 +174,7 @@ int futex_wake(u32 __user *uaddr, unsigned int flags, i= nt nr_wake, u32 bitset) if (!(this->bitset & bitset)) continue; =20 - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++ret >=3D nr_wake) break; } @@ -289,7 
+289,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, ret =3D -EINVAL; goto out_unlock; } - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++ret >=3D nr_wake) break; } @@ -303,7 +303,7 @@ int futex_wake_op(u32 __user *uaddr1, unsigned int flag= s, u32 __user *uaddr2, ret =3D -EINVAL; goto out_unlock; } - futex_wake_mark(&wake_q, this); + this->wake(&wake_q, this); if (++op_ret >=3D nr_wake2) break; } --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2FA4DC001DC for ; Wed, 12 Jul 2023 00:47:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231334AbjGLArY (ORCPT ); Tue, 11 Jul 2023 20:47:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39036 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231218AbjGLArS (ORCPT ); Tue, 11 Jul 2023 20:47:18 -0400 Received: from mail-pl1-x62f.google.com (mail-pl1-x62f.google.com [IPv6:2607:f8b0:4864:20::62f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3D74D10CF for ; Tue, 11 Jul 2023 17:47:16 -0700 (PDT) Received: by mail-pl1-x62f.google.com with SMTP id d9443c01a7336-1b9d9cbcc70so5005815ad.0 for ; Tue, 11 Jul 2023 17:47:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122836; x=1689727636; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ceQzGCPNst+4GlWUlUZP0RISAT734e1HDBQMt1evUDU=; b=ddq+ANYGvujKOUtXjecxaLI5DXztouTYpNq8GAIrz1H6iPTUv3MbTzUbk2ARr2Iw3w AHC3UAiff3umcY/Rugtx+vIf8vf40IPc+HaEqBqg9fjGYhIMpSodwAeMaz3lOtYXtGVy 
xhH3Fm9wxFdBKlS9DBiQFCnPah5j75Pr2hmpj/KUftmIq+fhdCLJERx/zpTvPFgkyVnS 48wL7TrrfwKKLkNFV39q0lLQ/A41hWEDvMnnrod2dZbyFZXJjbmMpWJRsPZLCFfs9BqX Ihw6AmZSgTFUNMLabbbHp3aF9dZblE9Hk+hlX8K98THxo3NO0ue3Ymqo8T8Ze83U3mYy gtJQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122836; x=1689727636; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ceQzGCPNst+4GlWUlUZP0RISAT734e1HDBQMt1evUDU=; b=SJK3rNsFTu5uCzSFJjnRFjWp9u7yZuRPDG8vQz1Q/RgriV50uSwj+y6N6CnX8YCmBD 3ld0yTCz7RcwLe2NKq5yICOk0qk76eXUqq+ASHJ+98FUzhb7vky0GynJ1yu3Pa2sr18f 9mmZwQDLhtPh4bT8weFo0yGEAkPge8NKLAZ8DmK3xFxTYujOQnvrt+ClvCcg88QT63sJ OcSbDmvokrgDu/WyohAzVgw1FdlwgkU+OPzDUAHBAEG2p90vabEg/mPs6FGbBDHjBJoE u8xZ8MfemNvv/TGYkbWpTpBxawZv8Kyy3ThCodzsnR3QQcNnRSAFWtq4RessNXZ9zKDp dQpw== X-Gm-Message-State: ABy/qLbIH2G8u2Y35xIs+KcXEamXQcyrklAidIciZWE8k2sOc7c8csv2 FOnJvB/chJHXLCMdOY3+WGiYz01HFOFZbQUEv8k= X-Google-Smtp-Source: APBJJlGhtVtPfEatzJ1RbPxVjY0PqbRjHsR20pVh4fV0TVhAUMItaUwjSOBxObRTA2DsaDE3Zu4SRg== X-Received: by 2002:a17:902:da92:b0:1b3:d8ac:8db3 with SMTP id j18-20020a170902da9200b001b3d8ac8db3mr21326179plx.6.1689122835577; Tue, 11 Jul 2023 17:47:15 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:14 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 3/7] io_uring: add support for futex wake and wait Date: Tue, 11 Jul 2023 18:47:01 -0600 Message-Id: <20230712004705.316157-4-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> 
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Add support for FUTEX_WAKE/WAIT primitives.

IORING_OP_FUTEX_WAKE is a mix of FUTEX_WAKE and FUTEX_WAKE_BITSET, as it supports passing in a bitset.

Similarly, IORING_OP_FUTEX_WAIT is a mix of FUTEX_WAIT and FUTEX_WAIT_BITSET.

FUTEX_WAKE is straightforward, as we can always just do those inline. FUTEX_WAIT will queue the futex with an appropriate callback, and that callback will in turn post a CQE when it has triggered.

Cancelations are supported, both from the application's point of view and to be able to cancel pending waits if the ring exits before all events have occurred.

This is just the barebones wait/wake support. PI or REQUEUE support is not added at this point; it is unclear whether we might look into that later.

Likewise, explicit timeouts are not supported. Users that need timeouts are expected to use the usual io_uring mechanism for that, linked timeouts.
Signed-off-by: Jens Axboe --- include/linux/io_uring_types.h | 3 + include/uapi/linux/io_uring.h | 3 + io_uring/Makefile | 4 +- io_uring/cancel.c | 5 + io_uring/cancel.h | 4 + io_uring/futex.c | 232 +++++++++++++++++++++++++++++++++ io_uring/futex.h | 34 +++++ io_uring/io_uring.c | 5 + io_uring/opdef.c | 24 +++- 9 files changed, 312 insertions(+), 2 deletions(-) create mode 100644 io_uring/futex.c create mode 100644 io_uring/futex.h diff --git a/include/linux/io_uring_types.h b/include/linux/io_uring_types.h index f04ce513fadb..a7f03d8d879f 100644 --- a/include/linux/io_uring_types.h +++ b/include/linux/io_uring_types.h @@ -273,6 +273,9 @@ struct io_ring_ctx { struct io_wq_work_list locked_free_list; unsigned int locked_free_nr; =20 + struct hlist_head futex_list; + struct io_alloc_cache futex_cache; + const struct cred *sq_creds; /* cred used for __io_sq_thread() */ struct io_sq_data *sq_data; /* if using sq thread polling */ =20 diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 36f9c73082de..3bd2d765f593 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -65,6 +65,7 @@ struct io_uring_sqe { __u32 xattr_flags; __u32 msg_ring_flags; __u32 uring_cmd_flags; + __u32 futex_flags; }; __u64 user_data; /* data to be passed back at completion time */ /* pack this to avoid bogus arm OABI complaints */ @@ -235,6 +236,8 @@ enum io_uring_op { IORING_OP_URING_CMD, IORING_OP_SEND_ZC, IORING_OP_SENDMSG_ZC, + IORING_OP_FUTEX_WAIT, + IORING_OP_FUTEX_WAKE, =20 /* this goes last, obviously */ IORING_OP_LAST, diff --git a/io_uring/Makefile b/io_uring/Makefile index 8cc8e5387a75..2e4779bc550c 100644 --- a/io_uring/Makefile +++ b/io_uring/Makefile @@ -7,5 +7,7 @@ obj-$(CONFIG_IO_URING) +=3D io_uring.o xattr.o nop.o fs.o = splice.o \ openclose.o uring_cmd.o epoll.o \ statx.o net.o msg_ring.o timeout.o \ sqpoll.o fdinfo.o tctx.o poll.o \ - cancel.o kbuf.o rsrc.o rw.o opdef.o notif.o + cancel.o kbuf.o rsrc.o rw.o opdef.o \ + 
notif.o obj-$(CONFIG_IO_WQ) +=3D io-wq.o +obj-$(CONFIG_FUTEX) +=3D futex.o diff --git a/io_uring/cancel.c b/io_uring/cancel.c index 7b23607cf4af..3dba8ccb1cd8 100644 --- a/io_uring/cancel.c +++ b/io_uring/cancel.c @@ -15,6 +15,7 @@ #include "tctx.h" #include "poll.h" #include "timeout.h" +#include "futex.h" #include "cancel.h" =20 struct io_cancel { @@ -119,6 +120,10 @@ int io_try_cancel(struct io_uring_task *tctx, struct i= o_cancel_data *cd, if (ret !=3D -ENOENT) return ret; =20 + ret =3D io_futex_cancel(ctx, cd, issue_flags); + if (ret !=3D -ENOENT) + return ret; + spin_lock(&ctx->completion_lock); if (!(cd->flags & IORING_ASYNC_CANCEL_FD)) ret =3D io_timeout_cancel(ctx, cd); diff --git a/io_uring/cancel.h b/io_uring/cancel.h index fc98622e6166..c0a8e7c520b6 100644 --- a/io_uring/cancel.h +++ b/io_uring/cancel.h @@ -1,4 +1,6 @@ // SPDX-License-Identifier: GPL-2.0 +#ifndef IORING_CANCEL_H +#define IORING_CANCEL_H =20 #include =20 @@ -22,3 +24,5 @@ void init_hash_table(struct io_hash_table *table, unsigne= d size); =20 int io_sync_cancel(struct io_ring_ctx *ctx, void __user *arg); bool io_cancel_req_match(struct io_kiocb *req, struct io_cancel_data *cd); + +#endif diff --git a/io_uring/futex.c b/io_uring/futex.c new file mode 100644 index 000000000000..ff0f6b394756 --- /dev/null +++ b/io_uring/futex.c @@ -0,0 +1,232 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include + +#include + +#include "../kernel/futex/futex.h" +#include "io_uring.h" +#include "rsrc.h" +#include "futex.h" + +struct io_futex { + struct file *file; + u32 __user *uaddr; + int futex_op; + unsigned int futex_val; + unsigned int futex_flags; + unsigned int futex_mask; +}; + +struct io_futex_data { + union { + struct futex_q q; + struct io_cache_entry cache; + }; + struct io_kiocb *req; +}; + +void io_futex_cache_init(struct io_ring_ctx *ctx) +{ + io_alloc_cache_init(&ctx->futex_cache, IO_NODE_ALLOC_CACHE_MAX, + sizeof(struct io_futex_data)); +} + +static void 
io_futex_cache_entry_free(struct io_cache_entry *entry) +{ + kfree(container_of(entry, struct io_futex_data, cache)); +} + +void io_futex_cache_free(struct io_ring_ctx *ctx) +{ + io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free); +} + +static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) +{ + struct io_futex_data *ifd =3D req->async_data; + struct io_ring_ctx *ctx =3D req->ctx; + + io_tw_lock(ctx, ts); + if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache)) + kfree(ifd); + req->async_data =3D NULL; + hlist_del_init(&req->hash_node); + io_req_task_complete(req, ts); +} + +static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *re= q) +{ + struct io_futex_data *ifd =3D req->async_data; + + /* futex wake already done or in progress */ + if (!futex_unqueue(&ifd->q)) + return false; + + hlist_del_init(&req->hash_node); + io_req_set_res(req, -ECANCELED, 0); + req->io_task_work.func =3D io_futex_complete; + io_req_task_work_add(req); + return true; +} + +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + int nr =3D 0; + + if (cd->flags & (IORING_ASYNC_CANCEL_FD|IORING_ASYNC_CANCEL_FD_FIXED)) + return -ENOENT; + + io_ring_submit_lock(ctx, issue_flags); + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if (req->cqe.user_data !=3D cd->data && + !(cd->flags & IORING_ASYNC_CANCEL_ANY)) + continue; + if (__io_futex_cancel(ctx, req)) + nr++; + if (!(cd->flags & IORING_ASYNC_CANCEL_ALL)) + break; + } + io_ring_submit_unlock(ctx, issue_flags); + + if (nr) + return nr; + + return -ENOENT; +} + +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all) +{ + struct hlist_node *tmp; + struct io_kiocb *req; + bool found =3D false; + + lockdep_assert_held(&ctx->uring_lock); + + hlist_for_each_entry_safe(req, tmp, &ctx->futex_list, hash_node) { + if 
(!io_match_task_safe(req, task, cancel_all)) + continue; + __io_futex_cancel(ctx, req); + found =3D true; + } + + return found; +} + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + + if (unlikely(sqe->addr2 || sqe->buf_index || sqe->addr3)) + return -EINVAL; + + iof->futex_op =3D READ_ONCE(sqe->fd); + iof->uaddr =3D u64_to_user_ptr(READ_ONCE(sqe->addr)); + iof->futex_val =3D READ_ONCE(sqe->len); + iof->futex_mask =3D READ_ONCE(sqe->file_index); + iof->futex_flags =3D READ_ONCE(sqe->futex_flags); + if (iof->futex_flags & FUTEX_CMD_MASK) + return -EINVAL; + + return 0; +} + +static void io_futex_wake_fn(struct wake_q_head *wake_q, struct futex_q *q) +{ + struct io_futex_data *ifd =3D container_of(q, struct io_futex_data, q); + struct io_kiocb *req =3D ifd->req; + + __futex_unqueue(q); + smp_store_release(&q->lock_ptr, NULL); + + io_req_set_res(req, 0, 0); + req->io_task_work.func =3D io_futex_complete; + io_req_task_work_add(req); +} + +static struct io_futex_data *io_alloc_ifd(struct io_ring_ctx *ctx) +{ + struct io_cache_entry *entry; + + entry =3D io_alloc_cache_get(&ctx->futex_cache); + if (entry) + return container_of(entry, struct io_futex_data, cache); + + return kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT); +} + +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + struct io_ring_ctx *ctx =3D req->ctx; + struct io_futex_data *ifd =3D NULL; + struct futex_hash_bucket *hb; + unsigned int flags =3D 0; + int ret; + + if (!iof->futex_mask) { + ret =3D -EINVAL; + goto done; + } + if (!futex_op_to_flags(FUTEX_WAIT, iof->futex_flags, &flags)) { + ret =3D -ENOSYS; + goto done; + } + + io_ring_submit_lock(ctx, issue_flags); + ifd =3D io_alloc_ifd(ctx); + if (!ifd) { + ret =3D -ENOMEM; + goto done_unlock; + } + + req->async_data =3D ifd; + ifd->q =3D futex_q_init; + ifd->q.bitset =3D 
iof->futex_mask; + ifd->q.wake =3D io_futex_wake_fn; + ifd->req =3D req; + + ret =3D futex_wait_setup(iof->uaddr, iof->futex_val, flags, &ifd->q, &hb); + if (!ret) { + hlist_add_head(&req->hash_node, &ctx->futex_list); + io_ring_submit_unlock(ctx, issue_flags); + + futex_queue(&ifd->q, hb); + return IOU_ISSUE_SKIP_COMPLETE; + } + +done_unlock: + io_ring_submit_unlock(ctx, issue_flags); +done: + if (ret < 0) + req_set_fail(req); + io_req_set_res(req, ret, 0); + kfree(ifd); + return IOU_OK; +} + +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + unsigned int flags =3D 0; + int ret; + + if (!futex_op_to_flags(FUTEX_WAKE, iof->futex_flags, &flags)) { + ret =3D -ENOSYS; + goto done; + } + + ret =3D futex_wake(iof->uaddr, flags, iof->futex_val, iof->futex_mask); +done: + if (ret < 0) + req_set_fail(req); + io_req_set_res(req, ret, 0); + return IOU_OK; +} diff --git a/io_uring/futex.h b/io_uring/futex.h new file mode 100644 index 000000000000..ddc9e0d73c52 --- /dev/null +++ b/io_uring/futex.h @@ -0,0 +1,34 @@ +// SPDX-License-Identifier: GPL-2.0 + +#include "cancel.h" + +int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); +int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags); +int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags); + +#if defined(CONFIG_FUTEX) +int io_futex_cancel(struct io_ring_ctx *ctx, struct io_cancel_data *cd, + unsigned int issue_flags); +bool io_futex_remove_all(struct io_ring_ctx *ctx, struct task_struct *task, + bool cancel_all); +void io_futex_cache_init(struct io_ring_ctx *ctx); +void io_futex_cache_free(struct io_ring_ctx *ctx); +#else +static inline int io_futex_cancel(struct io_ring_ctx *ctx, + struct io_cancel_data *cd, + unsigned int issue_flags) +{ + return 0; +} +static inline bool io_futex_remove_all(struct io_ring_ctx *ctx, + struct task_struct *task, bool cancel_all) +{ + return false; +} +static inline 
void io_futex_cache_init(struct io_ring_ctx *ctx) +{ +} +static inline void io_futex_cache_free(struct io_ring_ctx *ctx) +{ +} +#endif diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c index e8096d502a7c..67ff148bc394 100644 --- a/io_uring/io_uring.c +++ b/io_uring/io_uring.c @@ -92,6 +92,7 @@ #include "cancel.h" #include "net.h" #include "notif.h" +#include "futex.h" =20 #include "timeout.h" #include "poll.h" @@ -314,6 +315,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(str= uct io_uring_params *p) sizeof(struct async_poll)); io_alloc_cache_init(&ctx->netmsg_cache, IO_ALLOC_CACHE_MAX, sizeof(struct io_async_msghdr)); + io_futex_cache_init(ctx); init_completion(&ctx->ref_comp); xa_init_flags(&ctx->personalities, XA_FLAGS_ALLOC1); mutex_init(&ctx->uring_lock); @@ -333,6 +335,7 @@ static __cold struct io_ring_ctx *io_ring_ctx_alloc(str= uct io_uring_params *p) INIT_LIST_HEAD(&ctx->tctx_list); ctx->submit_state.free_list.next =3D NULL; INIT_WQ_LIST(&ctx->locked_free_list); + INIT_HLIST_HEAD(&ctx->futex_list); INIT_DELAYED_WORK(&ctx->fallback_work, io_fallback_req_func); INIT_WQ_LIST(&ctx->submit_state.compl_reqs); return ctx; @@ -2842,6 +2845,7 @@ static __cold void io_ring_ctx_free(struct io_ring_ct= x *ctx) io_eventfd_unregister(ctx); io_alloc_cache_free(&ctx->apoll_cache, io_apoll_cache_free); io_alloc_cache_free(&ctx->netmsg_cache, io_netmsg_cache_free); + io_futex_cache_free(ctx); io_destroy_buffers(ctx); mutex_unlock(&ctx->uring_lock); if (ctx->sq_creds) @@ -3254,6 +3258,7 @@ static __cold bool io_uring_try_cancel_requests(struc= t io_ring_ctx *ctx, ret |=3D io_cancel_defer_files(ctx, task, cancel_all); mutex_lock(&ctx->uring_lock); ret |=3D io_poll_remove_all(ctx, task, cancel_all); + ret |=3D io_futex_remove_all(ctx, task, cancel_all); mutex_unlock(&ctx->uring_lock); ret |=3D io_kill_timeouts(ctx, task, cancel_all); if (task) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index 3b9c6489b8b6..c9f23c21a031 100644 --- a/io_uring/opdef.c +++ 
b/io_uring/opdef.c @@ -33,6 +33,7 @@ #include "poll.h" #include "cancel.h" #include "rw.h" +#include "futex.h" =20 static int io_no_issue(struct io_kiocb *req, unsigned int issue_flags) { @@ -426,11 +427,26 @@ const struct io_issue_def io_issue_defs[] =3D { .issue =3D io_sendmsg_zc, #else .prep =3D io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAIT] =3D { +#if defined(CONFIG_FUTEX) + .prep =3D io_futex_prep, + .issue =3D io_futex_wait, +#else + .prep =3D io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAKE] =3D { +#if defined(CONFIG_FUTEX) + .prep =3D io_futex_prep, + .issue =3D io_futex_wake, +#else + .prep =3D io_eopnotsupp_prep, #endif }, }; =20 - const struct io_cold_def io_cold_defs[] =3D { [IORING_OP_NOP] =3D { .name =3D "NOP", @@ -648,6 +664,12 @@ const struct io_cold_def io_cold_defs[] =3D { .fail =3D io_sendrecv_fail, #endif }, + [IORING_OP_FUTEX_WAIT] =3D { + .name =3D "FUTEX_WAIT", + }, + [IORING_OP_FUTEX_WAKE] =3D { + .name =3D "FUTEX_WAKE", + }, }; =20 const char *io_uring_get_opcode(u8 opcode) --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E94F8EB64DC for ; Wed, 12 Jul 2023 00:47:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231272AbjGLAr0 (ORCPT ); Tue, 11 Jul 2023 20:47:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231154AbjGLArS (ORCPT ); Tue, 11 Jul 2023 20:47:18 -0400 Received: from mail-pl1-x633.google.com (mail-pl1-x633.google.com [IPv6:2607:f8b0:4864:20::633]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0F16172C for ; Tue, 11 Jul 2023 17:47:17 -0700 (PDT) Received: by mail-pl1-x633.google.com with SMTP id 
d9443c01a7336-1b8c364ad3bso11362395ad.1 for ; Tue, 11 Jul 2023 17:47:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122837; x=1691714837; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=USpRQzApdKpSjiyBkjEmkDnIwRqj5vH3splpyWQ1un8=; b=FpZRUoq0aBYl9CAG+w+52gtPelgfB27iLDuiaLxgUVSRo0wfLIQu3VEnoXzkS21Xdq m2jQmSMLiebVrVI9WO6ahU0wvGJn/wLM2FxTrd9cUM9WY4c+wcnNp67maehJN6ewabhh dVK3RGPyvWiLbGnt4Qo07jrG56bi2emgJqA1ILLMMFhyvloYjZbVc+3Tsh0eEojhdfSx Xzby3UClIYLVNsK9oAJUNiNmY2DF9sC0AQdDgP1zbgR+eX1Dy8ibt4t9ZPK+1jt4zUJS dDCtTrGBlpkFiOwIMzYzUn/KTkuI44aU08Q8pOq37fL0TdJ7sMpjmm2L1cMFu309BpEl Dl9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122837; x=1691714837; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=USpRQzApdKpSjiyBkjEmkDnIwRqj5vH3splpyWQ1un8=; b=NxlqmxjYgKL0sg7M72cjVSnZkHeawiG8kDLp6G3jKj17Mb0zVtqWkISh5MworgZx+z eV6PSnwm8Y3fZYZPjjj4EfrCxDIytYw8RzBGUzlLRbXfPW3egbEdva3wXdQnuuZ4HBqB DN6H3pax1RC7/A3EIFtgRUrLyw+7HouHqvIo87fh17j94enqz48AhUt7RdGHFmVu+f1w trTsLcPNLs9+wJ+xsirriJ+i9aIzeDaZQNt/3jz7utPXIALnf75T8PIWYU5AuOZ7mInP r9/WwVQYZj1DzGhoc0bP7b0KdW0rDjoj90Ww4uX/zsjcBRB44f/yVCZiwAD7QQwkEFu7 9mzw== X-Gm-Message-State: ABy/qLZeSUTpwUS4C6XkI6wx8BSwuc4uvB/d7iH+AaAwhXZ1AZPut/bH N8fEnmfz/zmlKFdGdCk2A+eEhA== X-Google-Smtp-Source: APBJJlGcQTb4zl90ols3sb/WEm7O0/Lzfmu6C2MX3GYDEZjK7TfuE80XBAJnU+feK9BQZF1fr1oReQ== X-Received: by 2002:a17:903:244e:b0:1b8:b4f6:1327 with SMTP id l14-20020a170903244e00b001b8b4f61327mr21565526pls.6.1689122837255; Tue, 11 Jul 2023 17:47:17 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.15 (version=TLS1_3 
cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:16 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 4/7] futex: add wake_data to struct futex_q Date: Tue, 11 Jul 2023 18:47:02 -0600 Message-Id: <20230712004705.316157-5-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" When handling multiple futex_q entries for waitv, we cannot easily go from the futex_q to data related to that request or queue. Add a wake_data member to struct futex_q that belongs to the assigned wake handler. Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 1 + 1 file changed, 1 insertion(+) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 8eaf1a5ce967..75dec2ec7469 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -102,6 +102,7 @@ struct futex_q { struct task_struct *task; spinlock_t *lock_ptr; futex_wake_fn *wake; + void *wake_data; union futex_key key; struct futex_pi_state *pi_state; struct rt_mutex_waiter *rt_waiter; --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9AF81EB64DD for ; Wed, 12 Jul 2023 00:47:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231454AbjGLArb (ORCPT ); Tue, 11 Jul 2023 20:47:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229931AbjGLArV (ORCPT ); Tue, 11
Jul 2023 20:47:21 -0400 Received: from mail-pg1-x52e.google.com (mail-pg1-x52e.google.com [IPv6:2607:f8b0:4864:20::52e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB3A31736 for ; Tue, 11 Jul 2023 17:47:18 -0700 (PDT) Received: by mail-pg1-x52e.google.com with SMTP id 41be03b00d2f7-55b741fd0c5so1012335a12.0 for ; Tue, 11 Jul 2023 17:47:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122838; x=1691714838; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=wAI3I/Dhvc4kFPF9aMsZuUCg5MncxXQjosQul2Nohhs=; b=df7itfq6b68/usvHEi91e1MzZBznscTgw/J0lMxPEjugpJIES1PAZeC1++4h+nEL73 hF+zuH761pjLrXKQpoIJkDL+LT+9G23AGGt33Admf11D+pplifWgUjWVTThi3WpgwHCO VxnxqurUC3FRQw+FWp8InVuWdBVU+xc49C5pfnXq4QjskLSfD2SG3w1CEPCSVao7WuFR kDsOdHs18B1OBSsIgbeEq870s4F9L0+4RC507KZsTFkPntUQnv5uM1x4bgmDW2yu4o4o M+JTd8NS3GwLh2iaS8FT/bN6jNFYa/baEeCgtdmpgaS+PJD47Os/ouyIY42q6CayXWI+ OQUA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122838; x=1691714838; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=wAI3I/Dhvc4kFPF9aMsZuUCg5MncxXQjosQul2Nohhs=; b=lawHGNPT7dP48Wr4WTIvVf5VyC8JuPFWmb/mhMDmc7QEknEkIyUyRxcMS1Fu34MUA5 V2E2EuLnny3iI26OX8DYGGFL7FWRc4TM+vckoL7dA1Eq/oqBzA/7OLSjQ3du6NxQgbwN DqfYprryZmxSq4v30rd07t12ltz6luNIu28leV+kqrn37Bir+rSYv1klEvJgJmWsIkqb is9yxzeA/cdsWn1ciYh7GD3GHU/IGQ+8m06aNxJPwfeAK107NM/HQV8g0XLYZ6t3JKn0 9QvAz44j6LIdTHWsrgtthCxHEpHKxCiS8XIcUrs61E/7e6elJnMPJAvWD71zFtJfOhv2 Bvbg== X-Gm-Message-State: ABy/qLY6a1iuum5A3xqUXU5bTMZVFkRbRqG7kqSEzVBqcJjhpfoYWXqP CWzB/cNHs3D/8c0hososyyKT5g== X-Google-Smtp-Source: APBJJlF0OcXmextAbH9ZOz7q7HkUMlDfdRnt7GYOrghw54ttVQIxzCRuTPnMc4lyuqCnqyjb0gi/RQ== X-Received: by 2002:a17:902:d4cd:b0:1b8:17e8:5472 with SMTP id 
o13-20020a170902d4cd00b001b817e85472mr21890265plg.1.1689122838441; Tue, 11 Jul 2023 17:47:18 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:17 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 5/7] futex: make futex_parse_waitv() available as a helper Date: Tue, 11 Jul 2023 18:47:03 -0600 Message-Id: <20230712004705.316157-6-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" To make futex_parse_waitv() more generically useful, allow the caller to pass in the wake handler and wake data. Convert the futex_waitv() syscall to pass in the default handler, futex_wake_mark, with NULL wake data. Since we now provide a way to pass in a wake handler and data, ensure we use __futex_queue() to avoid having futex_queue() overwrite our wait data.
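For illustration only (not part of the patch): the handler-plus-data pattern described above can be sketched in userspace C. Every name here (struct waiter, struct request, wake_fn, wake_mark, wake_request, init_waiters) is a hypothetical stand-in, not a kernel definition; the sketch only mirrors the shape of the parse loop, which assigns the same wake handler and wake data to each entry.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; these are NOT the kernel types. */
struct waiter;
typedef void wake_fn(struct waiter *w);

struct waiter {
	wake_fn *wake;		/* handler invoked when this waiter is woken */
	void *wake_data;	/* opaque pointer, interpreted only by the handler */
	int woken;
};

struct request {
	int id;
	int completed;
};

/* Default-style handler: only marks the waiter, ignores wake_data. */
static void wake_mark(struct waiter *w)
{
	w->woken = 1;
}

/* Custom handler: uses wake_data to get back to the owning request. */
static void wake_request(struct waiter *w)
{
	struct request *req = w->wake_data;

	assert(req != NULL);
	w->woken = 1;
	req->completed = 1;
}

/* Every entry in the vector gets the same handler and the same data. */
static void init_waiters(struct waiter *v, size_t nr, wake_fn *wake,
			 void *wake_data)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		v[i].wake = wake;
		v[i].wake_data = wake_data;
		v[i].woken = 0;
	}
}
```

A caller that only needs the default behaviour would pass wake_mark and NULL, much as the converted syscall passes futex_wake_mark and NULL. A caller that must find its own request from any woken entry would pass wake_request and a pointer to that request.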
Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 5 +++++ kernel/futex/syscalls.c | 14 ++++++++++---- kernel/futex/waitwake.c | 3 ++- 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index 75dec2ec7469..ed5a7ccd2e99 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -284,6 +284,11 @@ struct futex_vector { struct futex_q q; }; =20 +extern int futex_parse_waitv(struct futex_vector *futexv, + struct futex_waitv __user *uwaitv, + unsigned int nr_futexes, futex_wake_fn *wake, + void *wake_data); + extern int futex_wait_multiple(struct futex_vector *vs, unsigned int count, struct hrtimer_sleeper *to); =20 diff --git a/kernel/futex/syscalls.c b/kernel/futex/syscalls.c index 75ca8c41cc94..8ac70bfb89fc 100644 --- a/kernel/futex/syscalls.c +++ b/kernel/futex/syscalls.c @@ -184,12 +184,15 @@ SYSCALL_DEFINE6(futex, u32 __user *, uaddr, int, op, = u32, val, * @futexv: Kernel side list of waiters to be filled * @uwaitv: Userspace list to be parsed * @nr_futexes: Length of futexv + * @wake: Wake to call when futex is woken + * @wake_data: Data for the wake handler * * Return: Error code on failure, 0 on success */ -static int futex_parse_waitv(struct futex_vector *futexv, - struct futex_waitv __user *uwaitv, - unsigned int nr_futexes) +int futex_parse_waitv(struct futex_vector *futexv, + struct futex_waitv __user *uwaitv, + unsigned int nr_futexes, futex_wake_fn *wake, + void *wake_data) { struct futex_waitv aux; unsigned int i; @@ -208,6 +211,8 @@ static int futex_parse_waitv(struct futex_vector *futex= v, futexv[i].w.val =3D aux.val; futexv[i].w.uaddr =3D aux.uaddr; futexv[i].q =3D futex_q_init; + futexv[i].q.wake =3D wake; + futexv[i].q.wake_data =3D wake_data; } =20 return 0; @@ -284,7 +289,8 @@ SYSCALL_DEFINE5(futex_waitv, struct futex_waitv __user = *, waiters, goto destroy_timer; } =20 - ret =3D futex_parse_waitv(futexv, waiters, nr_futexes); + ret =3D futex_parse_waitv(futexv, waiters, nr_futexes, 
futex_wake_mark, + NULL); if (!ret) ret =3D futex_wait_multiple(futexv, nr_futexes, timeout ? &to : NULL); =20 diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index 3471af87cb7d..dfd02ca5ecfa 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -446,7 +446,8 @@ static int futex_wait_multiple_setup(struct futex_vecto= r *vs, int count, int *wo * next futex. Queue each futex at this moment so hb can * be unlocked. */ - futex_queue(q, hb); + __futex_queue(q, hb); + spin_unlock(&hb->lock); continue; } =20 --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E560FEB64DD for ; Wed, 12 Jul 2023 00:47:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231550AbjGLAre (ORCPT ); Tue, 11 Jul 2023 20:47:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39044 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231192AbjGLArV (ORCPT ); Tue, 11 Jul 2023 20:47:21 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38ADF1734 for ; Tue, 11 Jul 2023 17:47:20 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1b9d9cbcc70so5005965ad.0 for ; Tue, 11 Jul 2023 17:47:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122839; x=1689727639; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dWVSKxrnKu5hFU+vCQFzf5LgVGgpoqEzW6iDZaj+u6I=; b=pOpXts18eWWzVgfvcp1gjSlrjNuWBS8ap8AAX3uZJZfqqP+3gL8MLcAIJmy8JdLwI7 
kAgvvKI+lElpiyuOIABzKxKW/9YfLkQ+KoWd8zriwzKl5/WTBrjBoG/NKbHSZNBjcz6+ DzCtwiesO2ZhE3bssOr0dVsgabQPoqBbmrBWpXDSFW84tCSHdvnaEZen/11NCaAUIhFq sm5gXEoWwibq3mVuj9nW1f/EJk82UAe8fO1eeGbFZP8EotUKX7V9LEVsS76qLnSddEmY FAHOsXqy+1jQA2aZM0a3G80veYKKMudX9CP8VdYxvYm9XwxUk6M4b8+o2hTNPPVv0C9U 4WlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122840; x=1689727640; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dWVSKxrnKu5hFU+vCQFzf5LgVGgpoqEzW6iDZaj+u6I=; b=gTIbs4GkJwnIBbF1xinzxcOklShJT7P8/k4FKJHWWjAOw2Rv1tz7rOYthtt9T5TzE5 kDSaE2vnRKvfu1lIjceSopFXstcYH8Twx0rqc+EQToAv1pTFryW4LxyEnaH5BCGAUvFY CEzKm1Sc0wyu87W1ccEwGrg27M6lMY3lf8tgKa6Cbpp0/BAxUDfiekqnL8XAmmRSvClC EEuuieSBReWuMSm2qmSGRCp3aaYjYT6P19qaixk2uhsxkC1mAVAVlUeuWzOP/fZZ0Nc+ Yp/w763Pyx8rTdSHJFyIz+2lPQWT5DTM4OmMAMpp4JtKnv55XL49uPXfgMUJjMk1Qf21 rh5Q== X-Gm-Message-State: ABy/qLZUdlng1bepHalttu0WN7nmiFNNWcVg7yyrR27bCoC5YYP5Q9Xx DPjFDYJFL7Vs2wVZ5ayDdHodDA== X-Google-Smtp-Source: APBJJlEzylA84a9lM6BmQj4WdfDc+r9lDrVXJXDh0eEcAQi3h0fuO3XsOZah6+YncygvoD/FCTXRrw== X-Received: by 2002:a17:902:cecd:b0:1b8:9fc4:2733 with SMTP id d13-20020a170902cecd00b001b89fc42733mr21519265plg.3.1689122839695; Tue, 11 Jul 2023 17:47:19 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:19 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 6/7] futex: make the vectored futex operations available Date: Tue, 11 Jul 2023 18:47:04 -0600 Message-Id: <20230712004705.316157-7-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: 
<20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Rename unqueue_multiple() as futex_unqueue_multiple(), and make both that and futex_wait_multiple_setup() available for external users. This is in preparation for wiring up vectored waits in io_uring. Signed-off-by: Jens Axboe --- kernel/futex/futex.h | 5 +++++ kernel/futex/waitwake.c | 10 +++++----- 2 files changed, 10 insertions(+), 5 deletions(-) diff --git a/kernel/futex/futex.h b/kernel/futex/futex.h index ed5a7ccd2e99..b06e23c4900e 100644 --- a/kernel/futex/futex.h +++ b/kernel/futex/futex.h @@ -289,6 +289,11 @@ extern int futex_parse_waitv(struct futex_vector *fute= xv, unsigned int nr_futexes, futex_wake_fn *wake, void *wake_data); =20 +extern int futex_wait_multiple_setup(struct futex_vector *vs, int count, + int *woken); + +extern int futex_unqueue_multiple(struct futex_vector *v, int count); + extern int futex_wait_multiple(struct futex_vector *vs, unsigned int count, struct hrtimer_sleeper *to); =20 diff --git a/kernel/futex/waitwake.c b/kernel/futex/waitwake.c index dfd02ca5ecfa..b2b762acc997 100644 --- a/kernel/futex/waitwake.c +++ b/kernel/futex/waitwake.c @@ -358,7 +358,7 @@ void futex_wait_queue(struct futex_hash_bucket *hb, str= uct futex_q *q, } =20 /** - * unqueue_multiple - Remove various futexes from their hash bucket + * futex_unqueue_multiple - Remove various futexes from their hash bucket * @v: The list of futexes to unqueue * @count: Number of futexes in the list * @@ -368,7 +368,7 @@ void futex_wait_queue(struct futex_hash_bucket *hb, str= uct futex_q *q, * - >=3D0 - Index of the last futex that was awoken; * - -1 - No futex was awoken */ -static int unqueue_multiple(struct futex_vector *v, int count) +int futex_unqueue_multiple(struct futex_vector *v, int count) { int ret 
=3D -1, i; =20 @@ -396,7 +396,7 @@ static int unqueue_multiple(struct futex_vector *v, int= count) * - 0 - Success * - <0 - -EFAULT, -EWOULDBLOCK or -EINVAL */ -static int futex_wait_multiple_setup(struct futex_vector *vs, int count, i= nt *woken) +int futex_wait_multiple_setup(struct futex_vector *vs, int count, int *wok= en) { struct futex_hash_bucket *hb; bool retry =3D false; @@ -459,7 +459,7 @@ static int futex_wait_multiple_setup(struct futex_vecto= r *vs, int count, int *wo * was woken, we don't return error and return this index to * userspace */ - *woken =3D unqueue_multiple(vs, i); + *woken =3D futex_unqueue_multiple(vs, i); if (*woken >=3D 0) return 1; =20 @@ -544,7 +544,7 @@ int futex_wait_multiple(struct futex_vector *vs, unsign= ed int count, =20 __set_current_state(TASK_RUNNING); =20 - ret =3D unqueue_multiple(vs, count); + ret =3D futex_unqueue_multiple(vs, count); if (ret >=3D 0) return ret; =20 --=20 2.40.1 From nobody Tue Dec 16 10:29:53 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6E73C001DC for ; Wed, 12 Jul 2023 00:47:41 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231322AbjGLAri (ORCPT ); Tue, 11 Jul 2023 20:47:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39110 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231226AbjGLArY (ORCPT ); Tue, 11 Jul 2023 20:47:24 -0400 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7AB611989 for ; Tue, 11 Jul 2023 17:47:21 -0700 (PDT) Received: by mail-pl1-x634.google.com with SMTP id d9443c01a7336-1b898cfa6a1so9141285ad.1 for ; Tue, 11 Jul 2023 17:47:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; 
d=kernel-dk.20221208.gappssmtp.com; s=20221208; t=1689122841; x=1689727641; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ybabf6Mx4VMuhxCixGmKPdJebzTyct2xBcgC4A4QTHE=; b=jplCC0xsZilG74DILpEjVEcJAUe8+fabKs3V5dzMOA/MRxevrMTrfCsq5J8kwB4it9 yaq6LQmO55fkvLM9PkUBtS8Vd+9hPjkLC/z+ctQwPYiD+IXWSn5VSM6ZOMB8w+S82lnN zJsNx2XDTbzxqfwZnAlQtzeb3c+aQ+7CPf4uEAoAY/3TnZ7CHx30kYV0pdFx8ugI2WpN +7zLN+EyS2SmVPA0PobdaLBphljLcWlQHCbTxGF33YU1BSOIkP1puvtC13xCFHSQ/FY5 pkAZ3E12S/zah08ptHaKyTMOQW92q0+qKaFslkh39kRHH1BXj2m+B66uV9mLDQa3gkro VN4g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20221208; t=1689122841; x=1689727641; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ybabf6Mx4VMuhxCixGmKPdJebzTyct2xBcgC4A4QTHE=; b=jYevEzIwdxb733gHiOArj6x9SZNnNV/Sl7Og+GgLRbW6V1KFAB0iBwfLPsX1Oh4A0S MTCSo7Z2Gq1U6CNHB5/lmi83btYgWWS7wjkCcC66CPGXmzMIPMf3K3HPGOnjiHmpEYgH MVTiAH60ioHpN9sOujvuT3PovflPA/R2xsR1F1ZH52j5nlORy10+SPe/svTtduFF7g9T 0WEelj2OChvAvJgjThHAzrMilMovqJgIbtToMm3Msv/kwTdO5DCvq2N/Vgi1h7SK2FEQ yNrsJfm0AjFa6ZnXGwY5KAh62FFbgiHUzpST+qxY9FqWGXD27igyx5akxXo5ki1qQMJl K45A== X-Gm-Message-State: ABy/qLZfxaFcP+BrL/nAxq0OV1r78RdN7fzCl8uNXi6a7ytFYSKw2zsR 7C+9zG+869V8F6aH3h/fxVClRA== X-Google-Smtp-Source: APBJJlHEg/AwTm78Dq6O8WVJlRDk8pyUyhJoGvYjko3I2P2q+wSFXyqKnOsqjZ0AQ/vpoX0AWye6Uw== X-Received: by 2002:a17:902:da92:b0:1b3:d8ac:8db3 with SMTP id j18-20020a170902da9200b001b3d8ac8db3mr21326390plx.6.1689122840987; Tue, 11 Jul 2023 17:47:20 -0700 (PDT) Received: from localhost.localdomain ([198.8.77.157]) by smtp.gmail.com with ESMTPSA id s8-20020a170902b18800b001b694140d96sm2543542plr.170.2023.07.11.17.47.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 11 Jul 2023 17:47:20 -0700 (PDT) From: Jens Axboe To: io-uring@vger.kernel.org, 
linux-kernel@vger.kernel.org Cc: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, Jens Axboe Subject: [PATCH 7/7] io_uring: add futex waitv Date: Tue, 11 Jul 2023 18:47:05 -0600 Message-Id: <20230712004705.316157-8-axboe@kernel.dk> X-Mailer: git-send-email 2.40.1 In-Reply-To: <20230712004705.316157-1-axboe@kernel.dk> References: <20230712004705.316157-1-axboe@kernel.dk> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Needs a bit of splitting and a few hunks should go further back (like the wake handler typedef). WIP, adds IORING_OP_FUTEX_WAITV - pass in an array of futex addresses, and wait on all of them until one of them triggers. Signed-off-by: Jens Axboe --- include/uapi/linux/io_uring.h | 1 + io_uring/futex.c | 165 +++++++++++++++++++++++++++++++--- io_uring/futex.h | 2 + io_uring/opdef.c | 11 +++ 4 files changed, 169 insertions(+), 10 deletions(-) diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 3bd2d765f593..420f38675769 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -238,6 +238,7 @@ enum io_uring_op { IORING_OP_SENDMSG_ZC, IORING_OP_FUTEX_WAIT, IORING_OP_FUTEX_WAKE, + IORING_OP_FUTEX_WAITV, =20 /* this goes last, obviously */ IORING_OP_LAST, diff --git a/io_uring/futex.c b/io_uring/futex.c index ff0f6b394756..b22120545d31 100644 --- a/io_uring/futex.c +++ b/io_uring/futex.c @@ -14,11 +14,16 @@ =20 struct io_futex { struct file *file; - u32 __user *uaddr; + union { + u32 __user *uaddr; + struct futex_waitv __user *uwaitv; + }; int futex_op; unsigned int futex_val; unsigned int futex_flags; unsigned int futex_mask; + unsigned int futex_nr; + unsigned long futexv_owned; }; =20 struct io_futex_data { @@ -45,6 +50,13 @@ void io_futex_cache_free(struct io_ring_ctx *ctx) io_alloc_cache_free(&ctx->futex_cache, io_futex_cache_entry_free); } =20 +static void 
__io_futex_complete(struct io_kiocb *req, struct io_tw_state *= ts) +{ + req->async_data =3D NULL; + hlist_del_init(&req->hash_node); + io_req_task_complete(req, ts); +} + static void io_futex_complete(struct io_kiocb *req, struct io_tw_state *ts) { struct io_futex_data *ifd =3D req->async_data; @@ -53,22 +65,59 @@ static void io_futex_complete(struct io_kiocb *req, str= uct io_tw_state *ts) io_tw_lock(ctx, ts); if (!io_alloc_cache_put(&ctx->futex_cache, &ifd->cache)) kfree(ifd); - req->async_data =3D NULL; - hlist_del_init(&req->hash_node); - io_req_task_complete(req, ts); + __io_futex_complete(req, ts); } =20 -static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *re= q) +static void io_futexv_complete(struct io_kiocb *req, struct io_tw_state *t= s) { - struct io_futex_data *ifd =3D req->async_data; + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + struct futex_vector *futexv =3D req->async_data; + struct io_ring_ctx *ctx =3D req->ctx; + int res =3D 0; =20 - /* futex wake already done or in progress */ - if (!futex_unqueue(&ifd->q)) + io_tw_lock(ctx, ts); + + res =3D futex_unqueue_multiple(futexv, iof->futex_nr); + if (res !=3D -1) + io_req_set_res(req, res, 0); + + kfree(req->async_data); + req->flags &=3D ~REQ_F_ASYNC_DATA; + __io_futex_complete(req, ts); +} + +static bool io_futexv_claimed(struct io_futex *iof) +{ + return test_bit(0, &iof->futexv_owned); +} + +static bool io_futexv_claim(struct io_futex *iof) +{ + if (test_bit(0, &iof->futexv_owned) || + test_and_set_bit(0, &iof->futexv_owned)) return false; + return true; +} + +static bool __io_futex_cancel(struct io_ring_ctx *ctx, struct io_kiocb *re= q) +{ + /* futex wake already done or in progress */ + if (req->opcode =3D=3D IORING_OP_FUTEX_WAIT) { + struct io_futex_data *ifd =3D req->async_data; + + if (!futex_unqueue(&ifd->q)) + return false; + req->io_task_work.func =3D io_futex_complete; + } else { + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + 
+ if (!io_futexv_claim(iof)) + return false; + req->io_task_work.func =3D io_futexv_complete; + } =20 hlist_del_init(&req->hash_node); io_req_set_res(req, -ECANCELED, 0); - req->io_task_work.func =3D io_futex_complete; io_req_task_work_add(req); return true; } @@ -124,7 +173,7 @@ int io_futex_prep(struct io_kiocb *req, const struct io= _uring_sqe *sqe) { struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); =20 - if (unlikely(sqe->addr2 || sqe->buf_index || sqe->addr3)) + if (unlikely(sqe->buf_index || sqe->addr3)) return -EINVAL; =20 iof->futex_op =3D READ_ONCE(sqe->fd); @@ -135,6 +184,53 @@ int io_futex_prep(struct io_kiocb *req, const struct i= o_uring_sqe *sqe) if (iof->futex_flags & FUTEX_CMD_MASK) return -EINVAL; =20 + iof->futexv_owned =3D 0; + return 0; +} + +static void io_futex_wakev_fn(struct wake_q_head *wake_q, struct futex_q *= q) +{ + struct io_kiocb *req =3D q->wake_data; + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + + if (!io_futexv_claim(iof)) + return; + + __futex_unqueue(q); + smp_store_release(&q->lock_ptr, NULL); + + io_req_set_res(req, 0, 0); + req->io_task_work.func =3D io_futexv_complete; + io_req_task_work_add(req); +} + +int io_futexv_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe) +{ + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + struct futex_vector *futexv; + int ret; + + ret =3D io_futex_prep(req, sqe); + if (ret) + return ret; + + iof->futex_nr =3D READ_ONCE(sqe->off); + if (!iof->futex_nr || iof->futex_nr > FUTEX_WAITV_MAX) + return -EINVAL; + + futexv =3D kcalloc(iof->futex_nr, sizeof(*futexv), GFP_KERNEL); + if (!futexv) + return -ENOMEM; + + ret =3D futex_parse_waitv(futexv, iof->uwaitv, iof->futex_nr, + io_futex_wakev_fn, req); + if (ret) { + kfree(futexv); + return ret; + } + + req->flags |=3D REQ_F_ASYNC_DATA; + req->async_data =3D futexv; return 0; } =20 @@ -162,6 +258,55 @@ static struct io_futex_data *io_alloc_ifd(struct io_ri= ng_ctx *ctx) return 
kmalloc(sizeof(struct io_futex_data), GFP_NOWAIT); } =20 +int io_futex_waitv(struct io_kiocb *req, unsigned int issue_flags) +{ + struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); + struct futex_vector *futexv =3D req->async_data; + struct io_ring_ctx *ctx =3D req->ctx; + int ret, woken =3D -1; + + io_ring_submit_lock(ctx, issue_flags); + + ret =3D futex_wait_multiple_setup(futexv, iof->futex_nr, &woken); + + /* + * The above call leaves us potentially non-running. This is fine + * for the sync syscall as it'll be blocking unless we already got + * one of the futexes woken, but it obviously won't work for an async + * invocation. Mark it runnable again. + */ + __set_current_state(TASK_RUNNING); + + /* + * We got woken while setting up, let that side do the completion + */ + if (io_futexv_claimed(iof)) { +skip: + io_ring_submit_unlock(ctx, issue_flags); + return IOU_ISSUE_SKIP_COMPLETE; + } + + /* + * 0 return means that we successfully set up the waiters, and that + * nobody triggered a wakeup while we were doing so. < 0 or 1 return + * is either an error or we got a wakeup while setting up. 
+ */ + if (!ret) { + hlist_add_head(&req->hash_node, &ctx->futex_list); + goto skip; + } + + io_ring_submit_unlock(ctx, issue_flags); + if (ret < 0) + req_set_fail(req); + else if (woken !=3D -1) + ret =3D woken; + io_req_set_res(req, ret, 0); + kfree(futexv); + req->flags &=3D ~REQ_F_ASYNC_DATA; + return IOU_OK; +} + int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags) { struct io_futex *iof =3D io_kiocb_to_cmd(req, struct io_futex); diff --git a/io_uring/futex.h b/io_uring/futex.h index ddc9e0d73c52..7828e27e4184 100644 --- a/io_uring/futex.h +++ b/io_uring/futex.h @@ -3,7 +3,9 @@ #include "cancel.h" =20 int io_futex_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); +int io_futexv_prep(struct io_kiocb *req, const struct io_uring_sqe *sqe); int io_futex_wait(struct io_kiocb *req, unsigned int issue_flags); +int io_futex_waitv(struct io_kiocb *req, unsigned int issue_flags); int io_futex_wake(struct io_kiocb *req, unsigned int issue_flags); =20 #if defined(CONFIG_FUTEX) diff --git a/io_uring/opdef.c b/io_uring/opdef.c index c9f23c21a031..2034acfe10d0 100644 --- a/io_uring/opdef.c +++ b/io_uring/opdef.c @@ -443,6 +443,14 @@ const struct io_issue_def io_issue_defs[] =3D { .issue =3D io_futex_wake, #else .prep =3D io_eopnotsupp_prep, +#endif + }, + [IORING_OP_FUTEX_WAITV] =3D { +#if defined(CONFIG_FUTEX) + .prep =3D io_futexv_prep, + .issue =3D io_futex_waitv, +#else + .prep =3D io_eopnotsupp_prep, #endif }, }; @@ -670,6 +678,9 @@ const struct io_cold_def io_cold_defs[] =3D { [IORING_OP_FUTEX_WAKE] =3D { .name =3D "FUTEX_WAKE", }, + [IORING_OP_FUTEX_WAITV] =3D { + .name =3D "FUTEX_WAITV", + }, }; =20 const char *io_uring_get_opcode(u8 opcode) --=20 2.40.1
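For illustration only (not part of the patch): the ownership bit that io_futexv_claim() and io_futexv_claimed() operate on in the patch above can be modelled in userspace with C11 atomics. This is a minimal sketch under stated assumptions: struct claim_state, claim() and claimed() are hypothetical names, and atomic_fetch_or() stands in for the kernel's test_and_set_bit(). The point is that the wake handler and the cancel path may both try to complete the request, and only the side that flips bit 0 may proceed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the futexv_owned word in struct io_futex. */
struct claim_state {
	atomic_ulong owned;
};

/* Has some side already taken ownership of the completion? */
static bool claimed(struct claim_state *s)
{
	return (atomic_load(&s->owned) & 1UL) != 0;
}

/*
 * Try to take ownership. A plain load first avoids the atomic
 * read-modify-write when the bit is already set; the fetch-or then
 * decides the race, since only one caller can see bit 0 clear in the
 * returned old value.
 */
static bool claim(struct claim_state *s)
{
	if (atomic_load(&s->owned) & 1UL)
		return false;
	return (atomic_fetch_or(&s->owned, 1UL) & 1UL) == 0;
}
```

In this model the first caller of claim() gets true and every later caller gets false, which is why a losing wake handler or cancel path can simply return without touching the request.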