From: Caleb Sander Mateos
To: Jens Axboe
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, Caleb Sander Mateos
Subject: [PATCH 1/4] io_uring: don't include filetable.h in io_uring.h
Date: Tue, 2 Sep 2025 21:26:53 -0600
Message-ID: <20250903032656.2012337-2-csander@purestorage.com>
In-Reply-To: <20250903032656.2012337-1-csander@purestorage.com>

io_uring/io_uring.h doesn't use anything declared in io_uring/filetable.h,
so drop the unnecessary #include. Add filetable.h includes in .c files
previously relying on the transitive include from io_uring.h.
Signed-off-by: Caleb Sander Mateos
---
 io_uring/cancel.c    | 1 +
 io_uring/fdinfo.c    | 2 +-
 io_uring/io_uring.c  | 1 +
 io_uring/io_uring.h  | 1 -
 io_uring/net.c       | 1 +
 io_uring/openclose.c | 1 +
 io_uring/register.c  | 1 +
 io_uring/rsrc.c      | 1 +
 io_uring/rw.c        | 1 +
 io_uring/splice.c    | 1 +
 10 files changed, 9 insertions(+), 2 deletions(-)

diff --git a/io_uring/cancel.c b/io_uring/cancel.c
index 6d57602304df..64b51e82baa2 100644
--- a/io_uring/cancel.c
+++ b/io_uring/cancel.c
@@ -9,10 +9,11 @@
 #include
 #include
 
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "tctx.h"
 #include "poll.h"
 #include "timeout.h"
 #include "waitid.h"
diff --git a/io_uring/fdinfo.c b/io_uring/fdinfo.c
index 5c7339838769..ff3364531c77 100644
--- a/io_uring/fdinfo.c
+++ b/io_uring/fdinfo.c
@@ -7,11 +7,11 @@
 #include
 #include
 
 #include
 
-#include "io_uring.h"
+#include "filetable.h"
 #include "sqpoll.h"
 #include "fdinfo.h"
 #include "cancel.h"
 #include "rsrc.h"
 
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 545a7d5eefec..9c1190b19adf 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -77,10 +77,11 @@
 
 #include
 
 #include "io-wq.h"
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "opdef.h"
 #include "refs.h"
 #include "tctx.h"
 #include "register.h"
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index fa8a66b34d4e..d62b7d9fafed 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -9,11 +9,10 @@
 #include
 #include
 #include "alloc_cache.h"
 #include "io-wq.h"
 #include "slist.h"
-#include "filetable.h"
 #include "opdef.h"
 
 #ifndef CREATE_TRACE_POINTS
 #include
 #endif
diff --git a/io_uring/net.c b/io_uring/net.c
index d2ca49ceb79d..cf4bf4a2264b 100644
--- a/io_uring/net.c
+++ b/io_uring/net.c
@@ -8,10 +8,11 @@
 #include
 #include
 
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "kbuf.h"
 #include "alloc_cache.h"
 #include "net.h"
 #include "notif.h"
diff --git a/io_uring/openclose.c b/io_uring/openclose.c
index d70700e5cef8..bfeb91b31bba 100644
--- a/io_uring/openclose.c
+++ b/io_uring/openclose.c
@@ -12,10 +12,11 @@
 
 #include
 
 #include "../fs/internal.h"
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "rsrc.h"
 #include "openclose.h"
 
 struct io_open {
diff --git a/io_uring/register.c b/io_uring/register.c
index aa5f56ad8358..5e493917a1a8 100644
--- a/io_uring/register.c
+++ b/io_uring/register.c
@@ -16,10 +16,11 @@
 #include
 #include
 #include
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "opdef.h"
 #include "tctx.h"
 #include "rsrc.h"
 #include "sqpoll.h"
diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index f75f5e43fa4a..2d15b8785a95 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -11,10 +11,11 @@
 #include
 #include
 
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "openclose.h"
 #include "rsrc.h"
 #include "memmap.h"
 #include "register.h"
diff --git a/io_uring/rw.c b/io_uring/rw.c
index dcde5bb7421a..ab6b4afccec3 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -13,10 +13,11 @@
 #include
 #include
 
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "opdef.h"
 #include "kbuf.h"
 #include "alloc_cache.h"
 #include "rsrc.h"
diff --git a/io_uring/splice.c b/io_uring/splice.c
index 35ce4e60b495..e81ebbb91925 100644
--- a/io_uring/splice.c
+++ b/io_uring/splice.c
@@ -9,10 +9,11 @@
 #include
 #include
 
 #include
 
+#include "filetable.h"
 #include "io_uring.h"
 #include "splice.h"
 
 struct io_splice {
	struct file *file_out;
-- 
2.45.2
From: Caleb Sander Mateos
To: Jens Axboe
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, Caleb Sander Mateos
Subject: [PATCH 2/4] io_uring/rsrc: respect submitter_task in io_register_clone_buffers()
Date: Tue, 2 Sep 2025 21:26:54 -0600
Message-ID: <20250903032656.2012337-3-csander@purestorage.com>
In-Reply-To: <20250903032656.2012337-1-csander@purestorage.com>

io_ring_ctx's enabled with IORING_SETUP_SINGLE_ISSUER allow only a single
task to submit to the ctx. Although the documentation mentions this
restriction only for io_uring_enter() syscalls, commit d7cce96c449e
("io_uring: limit registration w/ SINGLE_ISSUER") extends it to
io_uring_register(). Ensuring only one task interacts with the io_ring_ctx
will be important to allow that task to avoid taking the uring_lock.

There is, however, one gap in these checks: io_register_clone_buffers() may
take the uring_lock on a second (source) io_ring_ctx, but
__io_uring_register() only checks the current task against the
*destination* io_ring_ctx's submitter_task. Fail IORING_REGISTER_CLONE_BUFFERS
with -EEXIST if the source io_ring_ctx has a registered submitter_task other
than the current task.
Signed-off-by: Caleb Sander Mateos
---
 io_uring/rsrc.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c
index 2d15b8785a95..1e5b7833076a 100644
--- a/io_uring/rsrc.c
+++ b/io_uring/rsrc.c
@@ -1298,14 +1298,21 @@ int io_register_clone_buffers(struct io_ring_ctx *ctx, void __user *arg)
 
	src_ctx = file->private_data;
	if (src_ctx != ctx) {
		mutex_unlock(&ctx->uring_lock);
		lock_two_rings(ctx, src_ctx);
+
+		if (src_ctx->submitter_task &&
+		    src_ctx->submitter_task != current) {
+			ret = -EEXIST;
+			goto out;
+		}
	}
 
	ret = io_clone_buffers(ctx, src_ctx, &buf);
 
+out:
	if (src_ctx != ctx)
		mutex_unlock(&src_ctx->uring_lock);
 
	fput(file);
	return ret;
-- 
2.45.2
From: Caleb Sander Mateos
To: Jens Axboe
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, Caleb Sander Mateos
Subject: [PATCH 3/4] io_uring: factor out uring_lock helpers
Date: Tue, 2 Sep 2025 21:26:55 -0600
Message-ID: <20250903032656.2012337-4-csander@purestorage.com>
In-Reply-To: <20250903032656.2012337-1-csander@purestorage.com>

A subsequent commit will skip acquiring the io_ring_ctx uring_lock in
io_uring_enter() and io_handle_tw_list() for IORING_SETUP_SINGLE_ISSUER.
Prepare for this change by factoring the uring_lock accesses in these
functions out into helper functions:

- io_ring_ctx_lock() for mutex_lock(&ctx->uring_lock)
- io_ring_ctx_unlock() for mutex_unlock(&ctx->uring_lock)
- io_ring_ctx_assert_locked() for lockdep_assert_held(&ctx->uring_lock)

For now, the helpers unconditionally call the mutex functions. But a
subsequent commit will condition them on !IORING_SETUP_SINGLE_ISSUER.

Signed-off-by: Caleb Sander Mateos
---
 io_uring/filetable.c |  3 ++-
 io_uring/io_uring.c  | 51 ++++++++++++++++++++++++++------------------
 io_uring/io_uring.h  | 28 ++++++++++++++++++------
 io_uring/kbuf.c      |  6 +++---
 io_uring/notif.c     |  5 +++--
 io_uring/notif.h     |  3 ++-
 io_uring/poll.c      |  2 +-
 io_uring/rsrc.c      |  2 +-
 io_uring/rsrc.h      |  3 ++-
 io_uring/rw.c        |  2 +-
 io_uring/waitid.c    |  2 +-
 11 files changed, 67 insertions(+), 40 deletions(-)

diff --git a/io_uring/filetable.c b/io_uring/filetable.c
index a21660e3145a..aae283e77856 100644
--- a/io_uring/filetable.c
+++ b/io_uring/filetable.c
@@ -55,14 +55,15 @@ void io_free_file_tables(struct io_ring_ctx *ctx, struct io_file_table *table)
	table->bitmap = NULL;
 }
 
 static int io_install_fixed_file(struct io_ring_ctx *ctx, struct file *file,
				 u32 slot_index)
-	__must_hold(&req->ctx->uring_lock)
 {
	struct io_rsrc_node *node;
 
+	io_ring_ctx_assert_locked(ctx);
+
	if (io_is_uring_fops(file))
		return -EBADF;
	if (!ctx->file_table.data.nr)
		return -ENXIO;
	if (slot_index >= ctx->file_table.data.nr)
diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 9c1190b19adf..7f19b6da5d3d 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -554,11 +554,11 @@ static unsigned io_linked_nr(struct io_kiocb *req)
 
 static __cold noinline void io_queue_deferred(struct io_ring_ctx *ctx)
 {
	bool drain_seen = false, first = true;
 
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
	__io_req_caches_free(ctx);
 
	while (!list_empty(&ctx->defer_list)) {
		struct io_defer_entry *de = list_first_entry(&ctx->defer_list,
						struct io_defer_entry, list);
@@ -925,11 +925,11 @@ bool io_post_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags
  * Must be called from inline task_work so we now a flush will happen later,
  * and obviously with ctx->uring_lock held (tw always has that).
  */
 void io_add_aux_cqe(struct io_ring_ctx *ctx, u64 user_data, s32 res, u32 cflags)
 {
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
	lockdep_assert(ctx->lockless_cq);
 
	if (!io_fill_cqe_aux(ctx, user_data, res, cflags)) {
		struct io_cqe cqe = io_init_cqe(user_data, res, cflags);
 
@@ -954,11 +954,11 @@ bool io_req_post_cqe(struct io_kiocb *req, s32 res, u32 cflags)
	 */
	if (!wq_list_empty(&ctx->submit_state.compl_reqs))
		__io_submit_flush_completions(ctx);
 
	lockdep_assert(!io_wq_current_is_worker());
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
 
	if (!ctx->lockless_cq) {
		spin_lock(&ctx->completion_lock);
		posted = io_fill_cqe_aux(ctx, req->cqe.user_data, res, cflags);
		spin_unlock(&ctx->completion_lock);
@@ -978,11 +978,11 @@ bool io_req_post_cqe32(struct io_kiocb *req, struct io_uring_cqe cqe[2])
 {
	struct io_ring_ctx *ctx = req->ctx;
	bool posted;
 
	lockdep_assert(!io_wq_current_is_worker());
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
 
	cqe[0].user_data = req->cqe.user_data;
	if (!ctx->lockless_cq) {
		spin_lock(&ctx->completion_lock);
		posted = io_fill_cqe_aux32(ctx, cqe);
@@ -1032,15 +1032,14 @@ static void io_req_complete_post(struct io_kiocb *req, unsigned issue_flags)
	 */
	req_ref_put(req);
 }
 
 void io_req_defer_failed(struct io_kiocb *req, s32 res)
-	__must_hold(&ctx->uring_lock)
 {
	const struct io_cold_def *def = &io_cold_defs[req->opcode];
 
-	lockdep_assert_held(&req->ctx->uring_lock);
+	io_ring_ctx_assert_locked(req->ctx);
 
	req_set_fail(req);
	io_req_set_res(req, res, io_put_kbuf(req, res, NULL));
	if (def->fail)
		def->fail(req);
@@ -1052,16 +1051,17 @@ void io_req_defer_failed(struct io_kiocb *req, s32 res)
  * handlers and io_issue_sqe() are done with it, e.g. inline completion path.
  * Because of that, io_alloc_req() should be called only under ->uring_lock
  * and with extra caution to not get a request that is still worked on.
  */
 __cold bool __io_alloc_req_refill(struct io_ring_ctx *ctx)
-	__must_hold(&ctx->uring_lock)
 {
	gfp_t gfp = GFP_KERNEL | __GFP_NOWARN | __GFP_ZERO;
	void *reqs[IO_REQ_ALLOC_BATCH];
	int ret;
 
+	io_ring_ctx_assert_locked(ctx);
+
	ret = kmem_cache_alloc_bulk(req_cachep, gfp, ARRAY_SIZE(reqs), reqs);
 
	/*
	 * Bulk alloc is all-or-nothing. If we fail to get a batch,
	 * retry single alloc to be on the safe side.
@@ -1126,11 +1126,11 @@ static void ctx_flush_and_put(struct io_ring_ctx *ctx, io_tw_token_t tw)
		return;
	if (ctx->flags & IORING_SETUP_TASKRUN_FLAG)
		atomic_andnot(IORING_SQ_TASKRUN, &ctx->rings->sq_flags);
 
	io_submit_flush_completions(ctx);
-	mutex_unlock(&ctx->uring_lock);
+	io_ring_ctx_unlock(ctx);
	percpu_ref_put(&ctx->refs);
 }
 
 /*
  * Run queued task_work, returning the number of entries processed in *count.
@@ -1150,11 +1150,11 @@ struct llist_node *io_handle_tw_list(struct llist_node *node,
						    io_task_work.node);
 
		if (req->ctx != ctx) {
			ctx_flush_and_put(ctx, ts);
			ctx = req->ctx;
-			mutex_lock(&ctx->uring_lock);
+			io_ring_ctx_lock(ctx);
			percpu_ref_get(&ctx->refs);
		}
		INDIRECT_CALL_2(req->io_task_work.func,
				io_poll_task_func, io_req_rw_complete,
				req, ts);
@@ -1502,12 +1502,13 @@ static inline void io_req_put_rsrc_nodes(struct io_kiocb *req)
	io_put_rsrc_node(req->ctx, req->buf_node);
 }
 
 static void io_free_batch_list(struct io_ring_ctx *ctx,
			       struct io_wq_work_node *node)
-	__must_hold(&ctx->uring_lock)
 {
+	io_ring_ctx_assert_locked(ctx);
+
	do {
		struct io_kiocb *req = container_of(node, struct io_kiocb,
						    comp_list);
 
		if (unlikely(req->flags & IO_REQ_CLEAN_SLOW_FLAGS)) {
@@ -1543,15 +1544,16 @@ static void io_free_batch_list(struct io_ring_ctx *ctx,
			io_req_add_to_cache(req, ctx);
	} while (node);
 }
 
 void __io_submit_flush_completions(struct io_ring_ctx *ctx)
-	__must_hold(&ctx->uring_lock)
 {
	struct io_submit_state *state = &ctx->submit_state;
	struct io_wq_work_node *node;
 
+	io_ring_ctx_assert_locked(ctx);
+
	__io_cq_lock(ctx);
	__wq_list_for_each(node, &state->compl_reqs) {
		struct io_kiocb *req = container_of(node, struct io_kiocb,
						    comp_list);
 
@@ -1767,16 +1769,17 @@ io_req_flags_t io_file_get_flags(struct file *file)
		res |= REQ_F_SUPPORT_NOWAIT;
	return res;
 }
 
 static __cold void io_drain_req(struct io_kiocb *req)
-	__must_hold(&ctx->uring_lock)
 {
	struct io_ring_ctx *ctx = req->ctx;
	bool drain = req->flags & IOSQE_IO_DRAIN;
	struct io_defer_entry *de;
 
+	io_ring_ctx_assert_locked(ctx);
+
	de = kmalloc(sizeof(*de), GFP_KERNEL_ACCOUNT);
	if (!de) {
		io_req_defer_failed(req, -ENOMEM);
		return;
	}
@@ -2043,12 +2046,13 @@ static int io_req_sqe_copy(struct io_kiocb *req, unsigned int issue_flags)
	def->sqe_copy(req);
	return 0;
 }
 
 static void io_queue_async(struct io_kiocb *req, unsigned int issue_flags, int ret)
-	__must_hold(&req->ctx->uring_lock)
 {
+	io_ring_ctx_assert_locked(req->ctx);
+
	if (ret != -EAGAIN || (req->flags & REQ_F_NOWAIT)) {
fail:
		io_req_defer_failed(req, ret);
		return;
	}
@@ -2068,16 +2072,17 @@ static void io_queue_async(struct io_kiocb *req, unsigned int issue_flags, int r
		break;
	}
 }
 
 static inline void io_queue_sqe(struct io_kiocb *req, unsigned int extra_flags)
-	__must_hold(&req->ctx->uring_lock)
 {
	unsigned int issue_flags = IO_URING_F_NONBLOCK |
		IO_URING_F_COMPLETE_DEFER | extra_flags;
	int ret;
 
+	io_ring_ctx_assert_locked(req->ctx);
+
	ret = io_issue_sqe(req, issue_flags);
 
	/*
	 * We async punt it if the file wasn't marked NOWAIT, or if the file
	 * doesn't support non-blocking read/write attempts
@@ -2085,12 +2090,13 @@ static inline void io_queue_sqe(struct io_kiocb *req, unsigned int extra_flags)
	if (unlikely(ret))
		io_queue_async(req, issue_flags, ret);
 }
 
 static void io_queue_sqe_fallback(struct io_kiocb *req)
-	__must_hold(&req->ctx->uring_lock)
 {
+	io_ring_ctx_assert_locked(req->ctx);
+
	if (unlikely(req->flags & REQ_F_FAIL)) {
		/*
		 * We don't submit, fail them all, for that replace hardlinks
		 * with normal links. Extra REQ_F_LINK is tolerated.
		 */
@@ -2155,17 +2161,18 @@ static __cold int io_init_fail_req(struct io_kiocb *req, int err)
	return err;
 }
 
 static int io_init_req(struct io_ring_ctx *ctx, struct io_kiocb *req,
		       const struct io_uring_sqe *sqe)
-	__must_hold(&ctx->uring_lock)
 {
	const struct io_issue_def *def;
	unsigned int sqe_flags;
	int personality;
	u8 opcode;
 
+	io_ring_ctx_assert_locked(ctx);
+
	req->ctx = ctx;
	req->opcode = opcode = READ_ONCE(sqe->opcode);
	/* same numerical values with corresponding REQ_F_*, safe to copy */
	sqe_flags = READ_ONCE(sqe->flags);
	req->flags = (__force io_req_flags_t) sqe_flags;
@@ -2290,15 +2297,16 @@ static __cold int io_submit_fail_init(const struct io_uring_sqe *sqe,
	return 0;
 }
 
 static inline int io_submit_sqe(struct io_ring_ctx *ctx, struct io_kiocb *req,
			const struct io_uring_sqe *sqe)
-	__must_hold(&ctx->uring_lock)
 {
	struct io_submit_link *link = &ctx->submit_state.link;
	int ret;
 
+	io_ring_ctx_assert_locked(ctx);
+
	ret = io_init_req(ctx, req, sqe);
	if (unlikely(ret))
		return io_submit_fail_init(sqe, req, ret);
 
	trace_io_uring_submit_req(req);
@@ -2419,16 +2427,17 @@ static bool io_get_sqe(struct io_ring_ctx *ctx, const struct io_uring_sqe **sqe)
	*sqe = &ctx->sq_sqes[head];
	return true;
 }
 
 int io_submit_sqes(struct io_ring_ctx *ctx, unsigned int nr)
-	__must_hold(&ctx->uring_lock)
 {
	unsigned int entries = io_sqring_entries(ctx);
	unsigned int left;
	int ret;
 
+	io_ring_ctx_assert_locked(ctx);
+
	if (unlikely(!entries))
		return 0;
	/* make sure SQ entry isn't read before tail */
	ret = left = min(nr, entries);
	io_get_task_refs(left);
@@ -3518,14 +3527,14 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
	} else if (to_submit) {
		ret = io_uring_add_tctx_node(ctx);
		if (unlikely(ret))
			goto out;
 
-		mutex_lock(&ctx->uring_lock);
+		io_ring_ctx_lock(ctx);
		ret = io_submit_sqes(ctx, to_submit);
		if (ret != to_submit) {
-			mutex_unlock(&ctx->uring_lock);
+			io_ring_ctx_unlock(ctx);
			goto out;
		}
		if (flags & IORING_ENTER_GETEVENTS) {
			if (ctx->syscall_iopoll)
				goto iopoll_locked;
@@ -3534,11 +3543,11 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
			 * it should handle ownership problems if any.
			 */
			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
				(void)io_run_local_work_locked(ctx, min_complete);
		}
-		mutex_unlock(&ctx->uring_lock);
+		io_ring_ctx_unlock(ctx);
	}
 
	if (flags & IORING_ENTER_GETEVENTS) {
		int ret2;
 
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index d62b7d9fafed..a0580a1bf6b5 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -119,20 +119,35 @@ bool __io_alloc_req_refill(struct io_ring_ctx *ctx);
 bool io_match_task_safe(struct io_kiocb *head, struct io_uring_task *tctx,
			bool cancel_all);
 
 void io_activate_pollwq(struct io_ring_ctx *ctx);
 
+static inline void io_ring_ctx_lock(struct io_ring_ctx *ctx)
+{
+	mutex_lock(&ctx->uring_lock);
+}
+
+static inline void io_ring_ctx_unlock(struct io_ring_ctx *ctx)
+{
+	mutex_unlock(&ctx->uring_lock);
+}
+
+static inline void io_ring_ctx_assert_locked(const struct io_ring_ctx *ctx)
+{
+	lockdep_assert_held(&ctx->uring_lock);
+}
+
 static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
 {
 #if defined(CONFIG_PROVE_LOCKING)
	lockdep_assert(in_task());
 
	if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
-		lockdep_assert_held(&ctx->uring_lock);
+		io_ring_ctx_assert_locked(ctx);
 
	if (ctx->flags & IORING_SETUP_IOPOLL) {
-		lockdep_assert_held(&ctx->uring_lock);
+		io_ring_ctx_assert_locked(ctx);
	} else if (!ctx->task_complete) {
		lockdep_assert_held(&ctx->completion_lock);
	} else if (ctx->submitter_task) {
		/*
		 * ->submitter_task may be NULL and we can still post a CQE,
@@ -300,11 +315,11 @@ static inline void io_put_file(struct io_kiocb *req)
 }
 
 static inline void io_ring_submit_unlock(struct io_ring_ctx *ctx,
					 unsigned issue_flags)
 {
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
	if (unlikely(issue_flags & IO_URING_F_UNLOCKED))
		mutex_unlock(&ctx->uring_lock);
 }
 
 static inline void io_ring_submit_lock(struct io_ring_ctx *ctx,
@@ -316,11 +331,11 @@ static inline void io_ring_submit_lock(struct io_ring_ctx *ctx,
	 * The only exception is when we've detached the request and issue it
	 * from an async worker thread, grab the lock for that case.
	 */
	if (unlikely(issue_flags & IO_URING_F_UNLOCKED))
		mutex_lock(&ctx->uring_lock);
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
 }
 
 static inline void io_commit_cqring(struct io_ring_ctx *ctx)
 {
	/* order cqe stores with ring update */
@@ -428,24 +443,23 @@ static inline bool io_task_work_pending(struct io_ring_ctx *ctx)
	return task_work_pending(current) || io_local_work_pending(ctx);
 }
 
 static inline void io_tw_lock(struct io_ring_ctx *ctx, io_tw_token_t tw)
 {
-	lockdep_assert_held(&ctx->uring_lock);
+	io_ring_ctx_assert_locked(ctx);
 }
 
 /*
  * Don't complete immediately but use deferred completion infrastructure.
  * Protected by ->uring_lock and can only be used either with
  * IO_URING_F_COMPLETE_DEFER or inside a tw handler holding the mutex.
*/ static inline void io_req_complete_defer(struct io_kiocb *req) - __must_hold(&req->ctx->uring_lock) { struct io_submit_state *state =3D &req->ctx->submit_state; =20 - lockdep_assert_held(&req->ctx->uring_lock); + io_ring_ctx_assert_locked(req->ctx); =20 wq_list_add_tail(&req->comp_list, &state->compl_reqs); } =20 static inline void io_commit_cqring_flush(struct io_ring_ctx *ctx) diff --git a/io_uring/kbuf.c b/io_uring/kbuf.c index 3e9aab21af9d..ea6f3588d875 100644 --- a/io_uring/kbuf.c +++ b/io_uring/kbuf.c @@ -68,11 +68,11 @@ bool io_kbuf_commit(struct io_kiocb *req, } =20 static inline struct io_buffer_list *io_buffer_get_list(struct io_ring_ctx= *ctx, unsigned int bgid) { - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); =20 return xa_load(&ctx->io_bl_xa, bgid); } =20 static int io_buffer_add_list(struct io_ring_ctx *ctx, @@ -337,11 +337,11 @@ int io_buffers_peek(struct io_kiocb *req, struct buf_= sel_arg *arg, { struct io_ring_ctx *ctx =3D req->ctx; struct io_buffer_list *bl; int ret; =20 - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); =20 bl =3D io_buffer_get_list(ctx, arg->buf_group); if (unlikely(!bl)) return -ENOENT; =20 @@ -393,11 +393,11 @@ static int io_remove_buffers_legacy(struct io_ring_ct= x *ctx, { unsigned long i =3D 0; struct io_buffer *nxt; =20 /* protects io_buffers_cache */ - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); WARN_ON_ONCE(bl->flags & IOBL_BUF_RING); =20 for (i =3D 0; i < nbufs && !list_empty(&bl->buf_list); i++) { nxt =3D list_first_entry(&bl->buf_list, struct io_buffer, list); list_del(&nxt->list); diff --git a/io_uring/notif.c b/io_uring/notif.c index 8c92e9cde2c6..9dd248fcb213 100644 --- a/io_uring/notif.c +++ b/io_uring/notif.c @@ -14,11 +14,11 @@ static const struct ubuf_info_ops io_ubuf_ops; static void io_notif_tw_complete(struct io_kiocb *notif, io_tw_token_t tw) { struct io_notif_data *nd =3D io_notif_to_data(notif); struct io_ring_ctx *ctx 
=3D notif->ctx; =20 - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); =20 do { notif =3D cmd_to_io_kiocb(nd); =20 if (WARN_ON_ONCE(ctx !=3D notif->ctx)) @@ -108,15 +108,16 @@ static const struct ubuf_info_ops io_ubuf_ops =3D { .complete =3D io_tx_ubuf_complete, .link_skb =3D io_link_skb, }; =20 struct io_kiocb *io_alloc_notif(struct io_ring_ctx *ctx) - __must_hold(&ctx->uring_lock) { struct io_kiocb *notif; struct io_notif_data *nd; =20 + io_ring_ctx_assert_locked(ctx); + if (unlikely(!io_alloc_req(ctx, ¬if))) return NULL; notif->ctx =3D ctx; notif->opcode =3D IORING_OP_NOP; notif->flags =3D 0; diff --git a/io_uring/notif.h b/io_uring/notif.h index f3589cfef4a9..c33c9a1179c9 100644 --- a/io_uring/notif.h +++ b/io_uring/notif.h @@ -31,14 +31,15 @@ static inline struct io_notif_data *io_notif_to_data(st= ruct io_kiocb *notif) { return io_kiocb_to_cmd(notif, struct io_notif_data); } =20 static inline void io_notif_flush(struct io_kiocb *notif) - __must_hold(¬if->ctx->uring_lock) { struct io_notif_data *nd =3D io_notif_to_data(notif); =20 + io_ring_ctx_assert_locked(notif->ctx); + io_tx_ubuf_complete(NULL, &nd->uarg, true); } =20 static inline int io_notif_account_mem(struct io_kiocb *notif, unsigned le= n) { diff --git a/io_uring/poll.c b/io_uring/poll.c index ea75c5cd81a0..ba71403c8fd8 100644 --- a/io_uring/poll.c +++ b/io_uring/poll.c @@ -121,11 +121,11 @@ static struct io_poll *io_poll_get_single(struct io_k= iocb *req) static void io_poll_req_insert(struct io_kiocb *req) { struct io_hash_table *table =3D &req->ctx->cancel_table; u32 index =3D hash_long(req->cqe.user_data, table->hash_bits); =20 - lockdep_assert_held(&req->ctx->uring_lock); + io_ring_ctx_assert_locked(req->ctx); =20 hlist_add_head(&req->hash_node, &table->hbs[index].list); } =20 static void io_init_poll_iocb(struct io_poll *poll, __poll_t events) diff --git a/io_uring/rsrc.c b/io_uring/rsrc.c index 1e5b7833076a..1c1753de7340 100644 --- a/io_uring/rsrc.c +++ 
b/io_uring/rsrc.c @@ -347,11 +347,11 @@ static int __io_register_rsrc_update(struct io_ring_c= tx *ctx, unsigned type, struct io_uring_rsrc_update2 *up, unsigned nr_args) { __u32 tmp; =20 - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); =20 if (check_add_overflow(up->offset, nr_args, &tmp)) return -EOVERFLOW; =20 switch (type) { diff --git a/io_uring/rsrc.h b/io_uring/rsrc.h index a3ca6ba66596..d537a3b895d6 100644 --- a/io_uring/rsrc.h +++ b/io_uring/rsrc.h @@ -2,10 +2,11 @@ #ifndef IOU_RSRC_H #define IOU_RSRC_H =20 #include #include +#include "io_uring.h" =20 #define IO_VEC_CACHE_SOFT_CAP 256 =20 enum { IORING_RSRC_FILE =3D 0, @@ -97,11 +98,11 @@ static inline struct io_rsrc_node *io_rsrc_node_lookup(= struct io_rsrc_data *data return NULL; } =20 static inline void io_put_rsrc_node(struct io_ring_ctx *ctx, struct io_rsr= c_node *node) { - lockdep_assert_held(&ctx->uring_lock); + io_ring_ctx_assert_locked(ctx); if (!--node->refs) io_free_rsrc_node(ctx, node); } =20 static inline bool io_reset_rsrc_node(struct io_ring_ctx *ctx, diff --git a/io_uring/rw.c b/io_uring/rw.c index ab6b4afccec3..f00e02a02dc7 100644 --- a/io_uring/rw.c +++ b/io_uring/rw.c @@ -461,11 +461,11 @@ int io_read_mshot_prep(struct io_kiocb *req, const st= ruct io_uring_sqe *sqe) return 0; } =20 void io_readv_writev_cleanup(struct io_kiocb *req) { - lockdep_assert_held(&req->ctx->uring_lock); + io_ring_ctx_assert_locked(req->ctx); io_rw_recycle(req, 0); } =20 static inline loff_t *io_kiocb_update_pos(struct io_kiocb *req) { diff --git a/io_uring/waitid.c b/io_uring/waitid.c index 26c118f3918d..f7a5054d4d81 100644 --- a/io_uring/waitid.c +++ b/io_uring/waitid.c @@ -114,11 +114,11 @@ static void io_waitid_complete(struct io_kiocb *req, = int ret) struct io_waitid *iw =3D io_kiocb_to_cmd(req, struct io_waitid); =20 /* anyone completing better be holding a reference */ WARN_ON_ONCE(!(atomic_read(&iw->refs) & IO_WAITID_REF_MASK)); =20 - 
lockdep_assert_held(&req->ctx->uring_lock); + io_ring_ctx_assert_locked(req->ctx); =20 hlist_del_init(&req->hash_node); =20 ret =3D io_waitid_finish(req, ret); if (ret < 0) --=20 2.45.2 From nobody Fri Oct 3 08:49:05 2025 Received: from mail-qk1-f225.google.com (mail-qk1-f225.google.com [209.85.222.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 333692C3247 for ; Wed, 3 Sep 2025 03:27:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.225 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756870022; cv=none; b=MuhkBVCG2549ULh46zRJ2UxNiiIgb+mSKnt7n75geu3Gw6c5leQ5004tCQo0IyjL7beuwqDS5El37+iRFeSGTo3EwVUa+KLfLRnA0/ixPaMnPeG0RP2cvHnevR/ohnU4fT/ApZ4w+5UkM3QaLyoeb2hEdFDPGpd09vs7j5WWDgg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1756870022; c=relaxed/simple; bh=ImgJ+08fKEjfMojQOkHQFAmBOTeJebqjPQWHFiTfvp0=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ju/5fsguQCZvLSPswBFFqZEvkzt8FmmsuihIGknmRO5e51Oe+lWqWkC0XMl9iCsl8Ey56nnidT1N3Ez0lGpFzur8djX1/tLeLGjxBsA1SA7IQ2VFQStVYOTMSH0GnK8XlsUA4ZoenAfmtyHnYCXgAMrST9raqqYnLcYmp3O1eRg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com; spf=fail smtp.mailfrom=purestorage.com; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b=PsnShx/C; arc=none smtp.client-ip=209.85.222.225 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=purestorage.com Authentication-Results: smtp.subspace.kernel.org; spf=fail smtp.mailfrom=purestorage.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=purestorage.com header.i=@purestorage.com header.b="PsnShx/C" Received: by mail-qk1-f225.google.com with SMTP id 
From: Caleb Sander Mateos
To: Jens Axboe
Cc: io-uring@vger.kernel.org, linux-kernel@vger.kernel.org, Caleb Sander Mateos
Subject: [PATCH 4/4] io_uring: avoid uring_lock for IORING_SETUP_SINGLE_ISSUER
Date: Tue, 2 Sep 2025 21:26:56 -0600
Message-ID: <20250903032656.2012337-5-csander@purestorage.com>
X-Mailer: git-send-email 2.45.2
In-Reply-To: <20250903032656.2012337-1-csander@purestorage.com>
References: <20250903032656.2012337-1-csander@purestorage.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

io_ring_ctx's mutex uring_lock can be quite expensive in high-IOPS
workloads.
Even when only one thread pinned to a single CPU is accessing the
io_ring_ctx, the atomic CAS required to lock and unlock the mutex is a
very hot instruction.

The mutex's primary purpose is to prevent concurrent io_uring system
calls on the same io_ring_ctx. However, there is already a flag
IORING_SETUP_SINGLE_ISSUER that promises only one task will make
io_uring_enter() and io_uring_register() system calls on the
io_ring_ctx once it's enabled. So if the io_ring_ctx is set up with
IORING_SETUP_SINGLE_ISSUER, skip the uring_lock mutex_lock() and
mutex_unlock() for the io_uring_enter() submission as well as for
io_handle_tw_list(). io_uring_enter() submission calls
__io_uring_add_tctx_node_from_submit() to verify the current task
matches submitter_task for IORING_SETUP_SINGLE_ISSUER. And task work
can only be scheduled on tasks that submit io_uring requests, so
io_handle_tw_list() will also only be called on submitter_task.

There is a goto from the io_uring_enter() submission to the middle of
the IOPOLL block which assumed the uring_lock would already be held.
This is no longer the case for IORING_SETUP_SINGLE_ISSUER, so goto the
preceding mutex_lock() in that case.

It may be possible to avoid taking uring_lock in other places too for
IORING_SETUP_SINGLE_ISSUER, but these two cover the primary hot paths.
The uring_lock in io_uring_register() is necessary at least before the
io_uring is enabled because submitter_task isn't set yet. uring_lock is
also used to synchronize IOPOLL on submitting tasks with io_uring
worker tasks, so it's still needed there. But in principle, it should
be possible to remove the mutex entirely for IORING_SETUP_SINGLE_ISSUER
by running any code needing exclusive access to the io_ring_ctx in task
work context on submitter_task.
Signed-off-by: Caleb Sander Mateos
---
 io_uring/io_uring.c |  6 +++++-
 io_uring/io_uring.h | 14 ++++++++++++++
 2 files changed, 19 insertions(+), 1 deletion(-)

diff --git a/io_uring/io_uring.c b/io_uring/io_uring.c
index 7f19b6da5d3d..5793f6122159 100644
--- a/io_uring/io_uring.c
+++ b/io_uring/io_uring.c
@@ -3534,12 +3534,15 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		if (ret != to_submit) {
 			io_ring_ctx_unlock(ctx);
 			goto out;
 		}
 		if (flags & IORING_ENTER_GETEVENTS) {
-			if (ctx->syscall_iopoll)
+			if (ctx->syscall_iopoll) {
+				if (ctx->flags & IORING_SETUP_SINGLE_ISSUER)
+					goto iopoll;
 				goto iopoll_locked;
+			}
 			/*
 			 * Ignore errors, we'll soon call io_cqring_wait() and
 			 * it should handle ownership problems if any.
 			 */
 			if (ctx->flags & IORING_SETUP_DEFER_TASKRUN)
@@ -3556,10 +3559,11 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u32, to_submit,
 		 * We disallow the app entering submit/complete with
 		 * polling, but we still need to lock the ring to
 		 * prevent racing with polled issue that got punted to
 		 * a workqueue.
 		 */
+iopoll:
 		mutex_lock(&ctx->uring_lock);
 iopoll_locked:
 		ret2 = io_validate_ext_arg(ctx, flags, argp, argsz);
 		if (likely(!ret2))
 			ret2 = io_iopoll_check(ctx, min_complete);
diff --git a/io_uring/io_uring.h b/io_uring/io_uring.h
index a0580a1bf6b5..7296b12b0897 100644
--- a/io_uring/io_uring.h
+++ b/io_uring/io_uring.h
@@ -121,20 +121,34 @@ bool io_match_task_safe(struct io_kiocb *head, struct io_uring_task *tctx,
 
 void io_activate_pollwq(struct io_ring_ctx *ctx);
 
 static inline void io_ring_ctx_lock(struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+		WARN_ON_ONCE(current != ctx->submitter_task);
+		return;
+	}
+
 	mutex_lock(&ctx->uring_lock);
 }
 
 static inline void io_ring_ctx_unlock(struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER) {
+		WARN_ON_ONCE(current != ctx->submitter_task);
+		return;
+	}
+
 	mutex_unlock(&ctx->uring_lock);
 }
 
 static inline void io_ring_ctx_assert_locked(const struct io_ring_ctx *ctx)
 {
+	if (ctx->flags & IORING_SETUP_SINGLE_ISSUER &&
+	    current == ctx->submitter_task)
+		return;
+
 	lockdep_assert_held(&ctx->uring_lock);
 }
 
 static inline void io_lockdep_assert_cq_locked(struct io_ring_ctx *ctx)
 {
-- 
2.45.2