From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 660CDC433EF for ; Thu, 21 Apr 2022 09:17:19 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387467AbiDUJUG (ORCPT ); Thu, 21 Apr 2022 05:20:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57290 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387546AbiDUJTt (ORCPT ); Thu, 21 Apr 2022 05:19:49 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0BF55289BA for ; Thu, 21 Apr 2022 02:16:57 -0700 (PDT) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.16.1.2/8.16.1.2) with ESMTP id 23L0VLZ0019884 for ; Thu, 21 Apr 2022 02:16:57 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=mjM0wwI9pOqSPCCC1KVwggul7a+EfCAqQSAjGkjuJtg=; b=Sx64OVf8J+WH+y0eMsjwIVf5rQ0Q4rYMvJdDeS+OFZfhyJte/5yXAf0oPhy5gqetuKol WqmEp/iVaO4YgLwHD4K3vsqLwM0LtCB8i66aIxch6f46po9+6a02CScUq5nMLsF+Sxv6 ELgjr9IBarWVwaNykaJbdM/CwJ3VyB6K8+A= Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3fjvsvstve-6 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:16:57 -0700 Received: from twshared8053.07.ash9.facebook.com (2620:10d:c085:208::11) by mail.thefacebook.com (2620:10d:c085:11d::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:16:54 -0700 Received: by devbig039.lla1.facebook.com 
(Postfix, from userid 572232) id DF8B97CA75FA; Thu, 21 Apr 2022 02:14:01 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 1/6] io_uring: add trace support for CQE overflow Date: Thu, 21 Apr 2022 02:13:40 -0700 Message-ID: <20220421091345.2115755-2-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-GUID: UVXTBXW7w1RsjO1ZzC6SPKEHESpra43r X-Proofpoint-ORIG-GUID: UVXTBXW7w1RsjO1ZzC6SPKEHESpra43r X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Add trace function for overflowing CQ ring. Signed-off-by: Dylan Yudaken --- include/trace/events/io_uring.h | 42 ++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/include/trace/events/io_uring.h b/include/trace/events/io_urin= g.h index cddf5b6fbeb4..42534ec2ab9d 100644 --- a/include/trace/events/io_uring.h +++ b/include/trace/events/io_uring.h @@ -530,7 +530,7 @@ TRACE_EVENT(io_uring_req_failed, ), =20 TP_printk("ring %p, req %p, user_data 0x%llx, " - "op %d, flags 0x%x, prio=3D%d, off=3D%llu, addr=3D%llu, " + "op %d, flags 0x%x, prio=3D%d, off=3D%llu, addr=3D%llu, " "len=3D%u, rw_flags=3D0x%x, buf_index=3D%d, " "personality=3D%d, file_index=3D%d, pad=3D0x%llx/%llx, error=3D%d", __entry->ctx, __entry->req, __entry->user_data, @@ -543,6 +543,46 @@ TRACE_EVENT(io_uring_req_failed, (unsigned long long) __entry->pad2, __entry->error) ); =20 + +/* + * io_uring_cqe_overflow - a CQE overflowed + * + * @ctx: pointer to a ring context structure + * @user_data: user data associated with the request + * @res: CQE result + * @cflags: CQE flags 
+ * @ocqe: pointer to the overflow cqe (if available) + * + */ +TRACE_EVENT(io_uring_cqe_overflow, + + TP_PROTO(void *ctx, unsigned long long user_data, s32 res, u32 cflags, + void *ocqe), + + TP_ARGS(ctx, user_data, res, cflags, ocqe), + + TP_STRUCT__entry ( + __field( void *, ctx ) + __field( unsigned long long, user_data ) + __field( s32, res ) + __field( u32, cflags ) + __field( void *, ocqe ) + ), + + TP_fast_assign( + __entry->ctx =3D ctx; + __entry->user_data =3D user_data; + __entry->res =3D res; + __entry->cflags =3D cflags; + __entry->ocqe =3D ocqe; + ), + + TP_printk("ring %p, user_data 0x%llx, res %d, flags %x, " + "overflow_cqe %p", + __entry->ctx, __entry->user_data, __entry->res, + __entry->cflags, __entry->ocqe) +); + #endif /* _TRACE_IO_URING_H */ =20 /* This part must be outside protection */ --=20 2.30.2 From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CE0A6C433EF for ; Thu, 21 Apr 2022 09:17:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387507AbiDUJUQ (ORCPT ); Thu, 21 Apr 2022 05:20:16 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57724 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387448AbiDUJT6 (ORCPT ); Thu, 21 Apr 2022 05:19:58 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CB1DD11A09 for ; Thu, 21 Apr 2022 02:17:08 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 23KL3Pmd007320 for ; Thu, 21 Apr 2022 02:17:08 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : 
in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=DX6ytzEKTL5duvmKH/e/++KuZNvGuRCr7bx9oCW9eMA=; b=od7P82sRevfiZJOcn5YV5zeOVrsGoTp6XSZMp1F7jpFZ36U7oc4jqbq/UBEgOQV7gsE/ qfEzophPeVsnMQxY+zeuAHqk7eLPKY77JMfu7cArYmfWUCGpGTWcczfmSZKvx/SYUpey aj8JQZM7/Tbxv3GeVxnYaVpVN5Mq/en49Ec= Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3fjsrju40q-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:17:07 -0700 Received: from twshared4937.07.ash9.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:17:06 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id EB3EF7CA75FC; Thu, 21 Apr 2022 02:14:01 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 2/6] io_uring: trace cqe overflows Date: Thu, 21 Apr 2022 02:13:41 -0700 Message-ID: <20220421091345.2115755-3-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: SykKKcqlWtEzpvBOzPxH7_MBc3nPaY-n X-Proofpoint-GUID: SykKKcqlWtEzpvBOzPxH7_MBc3nPaY-n X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Trace cqe overflows in io_uring. Print ocqe before the check, so if it is NULL it indicates that it has been dropped. 
Signed-off-by: Dylan Yudaken --- fs/io_uring.c | 1 + 1 file changed, 1 insertion(+) diff --git a/fs/io_uring.c b/fs/io_uring.c index 7e1d5243bbbc..d654faffa486 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -2107,6 +2107,7 @@ static bool io_cqring_event_overflow(struct io_ring_c= tx *ctx, u64 user_data, struct io_overflow_cqe *ocqe; =20 ocqe =3D kmalloc(sizeof(*ocqe), GFP_ATOMIC | __GFP_ACCOUNT); + trace_io_uring_cqe_overflow(ctx, user_data, res, cflags, ocqe); if (!ocqe) { /* * If we're in ring overflow flush mode, or in task cancel mode, --=20 2.30.2 From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3A4C1C433EF for ; Thu, 21 Apr 2022 09:17:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387456AbiDUJT7 (ORCPT ); Thu, 21 Apr 2022 05:19:59 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57128 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387542AbiDUJTt (ORCPT ); Thu, 21 Apr 2022 05:19:49 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D1645220C4 for ; Thu, 21 Apr 2022 02:16:59 -0700 (PDT) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.16.1.2/8.16.1.2) with ESMTP id 23L7Ai08007753 for ; Thu, 21 Apr 2022 02:16:59 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=OvpskX5wjD7culEON5XL98GA5r5k8R6m/m81IDYjyjQ=; b=NT/b0dr9o2DbaGRdOCamlPlQidzL9kWJUAJavXx5s/MN1DSTiWs4WVcCoSZJDSsZEd/v 4WFYyfX7eSp4EjGkGalTRRifctCpItLVQrtRMCbYoMISlhL+LXU50yFMvhhlIcKqp7xo 
YxhsEnOS/rtiBRr/donuuYLWxNLhgRh/Ul4= Received: from maileast.thefacebook.com ([163.114.130.16]) by m0001303.ppops.net (PPS) with ESMTPS id 3fj7k3hetb-3 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:16:58 -0700 Received: from twshared6486.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::6) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:16:57 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id EFD7B7CA75FE; Thu, 21 Apr 2022 02:14:01 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 3/6] io_uring: rework io_uring_enter to simplify return value Date: Thu, 21 Apr 2022 02:13:42 -0700 Message-ID: <20220421091345.2115755-4-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: ArglgWF-VBNlUexkYoiazy8Gw0MXgf0S X-Proofpoint-GUID: ArglgWF-VBNlUexkYoiazy8Gw0MXgf0S X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" io_uring_enter returns the submitted count in preference to an error code. Some code paths do not need this check, so reorganise the code so that the check is done only where needed. This also prepares for returning error codes only in waiting scenarios. 
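For illustration only, the return-value rule described above can be modelled in plain userspace C. This is a sketch, not the kernel code: model_enter_ret() and its parameters are hypothetical names, and it assumes the wait step only contributes its result when the submit step produced 0.

```c
#include <errno.h>

/*
 * Userspace model (hypothetical, not kernel code) of how the reworked
 * io_uring_enter() picks its return value:
 *   - submit_ret: result of the submit step (count submitted, or 0)
 *   - waited:     non-zero if the wait/getevents step ran at all
 *   - wait_ret:   result of the wait step (0 or a negative errno)
 * A non-zero submit result always wins; the wait result is used only
 * when the submit step left the return value at 0.
 */
static long model_enter_ret(long submit_ret, int waited, long wait_ret)
{
	long ret = submit_ret;

	if (waited && !ret)
		ret = wait_ret;

	return ret;
}
```

With this model, model_enter_ret(3, 1, -EINTR) yields 3, while model_enter_ret(0, 1, -EINTR) yields -EINTR, which matches the "count submitted first, error otherwise" behaviour the message describes.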
Signed-off-by: Dylan Yudaken --- fs/io_uring.c | 35 +++++++++++++++++++++-------------- 1 file changed, 21 insertions(+), 14 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index d654faffa486..1837b3afa47f 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -10843,7 +10843,6 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u= 32, to_submit, size_t, argsz) { struct io_ring_ctx *ctx; - int submitted =3D 0; struct fd f; long ret; =20 @@ -10906,15 +10905,15 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd,= u32, to_submit, if (ret) goto out; } - submitted =3D to_submit; + ret =3D to_submit; } else if (to_submit) { ret =3D io_uring_add_tctx_node(ctx); if (unlikely(ret)) goto out; =20 mutex_lock(&ctx->uring_lock); - submitted =3D io_submit_sqes(ctx, to_submit); - if (submitted !=3D to_submit) { + ret =3D io_submit_sqes(ctx, to_submit); + if (ret !=3D to_submit) { mutex_unlock(&ctx->uring_lock); goto out; } @@ -10923,6 +10922,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u= 32, to_submit, mutex_unlock(&ctx->uring_lock); } if (flags & IORING_ENTER_GETEVENTS) { + int ret2; if (ctx->syscall_iopoll) { /* * We disallow the app entering submit/complete with @@ -10932,22 +10932,29 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd,= u32, to_submit, */ mutex_lock(&ctx->uring_lock); iopoll_locked: - ret =3D io_validate_ext_arg(flags, argp, argsz); - if (likely(!ret)) { - min_complete =3D min(min_complete, ctx->cq_entries); - ret =3D io_iopoll_check(ctx, min_complete); + ret2 =3D io_validate_ext_arg(flags, argp, argsz); + if (likely(!ret2)) { + min_complete =3D min(min_complete, + ctx->cq_entries); + ret2 =3D io_iopoll_check(ctx, min_complete); } mutex_unlock(&ctx->uring_lock); } else { const sigset_t __user *sig; struct __kernel_timespec __user *ts; =20 - ret =3D io_get_ext_arg(flags, argp, &argsz, &ts, &sig); - if (unlikely(ret)) - goto out; - min_complete =3D min(min_complete, ctx->cq_entries); - ret =3D io_cqring_wait(ctx, min_complete, sig, argsz, ts); + 
ret2 =3D io_get_ext_arg(flags, argp, &argsz, &ts, &sig); + if (likely(!ret2)) { + min_complete =3D min(min_complete, + ctx->cq_entries); + ret2 =3D io_cqring_wait(ctx, min_complete, sig, + argsz, ts); + } } + + if (!ret) + ret =3D ret2; + } =20 out: @@ -10955,7 +10962,7 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, u= 32, to_submit, out_fput: if (!(flags & IORING_ENTER_REGISTERED_RING)) fdput(f); - return submitted ? submitted : ret; + return ret; } =20 #ifdef CONFIG_PROC_FS --=20 2.30.2 From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15166C433FE for ; Thu, 21 Apr 2022 09:17:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387441AbiDUJUD (ORCPT ); Thu, 21 Apr 2022 05:20:03 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57092 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387465AbiDUJTt (ORCPT ); Thu, 21 Apr 2022 05:19:49 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9715F2898A for ; Thu, 21 Apr 2022 02:16:56 -0700 (PDT) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 23L6iga7001913 for ; Thu, 21 Apr 2022 02:16:55 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=crGBemUctK2aDiXsGbr+Y1XDXp3SStzaORv3NA8oNCo=; b=L2RAueuJTX4ZLpLHMg2Gr/69UroX8HuWGqK6brm8b02sB4l0xEguqeO1B5tdnmMpQxSp 47U3Zwozjj0wNPXLbCHnyVKGQ+BQeh7IIbMR/GhnalBA5RO2xynP/ACz41HRwm0dveQJ TGk6zE1sn9d+cS+EJQwOQILj9GOxH/NdJtA= Received: from 
mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3fhub7eq2v-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:16:55 -0700 Received: from snc-exhub201.TheFacebook.com (2620:10d:c085:21d::7) by snc-exhub204.TheFacebook.com (2620:10d:c085:21d::4) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:16:54 -0700 Received: from twshared4937.07.ash9.facebook.com (2620:10d:c085:208::f) by mail.thefacebook.com (2620:10d:c085:21d::7) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:16:54 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id 09E967CA7600; Thu, 21 Apr 2022 02:14:02 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 4/6] io_uring: use constants for cq_overflow bitfield Date: Thu, 21 Apr 2022 02:13:43 -0700 Message-ID: <20220421091345.2115755-5-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-GUID: 7FUjQCvhspEopY8cC78AN458oIziRsfq X-Proofpoint-ORIG-GUID: 7FUjQCvhspEopY8cC78AN458oIziRsfq X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Prepare to use this bitfield for more flags by using constants instead of the magic value 0. Signed-off-by: Dylan Yudaken --- fs/io_uring.c | 22 ++++++++++++++-------- 1 file changed, 14 insertions(+), 8 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 1837b3afa47f..db878c114e16 100644 --- a/fs/io_uring.c 
+++ b/fs/io_uring.c @@ -431,7 +431,7 @@ struct io_ring_ctx { struct wait_queue_head sqo_sq_wait; struct list_head sqd_list; =20 - unsigned long check_cq_overflow; + unsigned long check_cq; =20 struct { /* @@ -903,6 +903,10 @@ struct io_cqe { }; }; =20 +enum { + IO_CHECK_CQ_OVERFLOW_BIT, +}; + /* * NOTE! Each of the iocb union members has the file pointer * as the first entry in their struct definition. So you can @@ -2024,7 +2028,7 @@ static bool __io_cqring_overflow_flush(struct io_ring= _ctx *ctx, bool force) =20 all_flushed =3D list_empty(&ctx->cq_overflow_list); if (all_flushed) { - clear_bit(0, &ctx->check_cq_overflow); + clear_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq); WRITE_ONCE(ctx->rings->sq_flags, ctx->rings->sq_flags & ~IORING_SQ_CQ_OVERFLOW); } @@ -2040,7 +2044,7 @@ static bool io_cqring_overflow_flush(struct io_ring_c= tx *ctx) { bool ret =3D true; =20 - if (test_bit(0, &ctx->check_cq_overflow)) { + if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) { /* iopoll syncs against uring_lock, not completion_lock */ if (ctx->flags & IORING_SETUP_IOPOLL) mutex_lock(&ctx->uring_lock); @@ -2118,7 +2122,7 @@ static bool io_cqring_event_overflow(struct io_ring_c= tx *ctx, u64 user_data, return false; } if (list_empty(&ctx->cq_overflow_list)) { - set_bit(0, &ctx->check_cq_overflow); + set_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq); WRITE_ONCE(ctx->rings->sq_flags, ctx->rings->sq_flags | IORING_SQ_CQ_OVERFLOW); =20 @@ -2961,7 +2965,7 @@ static int io_iopoll_check(struct io_ring_ctx *ctx, l= ong min) * If we do, we can potentially be spinning for commands that * already triggered a CQE (eg in error). 
*/ - if (test_bit(0, &ctx->check_cq_overflow)) + if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) __io_cqring_overflow_flush(ctx, false); if (io_cqring_events(ctx)) return 0; @@ -8271,7 +8275,8 @@ static int io_wake_function(struct wait_queue_entry *= curr, unsigned int mode, * Cannot safely flush overflowed CQEs from here, ensure we wake up * the task, and the next invocation will do it. */ - if (io_should_wake(iowq) || test_bit(0, &iowq->ctx->check_cq_overflow)) + if (io_should_wake(iowq) || + test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &iowq->ctx->check_cq)) return autoremove_wake_function(curr, mode, wake_flags, key); return -1; } @@ -8299,7 +8304,7 @@ static inline int io_cqring_wait_schedule(struct io_r= ing_ctx *ctx, if (ret || io_should_wake(iowq)) return ret; /* let the caller flush overflows, retry */ - if (test_bit(0, &ctx->check_cq_overflow)) + if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) return 1; =20 if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS)) @@ -10094,7 +10099,8 @@ static __poll_t io_uring_poll(struct file *file, po= ll_table *wait) * Users may get EPOLLIN meanwhile seeing nothing in cqring, this * pushs them to do the flush. 
*/ - if (io_cqring_events(ctx) || test_bit(0, &ctx->check_cq_overflow)) + if (io_cqring_events(ctx) || + test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) mask |=3D EPOLLIN | EPOLLRDNORM; =20 return mask; --=20 2.30.2 From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 786DAC433EF for ; Thu, 21 Apr 2022 09:17:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387531AbiDUJU0 (ORCPT ); Thu, 21 Apr 2022 05:20:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57786 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387432AbiDUJT7 (ORCPT ); Thu, 21 Apr 2022 05:19:59 -0400 Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DAB8F14084 for ; Thu, 21 Apr 2022 02:17:10 -0700 (PDT) Received: from pps.filterd (m0001303.ppops.net [127.0.0.1]) by m0001303.ppops.net (8.16.1.2/8.16.1.2) with ESMTP id 23L7Ai0A007753 for ; Thu, 21 Apr 2022 02:17:10 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=dOWQsXWrlscVSXwcUY7oiVZY2tYgB27g3OF+R0KAHOY=; b=WEvYnH9zQRwSv6ZKHLUbnYlZjK+jq2oQfTyBOKE6CgPCYnKWiiGnGeLZpC6UtreFRoj0 Q+c7aTTJ3zzmQLFXb6bIgdE0A6EYY7U1EOFYqn3DqBPox+XlSu3knR63b9+wio3BNzt/ W2Both0ib1XKM/HIPKMxxwmHfI88ScbQ51k= Received: from maileast.thefacebook.com ([163.114.130.16]) by m0001303.ppops.net (PPS) with ESMTPS id 3fj7k3heu4-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:17:09 -0700 Received: from twshared6486.05.ash9.facebook.com (2620:10d:c0a8:1b::d) by 
mail.thefacebook.com (2620:10d:c0a8:83::5) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:17:09 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id 152E37CA7602; Thu, 21 Apr 2022 02:14:02 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 5/6] io_uring: return an error when cqe is dropped Date: Thu, 21 Apr 2022 02:13:44 -0700 Message-ID: <20220421091345.2115755-6-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: GQIaF4f8cj-vLgfg9JIArUfSqgNLG8Zm X-Proofpoint-GUID: GQIaF4f8cj-vLgfg9JIArUfSqgNLG8Zm X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" Right now io_uring will not actively inform userspace if a CQE is dropped. This is extremely rare, requiring a CQ ring overflow as well as a GFP_ATOMIC kmalloc failure. However, a dropped CQE could, for example, leave an application in an undefined state, possibly waiting for a CQE that never arrives. Return an error code (EBADR) in these cases. Since this is expected to be incredibly rare, avoid affecting the hot code paths as much as possible: the error is returned lazily, and only when there are no other CQEs available. Once the error is returned, reset the error condition, assuming the user is either ok with it or will clean up appropriately. 
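As a rough sketch of the lazy reporting described above, the following userspace model shows the intended ordering: pending CQEs are reported first, a recorded drop is reported as -EBADR only when nothing else is available, and the condition is cleared once it has been reported. All names here (struct model_ring, model_drop_cqe(), model_wait()) are hypothetical and exist only for this illustration; the model folds the kernel's separate "report" and "clear" steps into one function and returns 0 where the real wait path would sleep.

```c
#include <errno.h>

#ifndef EBADR
#define EBADR 53	/* Linux value, provided only for non-Linux builds */
#endif

/* Bit numbers in check_cq, loosely mirroring the enum in the patch. */
enum {
	MODEL_CQ_OVERFLOW_BIT,
	MODEL_CQ_DROPPED_BIT,
};

/* Hypothetical stand-in for the parts of the ring context used here. */
struct model_ring {
	unsigned long check_cq;	/* flag bits, as in ctx->check_cq */
	unsigned int cq_ready;	/* CQEs currently visible to the caller */
};

/* Record that a CQE could not be queued at all (allocation failed). */
static void model_drop_cqe(struct model_ring *r)
{
	r->check_cq |= 1UL << MODEL_CQ_DROPPED_BIT;
}

/*
 * Wait-side check: available CQEs take priority. Only when none are
 * pending is a recorded drop reported, and it is reported exactly once.
 */
static int model_wait(struct model_ring *r)
{
	if (r->cq_ready)
		return 0;

	if (r->check_cq & (1UL << MODEL_CQ_DROPPED_BIT)) {
		r->check_cq &= ~(1UL << MODEL_CQ_DROPPED_BIT);
		return -EBADR;
	}

	return 0;	/* the real code would sleep waiting for CQEs here */
}
```

In this model a caller that sees -EBADR knows at least one completion was lost and can decide whether to resynchronise its own request tracking; a second call returns 0 again because the condition has been reset.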
Signed-off-by: Dylan Yudaken --- fs/io_uring.c | 32 ++++++++++++++++++++++++++++---- 1 file changed, 28 insertions(+), 4 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index db878c114e16..e46dc67c917c 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -905,6 +905,7 @@ struct io_cqe { =20 enum { IO_CHECK_CQ_OVERFLOW_BIT, + IO_CHECK_CQ_DROPPED_BIT, }; =20 /* @@ -2119,6 +2120,7 @@ static bool io_cqring_event_overflow(struct io_ring_c= tx *ctx, u64 user_data, * on the floor. */ io_account_cq_overflow(ctx); + set_bit(IO_CHECK_CQ_DROPPED_BIT, &ctx->check_cq); return false; } if (list_empty(&ctx->cq_overflow_list)) { @@ -2959,16 +2961,26 @@ static int io_iopoll_check(struct io_ring_ctx *ctx,= long min) { unsigned int nr_events =3D 0; int ret =3D 0; + unsigned long check_cq; =20 /* * Don't enter poll loop if we already have events pending. * If we do, we can potentially be spinning for commands that * already triggered a CQE (eg in error). */ - if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) + check_cq =3D READ_ONCE(ctx->check_cq); + if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)) __io_cqring_overflow_flush(ctx, false); if (io_cqring_events(ctx)) return 0; + + /* + * Similarly do not spin if we have not informed the user of any + * dropped CQE. 
+ */ + if (unlikely(check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))) + return -EBADR; + do { /* * If a submit got punted to a workqueue, we can have the @@ -8298,15 +8310,18 @@ static inline int io_cqring_wait_schedule(struct io= _ring_ctx *ctx, ktime_t timeout) { int ret; + unsigned long check_cq; =20 /* make sure we run task_work before checking for signals */ ret =3D io_run_task_work_sig(); if (ret || io_should_wake(iowq)) return ret; + check_cq =3D READ_ONCE(ctx->check_cq); /* let the caller flush overflows, retry */ - if (test_bit(IO_CHECK_CQ_OVERFLOW_BIT, &ctx->check_cq)) + if (check_cq & BIT(IO_CHECK_CQ_OVERFLOW_BIT)) return 1; - + if (unlikely(check_cq & BIT(IO_CHECK_CQ_DROPPED_BIT))) + return -EBADR; if (!schedule_hrtimeout(&timeout, HRTIMER_MODE_ABS)) return -ETIME; return 1; @@ -10958,9 +10973,18 @@ SYSCALL_DEFINE6(io_uring_enter, unsigned int, fd, = u32, to_submit, } } =20 - if (!ret) + if (!ret) { ret =3D ret2; =20 + /* + * EBADR indicates that one or more CQE were dropped. + * Once the user has been informed we can clear the bit + * as they are obviously ok with those drops. 
+ */ + if (unlikely(ret2 =3D=3D -EBADR)) + clear_bit(IO_CHECK_CQ_DROPPED_BIT, + &ctx->check_cq); + } } =20 out: --=20 2.30.2 From nobody Sun May 10 23:26:37 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4832EC433EF for ; Thu, 21 Apr 2022 09:17:26 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1387487AbiDUJUM (ORCPT ); Thu, 21 Apr 2022 05:20:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56148 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1387212AbiDUJTv (ORCPT ); Thu, 21 Apr 2022 05:19:51 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 258C023BE0 for ; Thu, 21 Apr 2022 02:17:03 -0700 (PDT) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.16.1.2/8.16.1.2) with ESMTP id 23L73eUt022788 for ; Thu, 21 Apr 2022 02:17:03 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=fb.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=facebook; bh=o+BEJzHk53Z2SWPQnNBmJJX4g5pdl6FD6EzyM6ejeDU=; b=FCTyzZzxkRgUdu2CKkT1lqxM385q8rqANFfI2jCSJy1xA5kx568x83wPiHoDW2uUdRu8 e0pNL4XY/uIQIany6u78wOXzqJxoebmgu/aCCiEnIPUR9I0OoDacclvMDyhXPalqb+c4 iUp2ht/xgy7XqbQc4pshCQZ1y6Sdva/Caf8= Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3fjhgxy3q2-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Thu, 21 Apr 2022 02:17:02 -0700 Received: from twshared41237.03.ash8.facebook.com (2620:10d:c0a8:1b::d) by mail.thefacebook.com (2620:10d:c0a8:83::4) with Microsoft SMTP Server (version=TLS1_2, 
cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Thu, 21 Apr 2022 02:17:01 -0700 Received: by devbig039.lla1.facebook.com (Postfix, from userid 572232) id 19B7F7CA7604; Thu, 21 Apr 2022 02:14:02 -0700 (PDT) From: Dylan Yudaken To: CC: , , , , Dylan Yudaken Subject: [PATCH 6/6] io_uring: allow NOP opcode in IOPOLL mode Date: Thu, 21 Apr 2022 02:13:45 -0700 Message-ID: <20220421091345.2115755-7-dylany@fb.com> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20220421091345.2115755-1-dylany@fb.com> References: <20220421091345.2115755-1-dylany@fb.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: GKIBp6dQTQwht9uJfIwKe6ctwUQmaijC X-Proofpoint-GUID: GKIBp6dQTQwht9uJfIwKe6ctwUQmaijC X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.205,Aquarius:18.0.858,Hydra:6.0.486,FMLib:17.11.64.514 definitions=2022-04-20_06,2022-04-20_01,2022-02-23_01 Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" This is useful for tests so that IOPOLL can be tested without requiring files. NOP is acceptable in IOPOLL as it always completes immediately. Signed-off-by: Dylan Yudaken --- fs/io_uring.c | 5 ----- 1 file changed, 5 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index e46dc67c917c..a4e42ba708b4 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -4526,11 +4526,6 @@ static int io_splice(struct io_kiocb *req, unsigned = int issue_flags) */ static int io_nop(struct io_kiocb *req, unsigned int issue_flags) { - struct io_ring_ctx *ctx =3D req->ctx; - - if (unlikely(ctx->flags & IORING_SETUP_IOPOLL)) - return -EINVAL; - __io_req_complete(req, issue_flags, 0, 0); return 0; } --=20 2.30.2