From nobody Mon Jun 22 01:38:03 2026
From: Dylan Yudaken
To: Jens Axboe, Pavel Begunkov
CC: Dylan Yudaken
Subject: [PATCH v2 1/4] io_uring: remove duplicated calls to io_kiocb_ppos
Date: Mon, 21 Feb 2022 06:16:46 -0800
Message-ID: <20220221141649.624233-2-dylany@fb.com>
In-Reply-To: <20220221141649.624233-1-dylany@fb.com>
References: <20220221141649.624233-1-dylany@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

io_kiocb_ppos is called in both branches, and it seems that the compiler
does not fuse this. Fusing removes a few bytes from loop_rw_iter.

Before:
$ nm -S fs/io_uring.o | grep loop_rw_iter
0000000000002430 0000000000000124 t loop_rw_iter

After:
$ nm -S fs/io_uring.o | grep loop_rw_iter
0000000000002430 000000000000010d t loop_rw_iter

Signed-off-by: Dylan Yudaken
---
 fs/io_uring.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 77b9c7e4793b..1f9b4466c269 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3400,6 +3400,7 @@ static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
 	struct kiocb *kiocb = &req->rw.kiocb;
 	struct file *file = req->file;
 	ssize_t ret = 0;
+	loff_t *ppos;
 
 	/*
 	 * Don't support polled IO through this interface, and we can't
@@ -3412,6 +3413,8 @@ static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
 	    !(kiocb->ki_filp->f_flags & O_NONBLOCK))
 		return -EAGAIN;
 
+	ppos = io_kiocb_ppos(kiocb);
+
 	while (iov_iter_count(iter)) {
 		struct iovec iovec;
 		ssize_t nr;
@@ -3425,10 +3428,10 @@ static ssize_t loop_rw_iter(int rw, struct io_kiocb *req, struct iov_iter *iter)
 
 		if (rw == READ) {
 			nr = file->f_op->read(file, iovec.iov_base,
-					      iovec.iov_len, io_kiocb_ppos(kiocb));
+					      iovec.iov_len, ppos);
 		} else {
 			nr = file->f_op->write(file, iovec.iov_base,
-					       iovec.iov_len, io_kiocb_ppos(kiocb));
+					       iovec.iov_len, ppos);
 		}
 
 		if (nr < 0) {
-- 
2.30.2

From nobody Mon Jun 22 01:38:03 2026
From: Dylan Yudaken
To: Jens Axboe, Pavel Begunkov
CC: Dylan Yudaken
Subject: [PATCH v2 2/4] io_uring: update kiocb->ki_pos at execution time
Date: Mon, 21 Feb 2022 06:16:47 -0800
Message-ID: <20220221141649.624233-3-dylany@fb.com>
In-Reply-To: <20220221141649.624233-1-dylany@fb.com>
References: <20220221141649.624233-1-dylany@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Update kiocb->ki_pos at execution time rather than in io_prep_rw().
io_prep_rw() happens before the job is enqueued to a worker and so the
offset might be read multiple times before being executed once.

This ensures that the file position in a set of _linked_ SQEs will only
be obtained after earlier SQEs have completed, and so will include their
incremented file position.

Signed-off-by: Dylan Yudaken
---
 fs/io_uring.c | 25 +++++++++++++++++--------
 1 file changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 1f9b4466c269..50b93ff2ee12 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3000,14 +3000,6 @@ static int io_prep_rw(struct io_kiocb *req, const struct io_uring_sqe *sqe)
 		req->flags |= io_file_get_flags(file) << REQ_F_SUPPORT_NOWAIT_BIT;
 
 	kiocb->ki_pos = READ_ONCE(sqe->off);
-	if (kiocb->ki_pos == -1) {
-		if (!(file->f_mode & FMODE_STREAM)) {
-			req->flags |= REQ_F_CUR_POS;
-			kiocb->ki_pos = file->f_pos;
-		} else {
-			kiocb->ki_pos = 0;
-		}
-	}
 	kiocb->ki_flags = iocb_flags(file);
 	ret = kiocb_set_rw_flags(kiocb, READ_ONCE(sqe->rw_flags));
 	if (unlikely(ret))
@@ -3074,6 +3066,19 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 	}
 }
 
+static inline void
+io_kiocb_update_pos(struct io_kiocb *req, struct kiocb *kiocb)
+{
+	if (kiocb->ki_pos == -1) {
+		if (!(req->file->f_mode & FMODE_STREAM)) {
+			req->flags |= REQ_F_CUR_POS;
+			kiocb->ki_pos = req->file->f_pos;
+		} else {
+			kiocb->ki_pos = 0;
+		}
+	}
+}
+
 static void kiocb_done(struct io_kiocb *req, ssize_t ret,
 		       unsigned int issue_flags)
 {
@@ -3662,6 +3667,8 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
+	io_kiocb_update_pos(req, kiocb);
+
 	ret = rw_verify_area(READ, req->file, io_kiocb_ppos(kiocb), req->result);
 	if (unlikely(ret)) {
 		kfree(iovec);
@@ -3791,6 +3798,8 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
+	io_kiocb_update_pos(req, kiocb);
+
 	ret = rw_verify_area(WRITE, req->file, io_kiocb_ppos(kiocb), req->result);
 	if (unlikely(ret))
 		goto out_free;
-- 
2.30.2

From nobody Mon Jun 22 01:38:03 2026
From: Dylan Yudaken
To: Jens Axboe, Pavel Begunkov
CC: Dylan Yudaken
Subject: [PATCH v2 3/4] io_uring: do not recalculate ppos unnecessarily
Date: Mon, 21 Feb 2022 06:16:48 -0800
Message-ID: <20220221141649.624233-4-dylany@fb.com>
In-Reply-To: <20220221141649.624233-1-dylany@fb.com>
References: <20220221141649.624233-1-dylany@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

There is a slight optimisation to be had by calculating the correct pos
pointer inside io_kiocb_update_pos and then using that later.

It seems code size drops by a bit:
000000000000a1b0 0000000000000400 t io_read
000000000000a5b0 0000000000000319 t io_write

vs
000000000000a1b0 00000000000003f6 t io_read
000000000000a5b0 0000000000000310 t io_write

Signed-off-by: Dylan Yudaken
---
 fs/io_uring.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index 50b93ff2ee12..abd8c739988e 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3066,17 +3066,21 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 	}
 }
 
-static inline void
+static inline loff_t*
 io_kiocb_update_pos(struct io_kiocb *req, struct kiocb *kiocb)
 {
+	bool is_stream = req->file->f_mode & FMODE_STREAM;
 	if (kiocb->ki_pos == -1) {
-		if (!(req->file->f_mode & FMODE_STREAM)) {
+		if (!is_stream) {
 			req->flags |= REQ_F_CUR_POS;
 			kiocb->ki_pos = req->file->f_pos;
+			return &kiocb->ki_pos;
 		} else {
 			kiocb->ki_pos = 0;
+			return NULL;
 		}
 	}
+	return is_stream ? NULL : &kiocb->ki_pos;
 }
 
 static void kiocb_done(struct io_kiocb *req, ssize_t ret,
@@ -3637,6 +3641,7 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
 	struct io_async_rw *rw;
 	ssize_t ret, ret2;
+	loff_t *ppos;
 
 	if (!req_has_async_data(req)) {
 		ret = io_import_iovec(READ, req, &iovec, s, issue_flags);
@@ -3667,9 +3672,9 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
-	io_kiocb_update_pos(req, kiocb);
+	ppos = io_kiocb_update_pos(req, kiocb);
 
-	ret = rw_verify_area(READ, req->file, io_kiocb_ppos(kiocb), req->result);
+	ret = rw_verify_area(READ, req->file, ppos, req->result);
 	if (unlikely(ret)) {
 		kfree(iovec);
 		return ret;
@@ -3768,6 +3773,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 	struct kiocb *kiocb = &req->rw.kiocb;
 	bool force_nonblock = issue_flags & IO_URING_F_NONBLOCK;
 	ssize_t ret, ret2;
+	loff_t *ppos;
 
 	if (!req_has_async_data(req)) {
 		ret = io_import_iovec(WRITE, req, &iovec, s, issue_flags);
@@ -3798,9 +3804,9 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
-	io_kiocb_update_pos(req, kiocb);
+	ppos = io_kiocb_update_pos(req, kiocb);
 
-	ret = rw_verify_area(WRITE, req->file, io_kiocb_ppos(kiocb), req->result);
+	ret = rw_verify_area(WRITE, req->file, ppos, req->result);
 	if (unlikely(ret))
 		goto out_free;
 
-- 
2.30.2

From nobody Mon Jun 22 01:38:03 2026
From: Dylan Yudaken
To: Jens Axboe, Pavel Begunkov
CC: Dylan Yudaken
Subject: [PATCH v2 4/4] io_uring: pre-increment f_pos on rw
Date: Mon, 21 Feb 2022 06:16:49 -0800
Message-ID: <20220221141649.624233-5-dylany@fb.com>
In-Reply-To: <20220221141649.624233-1-dylany@fb.com>
References: <20220221141649.624233-1-dylany@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

In read/write ops, preincrement f_pos when no offset is specified, and
then attempt to fix up the position after IO completes if it completed
less than expected. This fixes the problem where multiple queued up IO
will all obtain the same f_pos, and so perform the same read/write.

This is still not as consistent as sync r/w, as it is able to advance
the file offset past the end of the file. It seems it would be quite a
performance hit to work around this limitation - such as by keeping
track of concurrent operations - and the downside does not seem to be
too problematic.

The attempt to fix up the f_pos afterwards will at least mean that in
situations where a single operation is run, the position will be
consistent.

Co-developed-by: Jens Axboe
Signed-off-by: Jens Axboe
Signed-off-by: Dylan Yudaken
---
 fs/io_uring.c | 81 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 68 insertions(+), 13 deletions(-)

diff --git a/fs/io_uring.c b/fs/io_uring.c
index abd8c739988e..a951d0754899 100644
--- a/fs/io_uring.c
+++ b/fs/io_uring.c
@@ -3066,21 +3066,71 @@ static inline void io_rw_done(struct kiocb *kiocb, ssize_t ret)
 	}
 }
 
-static inline loff_t*
-io_kiocb_update_pos(struct io_kiocb *req, struct kiocb *kiocb)
+static inline bool
+io_kiocb_update_pos(struct io_kiocb *req, struct kiocb *kiocb,
+		    loff_t **ppos, u64 expected, bool force_nonblock)
 {
 	bool is_stream = req->file->f_mode & FMODE_STREAM;
 	if (kiocb->ki_pos == -1) {
 		if (!is_stream) {
-			req->flags |= REQ_F_CUR_POS;
+			*ppos = &kiocb->ki_pos;
+			WARN_ON(req->flags & REQ_F_CUR_POS);
+			if (req->file->f_mode & FMODE_ATOMIC_POS) {
+				if (force_nonblock) {
+					if (!mutex_trylock(&req->file->f_pos_lock))
+						return true;
+				} else {
+					mutex_lock(&req->file->f_pos_lock);
+				}
+			}
 			kiocb->ki_pos = req->file->f_pos;
-			return &kiocb->ki_pos;
+			req->flags |= REQ_F_CUR_POS;
+			req->file->f_pos += expected;
+			if (req->file->f_mode & FMODE_ATOMIC_POS)
+				mutex_unlock(&req->file->f_pos_lock);
+			return false;
 		} else {
 			kiocb->ki_pos = 0;
-			return NULL;
+			*ppos = NULL;
+			return false;
 		}
 	}
-	return is_stream ? NULL : &kiocb->ki_pos;
+	*ppos = is_stream ? NULL : &kiocb->ki_pos;
+	return false;
+}
+
+static inline void
+io_kiocb_done_pos(struct io_kiocb *req, struct kiocb *kiocb, u64 actual)
+{
+	u64 expected;
+
+	if (likely(!(req->flags & REQ_F_CUR_POS)))
+		return;
+
+	expected = req->rw.len;
+	if (actual >= expected)
+		return;
+
+	/*
+	 * It's not definitely safe to lock here, and the assumption is,
+	 * that if we cannot lock the position that it will be changing,
+	 * and if it will be changing - then we can't update it anyway
+	 */
+	if (req->file->f_mode & FMODE_ATOMIC_POS
+		&& !mutex_trylock(&req->file->f_pos_lock))
+		return;
+
+	/*
+	 * now we want to move the pointer, but only if everything is consistent
+	 * with how we left it originally
+	 */
+	if (req->file->f_pos == kiocb->ki_pos + (expected - actual))
+		req->file->f_pos = kiocb->ki_pos;
+
+	/* else something else messed with f_pos and we can't do anything */
+
+	if (req->file->f_mode & FMODE_ATOMIC_POS)
+		mutex_unlock(&req->file->f_pos_lock);
 }
 
 static void kiocb_done(struct io_kiocb *req, ssize_t ret,
@@ -3096,8 +3146,7 @@ static void kiocb_done(struct io_kiocb *req, ssize_t ret,
 		ret += io->bytes_done;
 	}
 
-	if (req->flags & REQ_F_CUR_POS)
-		req->file->f_pos = req->rw.kiocb.ki_pos;
+	io_kiocb_done_pos(req, &req->rw.kiocb, ret >= 0 ? ret : 0);
 	if (ret >= 0 && (req->rw.kiocb.ki_complete == io_complete_rw))
 		__io_complete_rw(req, ret, issue_flags);
 	else
@@ -3662,21 +3711,23 @@ static int io_read(struct io_kiocb *req, unsigned int issue_flags)
 
 	if (force_nonblock) {
 		/* If the file doesn't support async, just async punt */
-		if (unlikely(!io_file_supports_nowait(req))) {
+		if (unlikely(!io_file_supports_nowait(req) ||
+			     io_kiocb_update_pos(req, kiocb, &ppos,
+						 req->rw.len, true))) {
 			ret = io_setup_async_rw(req, iovec, s, true);
 			return ret ?: -EAGAIN;
 		}
 		kiocb->ki_flags |= IOCB_NOWAIT;
 	} else {
+		io_kiocb_update_pos(req, kiocb, &ppos, req->rw.len, false);
 		/* Ensure we clear previously set non-block flag */
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
-	ppos = io_kiocb_update_pos(req, kiocb);
-
 	ret = rw_verify_area(READ, req->file, ppos, req->result);
 	if (unlikely(ret)) {
 		kfree(iovec);
+		io_kiocb_done_pos(req, kiocb, 0);
 		return ret;
 	}
 
@@ -3798,14 +3849,17 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 		    (req->flags & REQ_F_ISREG))
 			goto copy_iov;
 
+		/* if we cannot lock the file position then punt */
+		if (unlikely(io_kiocb_update_pos(req, kiocb, &ppos, req->rw.len, true)))
+			goto copy_iov;
+
 		kiocb->ki_flags |= IOCB_NOWAIT;
 	} else {
+		io_kiocb_update_pos(req, kiocb, &ppos, req->rw.len, false);
 		/* Ensure we clear previously set non-block flag */
 		kiocb->ki_flags &= ~IOCB_NOWAIT;
 	}
 
-	ppos = io_kiocb_update_pos(req, kiocb);
-
 	ret = rw_verify_area(WRITE, req->file, ppos, req->result);
 	if (unlikely(ret))
 		goto out_free;
@@ -3858,6 +3912,7 @@ static int io_write(struct io_kiocb *req, unsigned int issue_flags)
 		return ret ?: -EAGAIN;
 	}
 out_free:
+	io_kiocb_done_pos(req, kiocb, 0);
 	/* it's reportedly faster than delegating the null check to kfree() */
 	if (iovec)
 		kfree(iovec);
-- 
2.30.2
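
The reserve-then-fix-up scheme described in patch 4/4 can be modelled in plain userspace C. This is only a rough sketch under stated assumptions, not the kernel code: `struct file_model`, `reserve_pos()` and `fixup_pos()` are hypothetical names invented for illustration, the f_pos_lock handling is left out entirely, and the caller passes the reserved start offset explicitly instead of deriving it from kiocb->ki_pos.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared file position (f_pos in the patch). */
struct file_model {
	int64_t f_pos;
};

/*
 * Reserve a range at issue time: return the starting offset for this
 * request and advance the shared position by the expected length, so
 * that the next queued request starts after it.
 */
static int64_t reserve_pos(struct file_model *f, uint64_t expected)
{
	int64_t start;

	assert(f);
	start = f->f_pos;
	f->f_pos += (int64_t)expected;
	return start;
}

/*
 * Fix up after a short transfer: pull the shared position back to the
 * end of what was really transferred, but only if nobody has moved it
 * since the reservation. Returns 1 if the position was rewound, else 0.
 * (Assumption: no locking here; the patch uses f_pos_lock for this.)
 */
static int fixup_pos(struct file_model *f, int64_t start,
		     uint64_t expected, uint64_t actual)
{
	assert(f);
	if (actual >= expected)
		return 0;	/* full transfer, nothing to correct */
	if (f->f_pos != start + (int64_t)expected)
		return 0;	/* position changed underneath us, leave it */
	f->f_pos = start + (int64_t)actual;
	return 1;
}
```

With a single request the short transfer is corrected. With two queued requests, the second reservation has already moved the position, so a short first request leaves it untouched, which matches the caveat in the commit message that the offset can end up past the end of the file.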