From nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A56C2C88CB9 for ; Thu, 24 Aug 2023 14:46:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241758AbjHXOpg (ORCPT ); Thu, 24 Aug 2023 10:45:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36748 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241734AbjHXOpM (ORCPT ); Thu, 24 Aug 2023 10:45:12 -0400 Received: from out-23.mta1.migadu.com (out-23.mta1.migadu.com [IPv6:2001:41d0:203:375::17]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 67B2C1BC6 for ; Thu, 24 Aug 2023 07:45:09 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1692888307; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZPF+c1DJ7XmvhtLIn8XJOXXiqYrgAJHZCAQvWuMIHRQ=; b=PtBO8GhDb7ozEtU9kP2iVRjbUhhPDrVqhXJ12tWghz4MV8Z/vLY50fEpBZ7YsatVtjEQMW fQBcJ0VvKSXqaNK2AYY/nPNI2Ych8r8jnsvEavHhFq3vkhhF7eQc24BK8DJbw4dHLUbTm7 tR47+vDLlad1Vrwe7GOF2+A7+qMCfLs= From: chengming.zhou@linux.dev To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com Subject: [PATCH 1/6] blk-mq: account active requests when get driver tag Date: Thu, 24 Aug 2023 22:43:58 +0800 Message-ID: <20230824144403.2135739-2-chengming.zhou@linux.dev> In-Reply-To: 
<20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

Batched queue_rqs() currently can't be used on a shared tags queue,
since the accounting of active requests can't be done there.

Right now active requests are accounted only in blk_mq_get_driver_tag(),
which is not where the driver tag is actually acquired in the
none-elevator case. To support batched queue_rqs() on shared tags
queues, move the accounting of active requests to where the driver tag
is actually obtained:

1. none elevator: blk_mq_get_tags() and blk_mq_get_tag()
2. other elevators: __blk_mq_alloc_driver_tag()

This is clearer and matches the unaccounting side, which happens when we
put the driver tag. Another benefit is that the RQF_MQ_INFLIGHT trick,
which was used to avoid double accounting of the flush request, is no
longer needed: since we only account when we actually get the driver
tag, no request can be accounted twice. RQF_MQ_INFLIGHT will be removed
in the next patch.
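The conditional accounting wrappers this patch introduces can be illustrated with a small standalone sketch. It uses plain C11 atomics and made-up minimal struct names in place of the kernel types; it is a model of the idea, not the kernel code itself:

```c
#include <stdatomic.h>

/* Simplified stand-ins for the kernel types and flag. */
#define BLK_MQ_F_TAG_QUEUE_SHARED (1u << 1)

struct hctx {
	unsigned int flags;
	atomic_int nr_active;
};

static void __add_active_requests(struct hctx *hctx, int val)
{
	atomic_fetch_add(&hctx->nr_active, val);
}

static void __sub_active_requests(struct hctx *hctx, int val)
{
	atomic_fetch_sub(&hctx->nr_active, val);
}

/*
 * The wrappers only touch the counter on shared tags queues, so callers
 * can invoke them unconditionally at tag allocation / free time and the
 * shared-tags check lives in exactly one place.
 */
static void add_active_requests(struct hctx *hctx, int val)
{
	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
		__add_active_requests(hctx, val);
}

static void sub_active_requests(struct hctx *hctx, int val)
{
	if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)
		__sub_active_requests(hctx, val);
}
```

Because the check is folded into the wrappers, the allocation paths can account a whole batch with one call, which is what makes batched queue_rqs() workable on shared tags queues.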
Signed-off-by: Chengming Zhou --- block/blk-mq.c | 33 +++++++++++------------------ block/blk-mq.h | 56 ++++++++++++++++++++++++++++++++++++-------------- 2 files changed, 53 insertions(+), 36 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index ec922c6bccbe..bcdb750ef575 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -426,6 +426,8 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data = *data) rq_list_add(data->cached_rq, rq); nr++; } + if (!(data->rq_flags & RQF_SCHED_TAGS)) + blk_mq_add_active_requests(data->hctx, nr); /* caller already holds a reference, add for remainder */ percpu_ref_get_many(&data->q->q_usage_counter, nr - 1); data->nr_tags -=3D nr; @@ -510,6 +512,8 @@ static struct request *__blk_mq_alloc_requests(struct b= lk_mq_alloc_data *data) goto retry; } =20 + if (!(data->rq_flags & RQF_SCHED_TAGS)) + blk_mq_inc_active_requests(data->hctx); rq =3D blk_mq_rq_ctx_init(data, blk_mq_tags_from_data(data), tag); blk_mq_rq_time_init(rq, alloc_time_ns); return rq; @@ -669,6 +673,8 @@ struct request *blk_mq_alloc_request_hctx(struct reques= t_queue *q, tag =3D blk_mq_get_tag(&data); if (tag =3D=3D BLK_MQ_NO_TAG) goto out_queue_exit; + if (!(data.rq_flags & RQF_SCHED_TAGS)) + blk_mq_inc_active_requests(data.hctx); rq =3D blk_mq_rq_ctx_init(&data, blk_mq_tags_from_data(&data), tag); blk_mq_rq_time_init(rq, alloc_time_ns); rq->__data_len =3D 0; @@ -708,11 +714,10 @@ static void __blk_mq_free_request(struct request *rq) blk_pm_mark_last_busy(rq); rq->mq_hctx =3D NULL; =20 - if (rq->rq_flags & RQF_MQ_INFLIGHT) - __blk_mq_dec_active_requests(hctx); - - if (rq->tag !=3D BLK_MQ_NO_TAG) + if (rq->tag !=3D BLK_MQ_NO_TAG) { + blk_mq_dec_active_requests(hctx); blk_mq_put_tag(hctx->tags, ctx, rq->tag); + } if (sched_tag !=3D BLK_MQ_NO_TAG) blk_mq_put_tag(hctx->sched_tags, ctx, sched_tag); blk_mq_sched_restart(hctx); @@ -1065,8 +1070,7 @@ static inline void blk_mq_flush_tag_batch(struct blk_= mq_hw_ctx *hctx, * All requests should have been marked as 
RQF_MQ_INFLIGHT, so * update hctx->nr_active in batch */ - if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) - __blk_mq_sub_active_requests(hctx, nr_tags); + blk_mq_sub_active_requests(hctx, nr_tags); =20 blk_mq_put_tags(hctx->tags, tag_array, nr_tags); percpu_ref_put_many(&q->q_usage_counter, nr_tags); @@ -1748,7 +1752,7 @@ struct request *blk_mq_dequeue_from_ctx(struct blk_mq= _hw_ctx *hctx, return data.rq; } =20 -static bool __blk_mq_alloc_driver_tag(struct request *rq) +bool __blk_mq_alloc_driver_tag(struct request *rq) { struct sbitmap_queue *bt =3D &rq->mq_hctx->tags->bitmap_tags; unsigned int tag_offset =3D rq->mq_hctx->tags->nr_reserved_tags; @@ -1769,20 +1773,7 @@ static bool __blk_mq_alloc_driver_tag(struct request= *rq) return false; =20 rq->tag =3D tag + tag_offset; - return true; -} - -bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *r= q) -{ - if (rq->tag =3D=3D BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq)) - return false; - - if ((hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) && - !(rq->rq_flags & RQF_MQ_INFLIGHT)) { - rq->rq_flags |=3D RQF_MQ_INFLIGHT; - __blk_mq_inc_active_requests(hctx); - } - hctx->tags->rqs[rq->tag] =3D rq; + blk_mq_inc_active_requests(rq->mq_hctx); return true; } =20 diff --git a/block/blk-mq.h b/block/blk-mq.h index 1743857e0b01..560a76df290a 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -271,12 +271,18 @@ static inline int blk_mq_get_rq_budget_token(struct r= equest *rq) return -1; } =20 -static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx) +static inline void __blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx, + int val) { if (blk_mq_is_shared_tags(hctx->flags)) - atomic_inc(&hctx->queue->nr_active_requests_shared_tags); + atomic_add(val, &hctx->queue->nr_active_requests_shared_tags); else - atomic_inc(&hctx->nr_active); + atomic_add(val, &hctx->nr_active); +} + +static inline void __blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx) +{ + 
__blk_mq_add_active_requests(hctx, 1); } =20 static inline void __blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx, @@ -293,6 +299,32 @@ static inline void __blk_mq_dec_active_requests(struct= blk_mq_hw_ctx *hctx) __blk_mq_sub_active_requests(hctx, 1); } =20 +static inline void blk_mq_add_active_requests(struct blk_mq_hw_ctx *hctx, + int val) +{ + if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) + __blk_mq_add_active_requests(hctx, val); +} + +static inline void blk_mq_inc_active_requests(struct blk_mq_hw_ctx *hctx) +{ + if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) + __blk_mq_inc_active_requests(hctx); +} + +static inline void blk_mq_sub_active_requests(struct blk_mq_hw_ctx *hctx, + int val) +{ + if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) + __blk_mq_sub_active_requests(hctx, val); +} + +static inline void blk_mq_dec_active_requests(struct blk_mq_hw_ctx *hctx) +{ + if (hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED) + __blk_mq_dec_active_requests(hctx); +} + static inline int __blk_mq_active_requests(struct blk_mq_hw_ctx *hctx) { if (blk_mq_is_shared_tags(hctx->flags)) @@ -302,13 +334,9 @@ static inline int __blk_mq_active_requests(struct blk_= mq_hw_ctx *hctx) static inline void __blk_mq_put_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *rq) { + blk_mq_dec_active_requests(hctx); blk_mq_put_tag(hctx->tags, rq->mq_ctx, rq->tag); rq->tag =3D BLK_MQ_NO_TAG; - - if (rq->rq_flags & RQF_MQ_INFLIGHT) { - rq->rq_flags &=3D ~RQF_MQ_INFLIGHT; - __blk_mq_dec_active_requests(hctx); - } } =20 static inline void blk_mq_put_driver_tag(struct request *rq) @@ -319,19 +347,17 @@ static inline void blk_mq_put_driver_tag(struct reque= st *rq) __blk_mq_put_driver_tag(rq->mq_hctx, rq); } =20 -bool __blk_mq_get_driver_tag(struct blk_mq_hw_ctx *hctx, struct request *r= q); +bool __blk_mq_alloc_driver_tag(struct request *rq); =20 static inline bool blk_mq_get_driver_tag(struct request *rq) { struct blk_mq_hw_ctx *hctx =3D rq->mq_hctx; =20 - if (rq->tag !=3D BLK_MQ_NO_TAG && - 
!(hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) { - hctx->tags->rqs[rq->tag] =3D rq; - return true; - } + if (rq->tag =3D=3D BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq)) + return false; =20 - return __blk_mq_get_driver_tag(hctx, rq); + hctx->tags->rqs[rq->tag] =3D rq; + return true; } =20 static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap) --=20 2.41.0 From nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B5FFCEE49A5 for ; Thu, 24 Aug 2023 14:46:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241772AbjHXOpj (ORCPT ); Thu, 24 Aug 2023 10:45:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36764 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241736AbjHXOpQ (ORCPT ); Thu, 24 Aug 2023 10:45:16 -0400 Received: from out-9.mta1.migadu.com (out-9.mta1.migadu.com [95.215.58.9]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 94A331BC6 for ; Thu, 24 Aug 2023 07:45:13 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 2/6] blk-mq: remove RQF_MQ_INFLIGHT
Date: Thu, 24 Aug 2023 22:43:59 +0800
Message-ID: <20230824144403.2135739-3-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

Since the previous patch changed the code to account active requests
only when we really allocate the driver tag, RQF_MQ_INFLIGHT can be
removed without reintroducing the double accounting problem:

1. none elevator: the flush request reuses the first pending request's
   driver tag, so it won't be double accounted.
2. other elevators: the flush request is accounted when its driver tag
   is allocated at issue time, and unaccounted when the driver tag is
   put.
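Why the flag is no longer needed can be seen in a tiny standalone model (hypothetical names and a toy tag allocator, not kernel code): accounting is tied to tag allocation, and the flush request borrows an existing tag rather than allocating one, so it is never counted a second time:

```c
#include <stdatomic.h>

/* Toy queue: an active-request counter and a trivial tag allocator. */
struct queue {
	atomic_int nr_active;
	int next_tag;
};

/* Accounting happens here and only here, at driver tag allocation. */
static int alloc_driver_tag(struct queue *q)
{
	atomic_fetch_add(&q->nr_active, 1);
	return q->next_tag++;
}

/*
 * A flush request reuses the first pending request's driver tag: no
 * allocation takes place, so nothing is accounted again.
 */
static int borrow_driver_tag(int first_rq_tag)
{
	return first_rq_tag;
}

/* Unaccounting is tied to putting the tag, matching the alloc side. */
static void put_driver_tag(struct queue *q)
{
	atomic_fetch_sub(&q->nr_active, 1);
}
```

In the old scheme the accounting happened in blk_mq_get_driver_tag(), which both paths could reach, so the RQF_MQ_INFLIGHT flag was needed to deduplicate; with accounting moved to the allocation itself, the flag has nothing left to guard.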
Signed-off-by: Chengming Zhou --- block/blk-flush.c | 11 ++--------- block/blk-mq-debugfs.c | 1 - block/blk-mq.c | 4 ---- include/linux/blk-mq.h | 2 -- 4 files changed, 2 insertions(+), 16 deletions(-) diff --git a/block/blk-flush.c b/block/blk-flush.c index e73dc22d05c1..3f4d41952ef2 100644 --- a/block/blk-flush.c +++ b/block/blk-flush.c @@ -323,16 +323,9 @@ static void blk_kick_flush(struct request_queue *q, st= ruct blk_flush_queue *fq, flush_rq->mq_ctx =3D first_rq->mq_ctx; flush_rq->mq_hctx =3D first_rq->mq_hctx; =20 - if (!q->elevator) { + if (!q->elevator) flush_rq->tag =3D first_rq->tag; - - /* - * We borrow data request's driver tag, so have to mark - * this flush request as INFLIGHT for avoiding double - * account of this driver tag - */ - flush_rq->rq_flags |=3D RQF_MQ_INFLIGHT; - } else + else flush_rq->internal_tag =3D first_rq->internal_tag; =20 flush_rq->cmd_flags =3D REQ_OP_FLUSH | REQ_PREFLUSH; diff --git a/block/blk-mq-debugfs.c b/block/blk-mq-debugfs.c index c3b5930106b2..5cbeb9344f2f 100644 --- a/block/blk-mq-debugfs.c +++ b/block/blk-mq-debugfs.c @@ -246,7 +246,6 @@ static const char *const rqf_name[] =3D { RQF_NAME(STARTED), RQF_NAME(FLUSH_SEQ), RQF_NAME(MIXED_MERGE), - RQF_NAME(MQ_INFLIGHT), RQF_NAME(DONTPREP), RQF_NAME(SCHED_TAGS), RQF_NAME(USE_SCHED), diff --git a/block/blk-mq.c b/block/blk-mq.c index bcdb750ef575..1d0459142f61 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1066,10 +1066,6 @@ static inline void blk_mq_flush_tag_batch(struct blk= _mq_hw_ctx *hctx, { struct request_queue *q =3D hctx->queue; =20 - /* - * All requests should have been marked as RQF_MQ_INFLIGHT, so - * update hctx->nr_active in batch - */ blk_mq_sub_active_requests(hctx, nr_tags); =20 blk_mq_put_tags(hctx->tags, tag_array, nr_tags); diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h index 958ed7e89b30..1ab3081c82ed 100644 --- a/include/linux/blk-mq.h +++ b/include/linux/blk-mq.h @@ -32,8 +32,6 @@ typedef __u32 __bitwise req_flags_t; #define 
RQF_FLUSH_SEQ ((__force req_flags_t)(1 << 4)) /* merge of different types, fail separately */ #define RQF_MIXED_MERGE ((__force req_flags_t)(1 << 5)) -/* track inflight for MQ */ -#define RQF_MQ_INFLIGHT ((__force req_flags_t)(1 << 6)) /* don't call prep for this one */ #define RQF_DONTPREP ((__force req_flags_t)(1 << 7)) /* use hctx->sched_tags */ --=20 2.41.0 From nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id CCF1EEE4993 for ; Thu, 24 Aug 2023 14:46:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241777AbjHXOpl (ORCPT ); Thu, 24 Aug 2023 10:45:41 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241740AbjHXOpV (ORCPT ); Thu, 24 Aug 2023 10:45:21 -0400 Received: from out-25.mta1.migadu.com (out-25.mta1.migadu.com [95.215.58.25]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id D81121BC9 for ; Thu, 24 Aug 2023 07:45:17 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 3/6] blk-mq: support batched queue_rqs() on shared tags queue
Date: Thu, 24 Aug 2023 22:44:00 +0800
Message-ID: <20230824144403.2135739-4-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

Since active requests are now accounted when driver tags are allocated,
we can remove this limit.

Signed-off-by: Chengming Zhou
---
 block/blk-mq.c | 7 +------
 1 file changed, 1 insertion(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 1d0459142f61..44595385b34c 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2781,13 +2781,8 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 	 * If we do, we can dispatch the whole plug list in one go. We
 	 * already know at this point that all requests belong to the
 	 * same queue, caller must ensure that's the case.
- * - * Since we pass off the full list to the driver at this point, - * we do not increment the active request count for the queue. - * Bypass shared tags for now because of that. */ - if (q->mq_ops->queue_rqs && - !(rq->mq_hctx->flags & BLK_MQ_F_TAG_QUEUE_SHARED)) { + if (q->mq_ops->queue_rqs) { blk_mq_run_dispatch_ops(q, __blk_mq_flush_plug_list(q, plug)); if (rq_list_empty(plug->mq_list)) --=20 2.41.0 From nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id E3A85EE49AB for ; Thu, 24 Aug 2023 14:46:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241784AbjHXOpm (ORCPT ); Thu, 24 Aug 2023 10:45:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241741AbjHXOpY (ORCPT ); Thu, 24 Aug 2023 10:45:24 -0400 Received: from out-17.mta1.migadu.com (out-17.mta1.migadu.com [IPv6:2001:41d0:203:375::11]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9F4851BC5 for ; Thu, 24 Aug 2023 07:45:22 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 4/6] blk-mq: update driver tags request table when start request
Date: Thu, 24 Aug 2023 22:44:01 +0800
Message-ID: <20230824144403.2135739-5-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

Currently we update the driver tags request table in
blk_mq_get_driver_tag(), so drivers that support queue_rqs() have to
update that inflight table themselves.

Move the update to blk_mq_start_request(), which is a better place: it
is where we set up the deadline for the request timeout check, and it is
exactly the point where the request becomes inflight.
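The idea, publishing the request in the tag-indexed table once at start time instead of in every driver's batch-prep path, can be sketched standalone (minimal made-up types; the real table is hctx->tags->rqs[] keyed by rq->tag):

```c
#include <stddef.h>

#define NR_TAGS 4

struct request {
	int tag;
};

struct tags {
	/* Tag-indexed lookup table, as used for timeout/iteration lookups. */
	struct request *rqs[NR_TAGS];
};

/*
 * Model of the patch: the one place every inflight request passes
 * through records itself in the table, so per-driver prep paths
 * (virtio_blk, nvme) no longer have to do it.
 */
static void start_request(struct tags *tags, struct request *rq)
{
	/* ...set deadline, mark request in-flight... */
	tags->rqs[rq->tag] = rq;
}
```

Centralizing the store also removes the duplicated `req->mq_hctx->tags->rqs[req->tag] = req;` lines from the virtio_blk and nvme batch-prep helpers, as the diff below shows.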
Signed-off-by: Chengming Zhou --- block/blk-mq.c | 1 + block/blk-mq.h | 3 --- drivers/block/virtio_blk.c | 2 -- drivers/nvme/host/pci.c | 1 - 4 files changed, 1 insertion(+), 6 deletions(-) diff --git a/block/blk-mq.c b/block/blk-mq.c index 44595385b34c..ff1b0f3ab3a8 100644 --- a/block/blk-mq.c +++ b/block/blk-mq.c @@ -1259,6 +1259,7 @@ void blk_mq_start_request(struct request *rq) =20 blk_add_timer(rq); WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT); + rq->mq_hctx->tags->rqs[rq->tag] =3D rq; =20 #ifdef CONFIG_BLK_DEV_INTEGRITY if (blk_integrity_rq(rq) && req_op(rq) =3D=3D REQ_OP_WRITE) diff --git a/block/blk-mq.h b/block/blk-mq.h index 560a76df290a..f75a9ecfebde 100644 --- a/block/blk-mq.h +++ b/block/blk-mq.h @@ -351,12 +351,9 @@ bool __blk_mq_alloc_driver_tag(struct request *rq); =20 static inline bool blk_mq_get_driver_tag(struct request *rq) { - struct blk_mq_hw_ctx *hctx =3D rq->mq_hctx; - if (rq->tag =3D=3D BLK_MQ_NO_TAG && !__blk_mq_alloc_driver_tag(rq)) return false; =20 - hctx->tags->rqs[rq->tag] =3D rq; return true; } =20 diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c index 1fe011676d07..4689ac2e0c0e 100644 --- a/drivers/block/virtio_blk.c +++ b/drivers/block/virtio_blk.c @@ -470,8 +470,6 @@ static bool virtblk_prep_rq_batch(struct request *req) struct virtio_blk *vblk =3D req->mq_hctx->queue->queuedata; struct virtblk_req *vbr =3D blk_mq_rq_to_pdu(req); =20 - req->mq_hctx->tags->rqs[req->tag] =3D req; - return virtblk_prep_rq(req->mq_hctx, vblk, req, vbr) =3D=3D BLK_STS_OK; } =20 diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c index 2f57da12d983..c2e942808eff 100644 --- a/drivers/nvme/host/pci.c +++ b/drivers/nvme/host/pci.c @@ -924,7 +924,6 @@ static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq= , struct request *req) if (unlikely(!nvme_check_ready(&nvmeq->dev->ctrl, req, true))) return false; =20 - req->mq_hctx->tags->rqs[req->tag] =3D req; return nvme_prep_rq(nvmeq->dev, req) =3D=3D BLK_STS_OK; } =20 --=20 2.41.0 From 
nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0635BEE49A6 for ; Thu, 24 Aug 2023 14:46:04 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241803AbjHXOpo (ORCPT ); Thu, 24 Aug 2023 10:45:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51382 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241746AbjHXOp3 (ORCPT ); Thu, 24 Aug 2023 10:45:29 -0400 Received: from out-11.mta1.migadu.com (out-11.mta1.migadu.com [IPv6:2001:41d0:203:375::b]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 737FF1BC6 for ; Thu, 24 Aug 2023 07:45:27 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1692888325; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WfeZa4Y5jAeM+nwkiT/BS1yQ2iWyllMXdsEwns8FPns=; b=EKJiNVsaft0sHEc9R6T7LMrjABFaeKezxlh5EL18TMxj99tnCeqaL+HKir1bIpvCuj88ql EY91OcF5k/gn2ZYgQx4kYxBZlbVBiM2Fkf2WohB8Bm5XFJx8QrjNdOd0fjIhiStcyFyShC HARTCWpXJhX4r/dYItsWnFfXEba9XnE= From: chengming.zhou@linux.dev To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com Subject: [PATCH 5/6] blk-mq: fix potential reorder of request state and deadline Date: Thu, 24 Aug 2023 22:44:02 +0800 Message-ID: <20230824144403.2135739-6-chengming.zhou@linux.dev> In-Reply-To: 
<20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

    CPU0                            CPU1

    blk_mq_start_request()          blk_mq_req_expired()
      WRITE_ONCE(rq->deadline)        if (READ_ONCE(rq->state) != IN_FLIGHT)
      WRITE_ONCE(rq->state)               return
                                      deadline = READ_ONCE(rq->deadline)

If CPU1 speculatively reorders the rq->deadline load before the
rq->state load, the loaded deadline can be the initial value 0:

    CPU0                            CPU1

    blk_mq_start_request()          blk_mq_req_expired()
                                      deadline = READ_ONCE(rq->deadline)
      WRITE_ONCE(rq->deadline)
      WRITE_ONCE(rq->state)
                                      if (READ_ONCE(rq->state) != IN_FLIGHT)
                                          return

Fix this by adding an smp_wmb()/smp_rmb() pair that orders the stores in
blk_mq_start_request() against the loads in blk_mq_req_expired().

Signed-off-by: Chengming Zhou
---
 block/blk-mq.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index ff1b0f3ab3a8..49cbf826b100 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1258,6 +1258,8 @@ void blk_mq_start_request(struct request *rq)
 	WARN_ON_ONCE(blk_mq_rq_state(rq) != MQ_RQ_IDLE);
 
 	blk_add_timer(rq);
+	/* Pair with smp_rmb in blk_mq_req_expired(). */
+	smp_wmb();
 	WRITE_ONCE(rq->state, MQ_RQ_IN_FLIGHT);
 	rq->mq_hctx->tags->rqs[rq->tag] = rq;
 
@@ -1568,6 +1570,12 @@ static bool blk_mq_req_expired(struct request *rq, struct blk_expired_data *expired)
 	if (rq->rq_flags & RQF_TIMED_OUT)
 		return false;
 
+	/*
+	 * Order LOADs of rq->state and rq->deadline, pair with
+	 * smp_wmb in blk_mq_start_request().
+ */ + smp_rmb(); + deadline =3D READ_ONCE(rq->deadline); if (time_after_eq(expired->timeout_start, deadline)) return true; --=20 2.41.0 From nobody Tue Dec 16 23:41:39 2025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8632AC7EE2C for ; Thu, 24 Aug 2023 14:46:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241753AbjHXOqF (ORCPT ); Thu, 24 Aug 2023 10:46:05 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241749AbjHXOpe (ORCPT ); Thu, 24 Aug 2023 10:45:34 -0400 Received: from out-33.mta1.migadu.com (out-33.mta1.migadu.com [95.215.58.33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A299A1BC6 for ; Thu, 24 Aug 2023 07:45:32 -0700 (PDT) X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. 
From: chengming.zhou@linux.dev
To: axboe@kernel.dk, hch@lst.de, ming.lei@redhat.com, bvanassche@acm.org, kbusch@kernel.org
Cc: mst@redhat.com, sagi@grimberg.me, damien.lemoal@opensource.wdc.com, kch@nvidia.com, linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, zhouchengming@bytedance.com
Subject: [PATCH 6/6] block/null_blk: add queue_rqs() support
Date: Thu, 24 Aug 2023 22:44:03 +0800
Message-ID: <20230824144403.2135739-7-chengming.zhou@linux.dev>
In-Reply-To: <20230824144403.2135739-1-chengming.zhou@linux.dev>
References: <20230824144403.2135739-1-chengming.zhou@linux.dev>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Migadu-Flow: FLOW_OUT
Precedence: bulk
List-ID: X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

From: Chengming Zhou

Add batched mq_ops.queue_rqs() support to null_blk for testing. The
implementation is simple since null_blk doesn't have commit_rqs(): we
handle each request one by one, and if errors are encountered we leave
those requests in the passed-in list and hand it back to the caller.

There is about a 3.6% IOPS improvement with fio/t/io_uring on null_blk
with hw_queue_depth=256 on my test VM, from 1.09M to 1.13M.
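The dispatch-and-requeue pattern used here can be sketched as a standalone user-space program, with a hand-rolled singly linked request list and a fake per-request handler in place of the blk-mq helpers (all names are illustrative):

```c
#include <stddef.h>

struct request {
	int id;
	struct request *next;
};

enum status { STS_OK, STS_RESOURCE };

static struct request *list_pop(struct request **list)
{
	struct request *rq = *list;

	if (rq)
		*list = rq->next;
	return rq;
}

/* Append via a pointer-to-last-next, like rq_list_add_tail(). */
static void list_add_tail(struct request ***lastp, struct request *rq)
{
	rq->next = NULL;
	**lastp = rq;
	*lastp = &rq->next;
}

/* Stand-in for null_queue_rq(): pretend odd ids hit a resource error. */
static enum status queue_one(struct request *rq)
{
	return (rq->id & 1) ? STS_RESOURCE : STS_OK;
}

/*
 * Dispatch each request in turn; requests that fail with a resource
 * error are collected and handed back to the caller via *rqlist.
 */
static void queue_rqs(struct request **rqlist)
{
	struct request *requeue_list = NULL;
	struct request **requeue_lastp = &requeue_list;

	while (*rqlist) {
		struct request *rq = list_pop(rqlist);

		if (queue_one(rq) == STS_RESOURCE)
			list_add_tail(&requeue_lastp, rq);
	}
	*rqlist = requeue_list;
}
```

Unlike the driver code's do/while, the sketch guards the loop with a plain while so an empty input list is also handled; the tail-pointer trick keeps the requeue list in submission order without a second traversal.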
Signed-off-by: Chengming Zhou
---
 drivers/block/null_blk/main.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index 864013019d6b..1b1b58d36707 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1742,6 +1742,25 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return null_handle_cmd(cmd, sector, nr_sectors, req_op(rq));
 }
 
+static void null_queue_rqs(struct request **rqlist)
+{
+	struct request *requeue_list = NULL;
+	struct request **requeue_lastp = &requeue_list;
+	struct blk_mq_queue_data bd = { };
+	blk_status_t ret;
+
+	do {
+		struct request *rq = rq_list_pop(rqlist);
+
+		bd.rq = rq;
+		ret = null_queue_rq(rq->mq_hctx, &bd);
+		if (ret == BLK_STS_RESOURCE || ret == BLK_STS_DEV_RESOURCE)
+			rq_list_add_tail(&requeue_lastp, rq);
+	} while (!rq_list_empty(*rqlist));
+
+	*rqlist = requeue_list;
+}
+
 static void cleanup_queue(struct nullb_queue *nq)
 {
 	bitmap_free(nq->tag_map);
@@ -1794,6 +1813,7 @@ static int null_init_hctx(struct blk_mq_hw_ctx *hctx, void *driver_data,
 
 static const struct blk_mq_ops null_mq_ops = {
 	.queue_rq	= null_queue_rq,
+	.queue_rqs	= null_queue_rqs,
 	.complete	= null_complete_rq,
 	.timeout	= null_timeout_rq,
 	.poll		= null_poll,
-- 
2.41.0