From nobody Fri Apr 17 21:03:53 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id DF1B5CCA47F
 for ; Fri, 22 Jul 2022 09:51:59 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S235345AbiGVJv6 (ORCPT ); Fri, 22 Jul 2022 05:51:58 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52574 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S234522AbiGVJvl (ORCPT ); Fri, 22 Jul 2022 05:51:41 -0400
Received: from smtpbgjp3.qq.com (smtpbgjp3.qq.com [54.92.39.34])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E225910FC4
 for ; Fri, 22 Jul 2022 02:51:38 -0700 (PDT)
X-QQ-mid: bizesmtp68t1658483491t0many5a
Received: from eureka.localdomain ( [123.124.208.226])
 by bizesmtp.qq.com (ESMTP) with
 id ; Fri, 22 Jul 2022 17:51:30 +0800 (CST)
X-QQ-SSF: 01400000002000B0D000000A0000020
X-QQ-FEAT: Wp4pj0u9TIdmhLHodcPF8UXhopaXa0qcLkh1vWVFEEqxZPb5uzQeA+h0NzL61
 iZOeByUPqeq7nP2nqVQ4ymngCpqY0MSoTi+RzWZQp06h6jtJOMotHZF0xn0sKfT6a0denIk
 mRa0Dx5hQR5Ta5Xr/ZtZleV5qcNUqOeAOYK72WkbYgEyzCbl7MfnHE9V05hErwj14sXYphV
 u8LH1sGUkxT6vA5ArypQWjMyrG4mt553ug/h+Nm95v2SXdAverM6LYkPNBlHtXoKbMYZnvy
 JOzzU+hU9LwpM4GQNNZGM8EnZepm06RtObuckDBerVm+WYV5xaFicWeb8Y42l+O2d7Wi836
 UwH3d0iwpsRN/21MlVjuAe3zNlh1AMP2mlVkrYGB/FsayoRcPYs+diAEln80D1CcDZim+IT
 s7D7LLpHd7RXgUBxfl+nmcNjwl79jUv/
X-QQ-GoodBg: 1
From: Wang You
To: axboe@kernel.dk, bvanassche@acm.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de,
 jaegeuk@kernel.org, fio@vger.kernel.org, ming.lei@redhat.com,
 wangyoua@uniontech.com, wangxiaohua@uniontech.com
Subject: [PATCH v2 1/2] block: Introduce nr_sched_batch sys interface
Date: Fri, 22 Jul 2022 17:51:19 +0800
Message-Id: <20220722095120.371212-2-wangyoua@uniontech.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20220722095120.371212-1-wangyoua@uniontech.com>
References: <20220722095120.371212-1-wangyoua@uniontech.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-QQ-SENDSIZE: 520
Feedback-ID: bizesmtp:uniontech.com:qybgforeign:qybgforeign9
X-QQ-Bgrelay: 1
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

This patch adds an nr_sched_batch attribute under /sys/block/sdX/queue/
that sets the number of requests the scheduler may dispatch in one batch.
It defaults to nr_requests and follows nr_requests as long as it has not
been changed.

Signed-off-by: Wang You
---
 block/blk-mq-sched.c   |  4 +++-
 block/blk-sysfs.c      | 32 ++++++++++++++++++++++++++++++++
 include/linux/blkdev.h |  1 +
 3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/block/blk-mq-sched.c b/block/blk-mq-sched.c
index a4f7c101b53b..92798a0c03bd 100644
--- a/block/blk-mq-sched.c
+++ b/block/blk-mq-sched.c
@@ -100,7 +100,7 @@ static int __blk_mq_do_dispatch_sched(struct blk_mq_hw_ctx *hctx)
 	if (hctx->dispatch_busy)
 		max_dispatch = 1;
 	else
-		max_dispatch = hctx->queue->nr_requests;
+		max_dispatch = q->nr_sched_batch;
 
 	do {
 		struct request *rq;
@@ -567,6 +567,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 		blk_queue_flag_clear(QUEUE_FLAG_SQ_SCHED, q);
 		q->elevator = NULL;
 		q->nr_requests = q->tag_set->queue_depth;
+		q->nr_sched_batch = q->nr_requests;
 		return 0;
 	}
 
@@ -577,6 +578,7 @@ int blk_mq_init_sched(struct request_queue *q, struct elevator_type *e)
 	 */
 	q->nr_requests = 2 * min_t(unsigned int, q->tag_set->queue_depth,
 				   BLKDEV_DEFAULT_RQ);
+	q->nr_sched_batch = q->nr_requests;
 
 	if (blk_mq_is_shared_tags(flags)) {
 		ret = blk_mq_init_sched_shared_tags(q);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 9b905e9443e4..34fbfb22e3e1 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -70,6 +70,7 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)
 {
 	unsigned long nr;
 	int ret, err;
+	unsigned long prev_nr_request = q->nr_requests;
 
 	if (!queue_is_mq(q))
 		return -EINVAL;
@@ -85,6 +86,35 @@ queue_requests_store(struct request_queue *q, const char *page, size_t count)
 	if (err)
 		return err;
 
+	if (q->nr_sched_batch == prev_nr_request || q->nr_sched_batch > nr)
+		q->nr_sched_batch = nr;
+
+	return ret;
+}
+
+static ssize_t elv_nr_batch_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->nr_sched_batch, page);
+}
+
+static ssize_t elv_nr_batch_store(struct request_queue *q, const char *page,
+				  size_t count)
+{
+	unsigned long nr;
+	int ret;
+
+	if (!queue_is_mq(q))
+		return -EINVAL;
+
+	ret = queue_var_store(&nr, page, count);
+	if (ret < 0)
+		return ret;
+
+	if (nr > q->nr_requests || nr < 1)
+		return -EINVAL;
+
+	q->nr_sched_batch = nr;
+
 	return ret;
 }
 
@@ -573,6 +603,7 @@ QUEUE_RO_ENTRY(queue_max_segments, "max_segments");
 QUEUE_RO_ENTRY(queue_max_integrity_segments, "max_integrity_segments");
 QUEUE_RO_ENTRY(queue_max_segment_size, "max_segment_size");
 QUEUE_RW_ENTRY(elv_iosched, "scheduler");
+QUEUE_RW_ENTRY(elv_nr_batch, "nr_sched_batch");
 
 QUEUE_RO_ENTRY(queue_logical_block_size, "logical_block_size");
 QUEUE_RO_ENTRY(queue_physical_block_size, "physical_block_size");
@@ -632,6 +663,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_max_integrity_segments_entry.attr,
 	&queue_max_segment_size_entry.attr,
 	&elv_iosched_entry.attr,
+	&elv_nr_batch_entry.attr,
 	&queue_hw_sector_size_entry.attr,
 	&queue_logical_block_size_entry.attr,
 	&queue_physical_block_size_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 2f7b43444c5f..e38eba4161a3 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -422,6 +422,7 @@ struct request_queue {
 	 * queue settings
 	 */
 	unsigned long nr_requests; /* Max # of requests */
+	unsigned long nr_sched_batch;
 
 	unsigned int dma_pad_mask;
 	unsigned int dma_alignment;
-- 
2.27.0

From nobody Fri Apr 17 21:03:53 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id 25915CCA48B
 for ; Fri, 22 Jul 2022 09:52:04 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S234466AbiGVJwC (ORCPT ); Fri, 22 Jul 2022 05:52:02 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56746 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S235403AbiGVJvn (ORCPT ); Fri, 22 Jul 2022 05:51:43 -0400
Received: from smtpbguseast3.qq.com (smtpbguseast3.qq.com [54.243.244.52])
 by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DEC0513D74
 for ; Fri, 22 Jul 2022 02:51:40 -0700 (PDT)
X-QQ-mid: bizesmtp68t1658483494tbs747nq
Received: from eureka.localdomain ( [123.124.208.226])
 by bizesmtp.qq.com (ESMTP) with
 id ; Fri, 22 Jul 2022 17:51:33 +0800 (CST)
X-QQ-SSF: 01400000002000B0D000000A0000020
X-QQ-FEAT: 3auutXC5AV8E3RzgBgxgTnhPlZLbjLHjhmHStedrgoyqUuafpVYdht6B4JDCO
 u6RofgjvrgbKz9de0wgfyMOyzkAv247dcjYD2hP+Oc82MkBTWp0qV56syPX2rUE56g1NBP9
 BYzBirCBEeuv5xSUAJi970KiBMj04RMMYIMLGDS8uhEXw6QtLcvSBVqEXVYn9KFbl0ntSaJ
 Mxg2eQTkyYZ+tCekIwuP7uNYMaMBqmdTe0aBB845gOJ28WQd1VIUdiBPIybPuXAxprR/7HM
 whJRqFODt0hIAfX/mcoS0BV1QdBFrU6UVc9igOXudhmkPQZjG+KyLwjTGbNiHwLihB8m2be
 c6sSXwc5RNz7WDv6FqPtpSfTQeG6q0URLbsJ+zOeMtHCWH6fXtZ8Ae2FHVRHHL1UEpJPBVA
 wVZVPA/4OxOUZWIqK65Nkajc7nIJ+S6V
X-QQ-GoodBg: 1
From: Wang You
To: axboe@kernel.dk, bvanassche@acm.org
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org, hch@lst.de,
 jaegeuk@kernel.org, fio@vger.kernel.org, ming.lei@redhat.com,
 wangyoua@uniontech.com, wangxiaohua@uniontech.com
Subject: [PATCH v2 2/2] block/mq-deadline: Prioritize first request
Date: Fri, 22 Jul 2022 17:51:20 +0800
Message-Id: <20220722095120.371212-3-wangyoua@uniontech.com>
X-Mailer: git-send-email 2.27.0
In-Reply-To: <20220722095120.371212-1-wangyoua@uniontech.com>
References: <20220722095120.371212-1-wangyoua@uniontech.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-QQ-SENDSIZE: 520
Feedback-ID: bizesmtp:uniontech.com:qybgforeign:qybgforeign8
X-QQ-Bgrelay: 1
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

deadline_head_request() selects the request at the head of the
mq-deadline scheduler's sector-sorted red-black tree. Dispatching that
request moves the disk access address back to the lowest pending sector,
which keeps the head from swinging back and forth.

- Scheduler batching may reduce or even eliminate the scheduler's
  ability to merge and sort requests, so I sometimes set nr_sched_batch
  to 1.

- This patch may increase the risk of requests expiring; I don't know
  whether a stricter expiry check is necessary.

- I tested some disks (mainly rotational disks and some SSDs) with fio
  (using sync, direct, etc. parameters). The results show improved
  small-block sequential read and write performance. Does this imply
  that changing nr_sched_batch is reasonable?

- Later testing on different hardware showed that the RAID controller
  probably played an important role, while the performance of a single
  disk did not improve as expected, so I'm not sure whether this patch
  really has the desired effect.

Thanks,
Wang.

The following are all test data:

The test hardware is: Kunpeng-920, HW-SAS3508+(MG04ACA400N * 2), RAID0.
The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
-thread -bs=512B -size=110g -numjobs=16 -runtime=300 -group_reporting
-name=read -filename=/dev/sdb14 -ioscheduler=mq-deadline -rw=read[,write,rw]

- The following is the test data:
origin/master:
read iops: 152421     write iops: 136959     rw iops: 54593,54581
nr_sched_batch = 1:
read iops: 166449     write iops: 139477     rw iops: 55363,55355
nr_sched_batch = 1, use deadline_head_request:
read iops: 171177     write iops: 184431     rw iops: 56178,56169

- The test hardware is: Hygon C86, MG04ACA400N
The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
-thread -bs=512B -size=110g -numjobs=32 -runtime=300 -group_reporting
-name=read -filename=/dev/sdc -ioscheduler=mq-deadline -rw=read[,write,rw]

The following is the test data:
origin/master:
read iops: 15463      write iops: 5949       rw iops: 574,576
nr_sched_batch = 1:
read iops: 15082      write iops: 6283       rw iops: 783,786
nr_sched_batch = 1, use deadline_head_request:
read iops: 15368      write iops: 6575       rw iops: 907,906

- The test hardware is: Kunpeng-920, HW-SAS3508 + Samsung SSD 780, RAID0.
The test command is:
fio -ioengine=psync -lockmem=1G -buffered=0 -time_based=1 -direct=1 -iodepth=1
-thread -bs=512B -size=110g -numjobs=16 -runtime=300 -group_reporting
-name=read -filename=/dev/sda -ioscheduler=mq-deadline -rw=read[,write,rw]

The following is the test data:
origin/master:
read iops: 115399     write iops: 136801     rw iops: 58082,58084
nr_sched_batch = 1, use deadline_head_request:
read iops: 136473     write iops: 184646     rw iops: 56460,56454

Signed-off-by: Wang You
---
 block/mq-deadline.c | 43 ++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 40 insertions(+), 3 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 1a9e835e816c..4660dd4a16f6 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -344,6 +344,36 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	return rq;
 }
 
+static inline struct request *
+deadline_head_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
+		      int data_dir)
+{
+	struct rb_node *node = rb_first(&per_prio->sort_list[data_dir]);
+	struct request *rq;
+	unsigned long flags;
+
+	if (!node)
+		return NULL;
+
+	rq = rb_entry_rq(node);
+	if (data_dir == DD_READ || !blk_queue_is_zoned(rq->q))
+		return rq;
+
+	/*
+	 * Look for a write request that can be dispatched, that is one with
+	 * an unlocked target zone.
+	 */
+	spin_lock_irqsave(&dd->zone_lock, flags);
+	while (rq) {
+		if (blk_req_can_dispatch_to_zone(rq))
+			break;
+		rq = deadline_latter_request(rq);
+	}
+	spin_unlock_irqrestore(&dd->zone_lock, flags);
+
+	return rq;
+}
+
 /*
  * Returns true if and only if @rq started after @latest_start where
  * @latest_start is in jiffies.
@@ -429,13 +459,20 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	 * we are not running a batch, find best request for selected data_dir
 	 */
 	next_rq = deadline_next_request(dd, per_prio, data_dir);
-	if (deadline_check_fifo(per_prio, data_dir) || !next_rq) {
+	if (deadline_check_fifo(per_prio, data_dir)) {
 		/*
 		 * A deadline has expired, the last request was in the other
-		 * direction, or we have run out of higher-sectored requests.
-		 * Start again from the request with the earliest expiry time.
+		 * direction. Start again from the request with the earliest
+		 * expiry time.
 		 */
 		rq = deadline_fifo_request(dd, per_prio, data_dir);
+	} else if (!next_rq) {
+		/*
+		 * There is no operation expired, and we have run out of
+		 * higher-sectored requests. Look for the sector at the head
+		 * which may reduce disk seek consumption.
+		 */
+		rq = deadline_head_request(dd, per_prio, data_dir);
 	} else {
 		/*
 		 * The last req was the same dir and we have a next request in
-- 
2.27.0
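
As an illustration only (not part of the patch), the three-way choice made by the reworked branch in __dd_dispatch_request() can be modelled in userspace C. Every name below (pick_source, struct pick, model_pick) is hypothetical. The sorted array stands in for sort_list, fifo_first for the request at the head of the FIFO list, and fifo_expired for the result of deadline_check_fifo(). Zoned-device handling and I/O priorities are left out, and next_rq is approximated as the first pending sector above the last dispatched one, which is a simplification of how the scheduler tracks it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
enum pick_source { PICK_NONE, PICK_FIFO, PICK_HEAD, PICK_NEXT };

struct pick {
	enum pick_source source;	/* which rule selected the request */
	unsigned long sector;		/* start sector of the selected request */
};

/*
 * sorted[]     pending start sectors in ascending order (models sort_list)
 * n            number of pending requests
 * fifo_first   sector of the oldest pending request (models the FIFO head)
 * last_sector  sector of the most recently dispatched request
 * fifo_expired true if the oldest request has passed its deadline
 */
static struct pick model_pick(const unsigned long *sorted, size_t n,
			      unsigned long fifo_first,
			      unsigned long last_sector, bool fifo_expired)
{
	struct pick p = { PICK_NONE, 0 };
	size_t i;

	if (n == 0)
		return p;
	assert(sorted != NULL);

	/* Rule 1: an expired deadline always wins. */
	if (fifo_expired) {
		p.source = PICK_FIFO;
		p.sector = fifo_first;
		return p;
	}

	/* Rule 3 (the final else branch): keep moving towards higher sectors. */
	for (i = 0; i < n; i++) {
		if (sorted[i] > last_sector) {
			p.source = PICK_NEXT;
			p.sector = sorted[i];
			return p;
		}
	}

	/*
	 * Rule 2 (added by the patch): nothing expired and no higher-sectored
	 * request is left, so restart from the lowest pending sector.
	 */
	p.source = PICK_HEAD;
	p.sector = sorted[0];
	return p;
}
```

With pending sectors {100, 200, 300}, nothing expired and the last dispatch at sector 300, the model picks sector 100 from the head of the sorted list. The pre-patch condition, deadline_check_fifo() || !next_rq, would have fallen back to the FIFO request in that situation.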