From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B57CDC433F5 for ; Sat, 28 May 2022 06:31:17 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355902AbiE1GbQ (ORCPT ); Sat, 28 May 2022 02:31:16 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43154 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355835AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2569846669; Fri, 27 May 2022 23:30:04 -0700 (PDT)
Received: from kwepemi100014.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4L9BbG6F2gzjX4V; Sat, 28 May 2022 14:29:14 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi100014.china.huawei.com (7.221.188.106) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:00 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:00 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 1/8] blk-throttle: fix that io throttle can only work for single bio
Date: Sat, 28 May 2022 14:43:23 +0800
Message-ID: <20220528064330.3471000-2-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy:
dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Commit 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
introduced a new problem. For example:

Test scripts:
cd /sys/fs/cgroup/blkio/
echo "8:0 1024" > blkio.throttle.write_bps_device
echo $$ > cgroup.procs
dd if=3D/dev/zero of=3D/dev/sda bs=3D10k count=3D1 oflag=3Ddirect &
dd if=3D/dev/zero of=3D/dev/sda bs=3D10k count=3D1 oflag=3Ddirect &

Test result:
10240 bytes (10 kB, 10 KiB) copied, 10.0134 s, 1.0 kB/s
10240 bytes (10 kB, 10 KiB) copied, 10.0135 s, 1.0 kB/s

The problem is that the second bio finishes after 10s instead of 20s.
This is because, if some bios are already queued, the current bio is queued
directly and the flag 'BIO_THROTTLED' is set. Later, when the former bios
are dispatched, this bio is dispatched without waiting at all, because
tg_with_in_bps_limit() sets a zero wait time for this bio.

To fix the problem, don't skip flagged bios in tg_with_in_bps_limit(). To
handle the problem that a split bio can be double accounted, compensate
for the over-accounting in __blk_throtl_bio().

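The expected timing in the example above can be checked with a small userspace model. This is only a sketch under stated assumptions, not kernel code: it assumes bios are served strictly in FIFO order and that every byte is charged against the bps limit. The helper name expected_finish_secs() is hypothetical and does not exist in blk-throttle.c.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Userspace model (assumption: strict FIFO, every byte charged).
 * finish_secs[i] receives the time, rounded up to whole seconds, at which
 * bio i is expected to complete under bps_limit (must be nonzero).
 */
static void expected_finish_secs(uint64_t bps_limit, const uint64_t *bio_bytes,
				 size_t nr_bios, uint64_t *finish_secs)
{
	uint64_t charged = 0;
	size_t i;

	for (i = 0; i < nr_bios; i++) {
		/* Bytes of all earlier bios count against this one too. */
		charged += bio_bytes[i];
		/* ceil(charged / bps_limit) */
		finish_secs[i] = (charged + bps_limit - 1) / bps_limit;
	}
}
```

With a limit of 1024 bytes/s and two 10240-byte bios, this model gives 10s for the first bio and 20s for the second, which is the behaviour the fix aims to restore.
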
Fixes: 9f5ede3c01f9 ("block: throttle split bio in case of iops limit")
Signed-off-by: Yu Kuai
Reviewed-by: Ming Lei
---
 block/blk-throttle.c | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 139b2d7a99e2..5c1d1c4d8188 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -811,7 +811,7 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg,=
 struct bio *bio,
 	unsigned int bio_size =3D throtl_bio_data_size(bio);
=20
 	/* no need to throttle if this bio's bytes have been accounted */
-	if (bps_limit =3D=3D U64_MAX || bio_flagged(bio, BIO_THROTTLED)) {
+	if (bps_limit =3D=3D U64_MAX) {
 		if (wait)
 			*wait =3D 0;
 		return true;
@@ -921,11 +921,8 @@ static void throtl_charge_bio(struct throtl_grp *tg, s=
truct bio *bio)
 	unsigned int bio_size =3D throtl_bio_data_size(bio);
=20
 	/* Charge the bio to the group */
-	if (!bio_flagged(bio, BIO_THROTTLED)) {
-		tg->bytes_disp[rw] +=3D bio_size;
-		tg->last_bytes_disp[rw] +=3D bio_size;
-	}
-
+	tg->bytes_disp[rw] +=3D bio_size;
+	tg->last_bytes_disp[rw] +=3D bio_size;
 	tg->io_disp[rw]++;
 	tg->last_io_disp[rw]++;
=20
@@ -2121,6 +2118,21 @@ bool __blk_throtl_bio(struct bio *bio)
 			tg->last_low_overflow_time[rw] =3D jiffies;
 		throtl_downgrade_check(tg);
 		throtl_upgrade_check(tg);
+
+		/*
+		 * re-entered bio has accounted bytes already, so try to
+		 * compensate previous over-accounting. However, if new
+		 * slice is started, just forget it.
+		 */
+		if (bio_flagged(bio, BIO_THROTTLED)) {
+			unsigned int bio_size =3D throtl_bio_data_size(bio);
+
+			if (tg->bytes_disp[rw] >=3D bio_size)
+				tg->bytes_disp[rw] -=3D bio_size;
+			if (tg->last_bytes_disp[rw] >=3D bio_size)
+				tg->last_bytes_disp[rw] -=3D bio_size;
+		}
+
 		/* throtl is FIFO - if bios are already queued, should queue */
 		if (sq->nr_queued[rw])
 			break;
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A1D44C433F5 for ; Sat, 28 May 2022 06:30:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355891AbiE1Gan (ORCPT ); Sat, 28 May 2022 02:30:43 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355822AbiE1GaX (ORCPT ); Sat, 28 May 2022 02:30:23 -0400
Received: from szxga03-in.huawei.com (szxga03-in.huawei.com [45.249.212.189]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E21BD3D49F; Fri, 27 May 2022 23:30:03 -0700 (PDT)
Received: from kwepemi100015.china.huawei.com (unknown [172.30.72.56]) by szxga03-in.huawei.com (SkyGuard) with ESMTP id 4L9Bc26WQqzDqWP; Sat, 28 May 2022 14:29:54 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi100015.china.huawei.com (7.221.188.125) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:01 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:00 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 2/8] blk-throttle: prevent overflow while calculating
wait time
Date: Sat, 28 May 2022 14:43:24 +0800
Message-ID: <20220528064330.3471000-3-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

In tg_with_in_bps_limit(), 'bps_limit * jiffy_elapsed_rnd' might overflow.
Fix the problem by calling mul_u64_u64_div_u64() instead.

Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 5c1d1c4d8188..a89c62bef2fb 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -806,7 +806,7 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg,=
 struct bio *bio,
 				 u64 bps_limit, unsigned long *wait)
 {
 	bool rw =3D bio_data_dir(bio);
-	u64 bytes_allowed, extra_bytes, tmp;
+	u64 bytes_allowed, extra_bytes;
 	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	unsigned int bio_size =3D throtl_bio_data_size(bio);
=20
@@ -824,10 +824,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg=
, struct bio *bio,
 		jiffy_elapsed_rnd =3D tg->td->throtl_slice;
=20
 	jiffy_elapsed_rnd =3D roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
-
-	tmp =3D bps_limit * jiffy_elapsed_rnd;
-	do_div(tmp, HZ);
-	bytes_allowed =3D tmp;
+	bytes_allowed =3D mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed_rnd,
+					    (u64)HZ);
=20
 	if (tg->bytes_disp[rw] + bio_size <=3D bytes_allowed) {
 		if (wait)
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from
vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id EB142C433F5 for ; Sat, 28 May 2022 06:31:47 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355823AbiE1Gbp (ORCPT ); Sat, 28 May 2022 02:31:45 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355834AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7995746B0F; Fri, 27 May 2022 23:30:04 -0700 (PDT)
Received: from kwepemi100012.china.huawei.com (unknown [172.30.72.55]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4L9BbJ2z0kzjX8P; Sat, 28 May 2022 14:29:16 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi100012.china.huawei.com (7.221.188.202) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:02 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:01 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 3/8] blk-throttle: factor out code to calculate ios/bytes_allowed
Date: Sat, 28 May 2022 14:43:25 +0800
Message-ID: <20220528064330.3471000-4-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List:
linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

No functional changes. The new APIs will be used in later patches to
calculate the wait time for throttled bios while updating the config.

Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 48 +++++++++++++++++++++++++++-----------------
 1 file changed, 30 insertions(+), 18 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index a89c62bef2fb..d67b20ce4d63 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -754,25 +754,12 @@ static inline void throtl_trim_slice(struct throtl_gr=
p *tg, bool rw)
 		   tg->slice_start[rw], tg->slice_end[rw], jiffies);
 }
=20
-static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
-				  u32 iops_limit, unsigned long *wait)
+static unsigned int calculate_io_allowed(u32 iops_limit,
+					 unsigned long jiffy_elapsed_rnd)
 {
-	bool rw =3D bio_data_dir(bio);
 	unsigned int io_allowed;
-	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
 	u64 tmp;
=20
-	if (iops_limit =3D=3D UINT_MAX) {
-		if (wait)
-			*wait =3D 0;
-		return true;
-	}
-
-	jiffy_elapsed =3D jiffies - tg->slice_start[rw];
-
-	/* Round up to the next throttle slice, wait time must be nonzero */
-	jiffy_elapsed_rnd =3D roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
-
 	/*
 	 * jiffy_elapsed_rnd should not be a big value as minimum iops can be
 	 * 1 then at max jiffy elapsed should be equivalent of 1 second as we
@@ -788,6 +775,33 @@ static bool tg_with_in_iops_limit(struct throtl_grp *t=
g, struct bio *bio,
 	else
 		io_allowed =3D tmp;
=20
+	return io_allowed;
+}
+
+static u64 calculate_bytes_allowed(u64 bps_limit,
+				   unsigned long jiffy_elapsed_rnd)
+{
+	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed_rnd, (u64)HZ);
+}
+
+static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
+				  u32 iops_limit, unsigned long *wait)
+{
+	bool rw =3D bio_data_dir(bio);
+	unsigned int io_allowed;
+	unsigned long jiffy_elapsed, jiffy_wait, jiffy_elapsed_rnd;
+
+	if (iops_limit =3D=3D
UINT_MAX) {
+		if (wait)
+			*wait =3D 0;
+		return true;
+	}
+
+	jiffy_elapsed =3D jiffies - tg->slice_start[rw];
+
+	/* Round up to the next throttle slice, wait time must be nonzero */
+	jiffy_elapsed_rnd =3D roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
+	io_allowed =3D calculate_io_allowed(iops_limit, jiffy_elapsed_rnd);
 	if (tg->io_disp[rw] + 1 <=3D io_allowed) {
 		if (wait)
 			*wait =3D 0;
@@ -824,9 +838,7 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg,=
 struct bio *bio,
 		jiffy_elapsed_rnd =3D tg->td->throtl_slice;
=20
 	jiffy_elapsed_rnd =3D roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
-	bytes_allowed =3D mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed_rnd,
-					    (u64)HZ);
-
+	bytes_allowed =3D calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
 	if (tg->bytes_disp[rw] + bio_size <=3D bytes_allowed) {
 		if (wait)
 			*wait =3D 0;
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0DEE6C433F5 for ; Sat, 28 May 2022 06:31:22 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355917AbiE1GbT (ORCPT ); Sat, 28 May 2022 02:31:19 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42944 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355836AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 548175C369; Fri, 27 May 2022 23:30:05 -0700 (PDT)
Received: from kwepemi100011.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4L9BXh2VgtzRhVt; Sat, 28 May 2022 14:27:00 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi100011.china.huawei.com
(7.221.188.134) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:03 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:02 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 4/8] blk-throttle: fix io hung due to config updates
Date: Sat, 28 May 2022 14:43:26 +0800
Message-ID: <20220528064330.3471000-5-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

If a new configuration is submitted while a bio is throttled, the waiting
time is recalculated from scratch, regardless of how long the bio has
already waited:

tg_conf_updated
 throtl_start_new_slice
 tg_update_disptime
 throtl_schedule_next_dispatch

An io hang can then be triggered by repeatedly submitting a new
configuration before the throttled bio is dispatched.

Fix the problem by respecting the time that the throttled bio has already
waited. To do that, add new fields to record how many bytes/ios the bio
has already waited for, and use them to calculate the wait time for the
throttled bio under the new configuration.

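The effect of carrying the already-waited bytes across a config update can be sketched in userspace. This is a simplified model under stated assumptions, not the kernel implementation: it considers a single throttled bio, a bps limit only, whole seconds instead of jiffies, and no slice rounding. The function names finish_secs_restart() and finish_secs_with_credit() are hypothetical.

```c
#include <stdint.h>

/*
 * Userspace model (assumption: one throttled bio, bps limit only).
 * The bio of bio_bytes waits waited_secs under old_bps, then the limit
 * is changed to new_bps (must be nonzero).
 */

/* Without the fix: the wait restarts from zero under the new limit. */
static uint64_t finish_secs_restart(uint64_t bio_bytes, uint64_t waited_secs,
				    uint64_t new_bps)
{
	return waited_secs + (bio_bytes + new_bps - 1) / new_bps;
}

/* With the fix: bytes accumulated under the old limit are kept as credit. */
static uint64_t finish_secs_with_credit(uint64_t bio_bytes, uint64_t old_bps,
					uint64_t waited_secs, uint64_t new_bps)
{
	/* Plays the role of bytes_skipped in this model. */
	uint64_t skipped = old_bps * waited_secs;
	uint64_t remaining = bio_bytes > skipped ? bio_bytes - skipped : 0;

	return waited_secs + (remaining + new_bps - 1) / new_bps;
}
```

For an 8k bio that waits 2s at 2048 bytes/s before the limit drops to 1024 bytes/s, the model gives 10s without credit and 6s with credit. For 4s at 1024 bytes/s followed by 2048 bytes/s, it gives 8s and 6s. Both pairs agree with the finish times reported in this commit message.
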
Some simple tests:
1)
cd /sys/fs/cgroup/blkio/
echo $$ > cgroup.procs
echo "8:0 2048" > blkio.throttle.write_bps_device
{
	sleep 2
	echo "8:0 1024" > blkio.throttle.write_bps_device
} &
dd if=3D/dev/zero of=3D/dev/sda bs=3D8k count=3D1 oflag=3Ddirect

2)
cd /sys/fs/cgroup/blkio/
echo $$ > cgroup.procs
echo "8:0 1024" > blkio.throttle.write_bps_device
{
	sleep 4
	echo "8:0 2048" > blkio.throttle.write_bps_device
} &
dd if=3D/dev/zero of=3D/dev/sda bs=3D8k count=3D1 oflag=3Ddirect

test results: io finish time
    before this patch    with this patch
1)  10s                  6s
2)  8s                   6s

Signed-off-by: Yu Kuai
Reviewed-by: Michal Koutn=C3=BD
---
 block/blk-throttle.c | 51 ++++++++++++++++++++++++++++++++++++++------
 block/blk-throttle.h | 9 ++++++++
 2 files changed, 54 insertions(+), 6 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index d67b20ce4d63..94fd73e8b2d9 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -639,6 +639,8 @@ static inline void throtl_start_new_slice_with_credit(s=
truct throtl_grp *tg,
 {
 	tg->bytes_disp[rw] =3D 0;
 	tg->io_disp[rw] =3D 0;
+	tg->bytes_skipped[rw] =3D 0;
+	tg->io_skipped[rw] =3D 0;
=20
 	/*
 	 * Previous slice has expired.
We must have trimmed it after last
@@ -656,12 +658,17 @@ static inline void throtl_start_new_slice_with_credit=
(struct throtl_grp *tg,
 		   tg->slice_end[rw], jiffies);
 }
=20
-static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw)
+static inline void throtl_start_new_slice(struct throtl_grp *tg, bool rw,
+					  bool clear_skipped)
 {
 	tg->bytes_disp[rw] =3D 0;
 	tg->io_disp[rw] =3D 0;
 	tg->slice_start[rw] =3D jiffies;
 	tg->slice_end[rw] =3D jiffies + tg->td->throtl_slice;
+	if (clear_skipped) {
+		tg->bytes_skipped[rw] =3D 0;
+		tg->io_skipped[rw] =3D 0;
+	}
=20
 	throtl_log(&tg->service_queue,
 		   "[%c] new slice start=3D%lu end=3D%lu jiffies=3D%lu",
@@ -784,6 +791,34 @@ static u64 calculate_bytes_allowed(u64 bps_limit,
 	return mul_u64_u64_div_u64(bps_limit, (u64)jiffy_elapsed_rnd, (u64)HZ);
 }
=20
+static void __tg_update_skipped(struct throtl_grp *tg, bool rw)
+{
+	unsigned long jiffy_elapsed =3D jiffies - tg->slice_start[rw];
+	u64 bps_limit =3D tg_bps_limit(tg, rw);
+	u32 iops_limit =3D tg_iops_limit(tg, rw);
+
+	if (bps_limit !=3D U64_MAX)
+		tg->bytes_skipped[rw] +=3D
+			calculate_bytes_allowed(bps_limit, jiffy_elapsed) -
+			tg->bytes_disp[rw];
+	if (iops_limit !=3D UINT_MAX)
+		tg->io_skipped[rw] +=3D
+			calculate_io_allowed(iops_limit, jiffy_elapsed) -
+			tg->io_disp[rw];
+}
+
+static void tg_update_skipped(struct throtl_grp *tg)
+{
+	if (tg->service_queue.nr_queued[READ])
+		__tg_update_skipped(tg, READ);
+	if (tg->service_queue.nr_queued[WRITE])
+		__tg_update_skipped(tg, WRITE);
+
+	throtl_log(&tg->service_queue, "%s: %llu %llu %u %u\n", __func__,
+		   tg->bytes_skipped[READ], tg->bytes_skipped[WRITE],
+		   tg->io_skipped[READ], tg->io_skipped[WRITE]);
+}
+
 static bool tg_with_in_iops_limit(struct throtl_grp *tg, struct bio *bio,
 				  u32 iops_limit, unsigned long *wait)
 {
@@ -801,7 +836,8 @@ static bool tg_with_in_iops_limit(struct throtl_grp *tg=
, struct bio *bio,
=20
 	/* Round up to the next throttle slice, wait time must be nonzero */
 	jiffy_elapsed_rnd =3D
roundup(jiffy_elapsed + 1, tg->td->throtl_slice);
-	io_allowed =3D calculate_io_allowed(iops_limit, jiffy_elapsed_rnd);
+	io_allowed =3D calculate_io_allowed(iops_limit, jiffy_elapsed_rnd) +
+		     tg->io_skipped[rw];
 	if (tg->io_disp[rw] + 1 <=3D io_allowed) {
 		if (wait)
 			*wait =3D 0;
@@ -838,7 +874,8 @@ static bool tg_with_in_bps_limit(struct throtl_grp *tg,=
 struct bio *bio,
 		jiffy_elapsed_rnd =3D tg->td->throtl_slice;
=20
 	jiffy_elapsed_rnd =3D roundup(jiffy_elapsed_rnd, tg->td->throtl_slice);
-	bytes_allowed =3D calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd);
+	bytes_allowed =3D calculate_bytes_allowed(bps_limit, jiffy_elapsed_rnd) +
+			tg->bytes_skipped[rw];
 	if (tg->bytes_disp[rw] + bio_size <=3D bytes_allowed) {
 		if (wait)
 			*wait =3D 0;
@@ -899,7 +936,7 @@ static bool tg_may_dispatch(struct throtl_grp *tg, stru=
ct bio *bio,
 	 * slice and it should be extended instead.
 	 */
 	if (throtl_slice_used(tg, rw) && !(tg->service_queue.nr_queued[rw]))
-		throtl_start_new_slice(tg, rw);
+		throtl_start_new_slice(tg, rw, true);
 	else {
 		if (time_before(tg->slice_end[rw],
 		    jiffies + tg->td->throtl_slice))
@@ -1328,8 +1365,8 @@ static void tg_conf_updated(struct throtl_grp *tg, bo=
ol global)
 	 * that a group's limit are dropped suddenly and we don't want to
 	 * account recently dispatched IO with new low rate.
 	 */
-	throtl_start_new_slice(tg, READ);
-	throtl_start_new_slice(tg, WRITE);
+	throtl_start_new_slice(tg, READ, false);
+	throtl_start_new_slice(tg, WRITE, false);
=20
 	if (tg->flags & THROTL_TG_PENDING) {
 		tg_update_disptime(tg);
@@ -1357,6 +1394,7 @@ static ssize_t tg_set_conf(struct kernfs_open_file *o=
f,
 		v =3D U64_MAX;
=20
 	tg =3D blkg_to_tg(ctx.blkg);
+	tg_update_skipped(tg);
=20
 	if (is_u64)
 		*(u64 *)((void *)tg + of_cft(of)->private) =3D v;
@@ -1543,6 +1581,7 @@ static ssize_t tg_set_limit(struct kernfs_open_file *=
of,
 		return ret;
=20
 	tg =3D blkg_to_tg(ctx.blkg);
+	tg_update_skipped(tg);
=20
 	v[0] =3D tg->bps_conf[READ][index];
 	v[1] =3D tg->bps_conf[WRITE][index];
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index c1b602996127..b8178e6b4d30 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -115,6 +115,15 @@ struct throtl_grp {
 	uint64_t bytes_disp[2];
 	/* Number of bio's dispatched in current slice */
 	unsigned int io_disp[2];
+	/*
+	 * The following two fields are used to calculate new wait time for
+	 * throttled bio when new configuration is submitted.
+	 *
+	 * Number of bytes will be skipped in current slice
+	 */
+	uint64_t bytes_skipped[2];
+	/* Number of bio will be skipped in current slice */
+	unsigned int io_skipped[2];
=20
 	unsigned long last_low_overflow_time[2];
=20
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2E78BC433EF for ; Sat, 28 May 2022 06:31:42 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1351336AbiE1Gbk (ORCPT ); Sat, 28 May 2022 02:31:40 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44048 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355837AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1482630F65; Fri, 27 May 2022 23:30:06 -0700 (PDT)
Received: from kwepemi500008.china.huawei.com (unknown [172.30.72.57]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4L9BZP3CJFzgYDc; Sat, 28 May 2022 14:28:29 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi500008.china.huawei.com (7.221.188.139) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:03 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:03 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 5/8] blk-throttle: use 'READ/WRITE' instead of '0/1'
Date: Sat, 28 May 2022 14:43:27 +0800
Message-ID: <20220528064330.3471000-6-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To:
<20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

Make the code easier to read, like everywhere else.

Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 94fd73e8b2d9..454a360f42e8 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -329,8 +329,8 @@ static struct bio *throtl_pop_queued(struct list_head *=
queued,
 /* init a service_queue, assumes the caller zeroed it */
 static void throtl_service_queue_init(struct throtl_service_queue *sq)
 {
-	INIT_LIST_HEAD(&sq->queued[0]);
-	INIT_LIST_HEAD(&sq->queued[1]);
+	INIT_LIST_HEAD(&sq->queued[READ]);
+	INIT_LIST_HEAD(&sq->queued[WRITE]);
 	sq->pending_tree =3D RB_ROOT_CACHED;
 	timer_setup(&sq->pending_timer, throtl_pending_timer_fn, 0);
 }
@@ -1150,7 +1150,7 @@ static int throtl_select_dispatch(struct throtl_servi=
ce_queue *parent_sq)
 		nr_disp +=3D throtl_dispatch_tg(tg);
=20
 		sq =3D &tg->service_queue;
-		if (sq->nr_queued[0] || sq->nr_queued[1])
+		if (sq->nr_queued[READ] || sq->nr_queued[WRITE])
 			tg_update_disptime(tg);
=20
 		if (nr_disp >=3D THROTL_QUANTUM)
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52227C433F5 for ; Sat, 28 May 2022 06:31:35 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355998AbiE1Gb2 (ORCPT ); Sat, 28 May 2022
02:31:28 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44064 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355845AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8530F5E751; Fri, 27 May 2022 23:30:07 -0700 (PDT)
Received: from kwepemi500009.china.huawei.com (unknown [172.30.72.56]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4L9BZQ4Gtbz1JC58; Sat, 28 May 2022 14:28:30 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi500009.china.huawei.com (7.221.188.199) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:04 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:03 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 6/8] blk-throttle: calling throtl_dequeue/enqueue_tg in pairs
Date: Sat, 28 May 2022 14:43:28 +0800
Message-ID: <20220528064330.3471000-7-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

It's a little weird to call throtl_dequeue_tg() unconditionally in
throtl_select_dispatch(), since it will be called again in
tg_update_disptime() if some bio is still throttled.

Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 454a360f42e8..e6ae86d284b9 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -1145,13 +1145,13 @@ static int throtl_select_dispatch(struct throtl_ser=
vice_queue *parent_sq)
 		if (time_before(jiffies, tg->disptime))
 			break;
=20
-		throtl_dequeue_tg(tg);
-
 		nr_disp +=3D throtl_dispatch_tg(tg);
=20
 		sq =3D &tg->service_queue;
 		if (sq->nr_queued[READ] || sq->nr_queued[WRITE])
 			tg_update_disptime(tg);
+		else
+			throtl_dequeue_tg(tg);
=20
 		if (nr_disp >=3D THROTL_QUANTUM)
 			break;
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C386C433F5 for ; Sat, 28 May 2022 06:30:52 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355897AbiE1Gar (ORCPT ); Sat, 28 May 2022 02:30:47 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:44062 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355844AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from szxga01-in.huawei.com (szxga01-in.huawei.com [45.249.212.187]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8518D5DBF7; Fri, 27 May 2022 23:30:07 -0700 (PDT)
Received: from kwepemi500010.china.huawei.com (unknown [172.30.72.56]) by szxga01-in.huawei.com (SkyGuard) with ESMTP id 4L9BZR06MJzgYJN; Sat, 28 May 2022 14:28:30 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi500010.china.huawei.com (7.221.188.191) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:05 +0800
Received: from huawei.com (10.175.127.227) by
kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:04 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 7/8] blk-throttle: cleanup tg_update_disptime()
Date: Sat, 28 May 2022 14:43:29 +0800
Message-ID: <20220528064330.3471000-8-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

tg_update_disptime() only needs to adjust the position of 'tg' in
'parent_sq'; there is no need to call throtl_enqueue/dequeue_tg().

Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index e6ae86d284b9..297ce54ceaa3 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -520,7 +520,6 @@ static void throtl_rb_erase(struct rb_node *n,
 {
 	rb_erase_cached(n, &parent_sq->pending_tree);
 	RB_CLEAR_NODE(n);
-	--parent_sq->nr_pending;
 }
=20
 static void update_min_dispatch_time(struct throtl_service_queue *parent_s=
q)
@@ -572,7 +571,11 @@ static void throtl_enqueue_tg(struct throtl_grp *tg)
 static void throtl_dequeue_tg(struct throtl_grp *tg)
 {
 	if (tg->flags & THROTL_TG_PENDING) {
-		throtl_rb_erase(&tg->rb_node, tg->service_queue.parent_sq);
+		struct throtl_service_queue *parent_sq =3D
+			tg->service_queue.parent_sq;
+
+		throtl_rb_erase(&tg->rb_node, parent_sq);
+		--parent_sq->nr_pending;
 		tg->flags &=3D ~THROTL_TG_PENDING;
 	}
 }
@@ -1034,9 +1037,9 @@ static void tg_update_disptime(struct throtl_grp *tg)
 	disptime =3D jiffies + min_wait;
=20
 	/* Update dispatch time */
-	throtl_dequeue_tg(tg);
+	throtl_rb_erase(&tg->rb_node, tg->service_queue.parent_sq);
 	tg->disptime =3D disptime;
-	throtl_enqueue_tg(tg);
+	tg_service_queue_add(tg);
=20
 	/* see throtl_add_bio_tg() */
 	tg->flags &=3D ~THROTL_TG_WAS_EMPTY;
--=20
2.31.1

From nobody Fri May 1 11:12:19 2026
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A052C433EF for ; Sat, 28 May 2022 06:31:03 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355829AbiE1GbA (ORCPT ); Sat, 28 May 2022 02:31:00 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:43338 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1355854AbiE1GaY (ORCPT ); Sat, 28 May 2022 02:30:24 -0400
Received: from
	szxga02-in.huawei.com (szxga02-in.huawei.com [45.249.212.188]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5F3F66007E; Fri, 27 May 2022 23:30:08 -0700 (PDT)
Received: from kwepemi500005.china.huawei.com (unknown [172.30.72.54]) by szxga02-in.huawei.com (SkyGuard) with ESMTP id 4L9BXl3LnNzRhW7; Sat, 28 May 2022 14:27:03 +0800 (CST)
Received: from kwepemm600009.china.huawei.com (7.193.23.164) by kwepemi500005.china.huawei.com (7.221.188.179) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:06 +0800
Received: from huawei.com (10.175.127.227) by kwepemm600009.china.huawei.com (7.193.23.164) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2375.24; Sat, 28 May 2022 14:30:05 +0800
From: Yu Kuai
To: , , ,
CC: , , , ,
Subject: [PATCH -next v5 8/8] blk-throttle: clean up flag 'THROTL_TG_PENDING'
Date: Sat, 28 May 2022 14:43:30 +0800
Message-ID: <20220528064330.3471000-9-yukuai3@huawei.com>
X-Mailer: git-send-email 2.31.1
In-Reply-To: <20220528064330.3471000-1-yukuai3@huawei.com>
References: <20220528064330.3471000-1-yukuai3@huawei.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
X-Originating-IP: [10.175.127.227]
X-ClientProxiedBy: dggems706-chm.china.huawei.com (10.3.19.183) To kwepemm600009.china.huawei.com (7.193.23.164)
X-CFilter-Loop: Reflected
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Type: text/plain; charset="utf-8"

All related operations are performed under 'queue_lock', so the flag
'THROTL_TG_PENDING' is not needed. It is enough to make sure that
throtl_enqueue_tg() is called when the first bio is throttled, and that
throtl_dequeue_tg() is called when the last throttled bio is dispatched.
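
For illustration only, the rule above can be modelled in user space as
follows. This is a minimal sketch, not the kernel code: every model_*
name is hypothetical, the rb-tree is reduced to a counter, and the caller
is assumed to serialize all calls the way 'queue_lock' does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
enum { MODEL_READ = 0, MODEL_WRITE = 1 };

struct model_parent {
	unsigned int nr_pending;	/* groups currently on the pending tree */
};

struct model_group {
	struct model_parent *parent;
	unsigned int nr_queued[2];	/* throttled bios per direction */
};

/* In this model a group is pending exactly when it has queued bios. */
static bool model_is_pending(const struct model_group *g)
{
	return g->nr_queued[MODEL_READ] || g->nr_queued[MODEL_WRITE];
}

/* Queue one throttled bio; account the group only for the first one. */
static void model_add_bio(struct model_group *g, int rw)
{
	if (!model_is_pending(g))
		g->parent->nr_pending++;
	g->nr_queued[rw]++;
}

/* Dispatch one throttled bio; drop the group after the last one. */
static void model_dispatch_bio(struct model_group *g, int rw)
{
	assert(g->nr_queued[rw] > 0);
	g->nr_queued[rw]--;
	if (!model_is_pending(g))
		g->parent->nr_pending--;
}
```

In this model the queue counters take over the role of the flag: adding
a read and then a write raises nr_pending only once, and nr_pending
drops back only after both bios have been dispatched.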
Signed-off-by: Yu Kuai
---
 block/blk-throttle.c | 22 ++++++++--------------
 block/blk-throttle.h |  7 +++----
 2 files changed, 11 insertions(+), 18 deletions(-)

diff --git a/block/blk-throttle.c b/block/blk-throttle.c
index 297ce54ceaa3..fe7f01c61ba8 100644
--- a/block/blk-throttle.c
+++ b/block/blk-throttle.c
@@ -561,23 +561,16 @@ static void tg_service_queue_add(struct throtl_grp *tg)
 
 static void throtl_enqueue_tg(struct throtl_grp *tg)
 {
-	if (!(tg->flags & THROTL_TG_PENDING)) {
-		tg_service_queue_add(tg);
-		tg->flags |= THROTL_TG_PENDING;
-		tg->service_queue.parent_sq->nr_pending++;
-	}
+	tg_service_queue_add(tg);
+	tg->service_queue.parent_sq->nr_pending++;
 }
 
 static void throtl_dequeue_tg(struct throtl_grp *tg)
 {
-	if (tg->flags & THROTL_TG_PENDING) {
-		struct throtl_service_queue *parent_sq =
-			tg->service_queue.parent_sq;
+	struct throtl_service_queue *parent_sq = tg->service_queue.parent_sq;
 
-		throtl_rb_erase(&tg->rb_node, parent_sq);
-		--parent_sq->nr_pending;
-		tg->flags &= ~THROTL_TG_PENDING;
-	}
+	throtl_rb_erase(&tg->rb_node, parent_sq);
+	--parent_sq->nr_pending;
 }
 
 /* Call with queue lock held */
@@ -1015,8 +1008,9 @@ static void throtl_add_bio_tg(struct bio *bio, struct throtl_qnode *qn,
 
 	throtl_qnode_add_bio(bio, qn, &sq->queued[rw]);
 
+	if (!sq->nr_queued[READ] && !sq->nr_queued[WRITE])
+		throtl_enqueue_tg(tg);
 	sq->nr_queued[rw]++;
-	throtl_enqueue_tg(tg);
 }
 
 static void tg_update_disptime(struct throtl_grp *tg)
@@ -1371,7 +1365,7 @@ static void tg_conf_updated(struct throtl_grp *tg, bool global)
 	throtl_start_new_slice(tg, READ, false);
 	throtl_start_new_slice(tg, WRITE, false);
 
-	if (tg->flags & THROTL_TG_PENDING) {
+	if (sq->nr_queued[READ] || sq->nr_queued[WRITE]) {
 		tg_update_disptime(tg);
 		throtl_schedule_next_dispatch(sq->parent_sq, true);
 	}
diff --git a/block/blk-throttle.h b/block/blk-throttle.h
index b8178e6b4d30..f68b95999f83 100644
--- a/block/blk-throttle.h
+++ b/block/blk-throttle.h
@@ -53,10 +53,9 @@ struct throtl_service_queue {
 };
 
 enum tg_state_flags {
-	THROTL_TG_PENDING = 1 << 0, /* on parent's pending tree */
-	THROTL_TG_WAS_EMPTY = 1 << 1, /* bio_lists[] became non-empty */
-	THROTL_TG_HAS_IOPS_LIMIT = 1 << 2, /* tg has iops limit */
-	THROTL_TG_CANCELING = 1 << 3, /* starts to cancel bio */
+	THROTL_TG_WAS_EMPTY = 1 << 0, /* bio_lists[] became non-empty */
+	THROTL_TG_HAS_IOPS_LIMIT = 1 << 1, /* tg has iops limit */
+	THROTL_TG_CANCELING = 1 << 2, /* starts to cancel bio */
 };
 
 enum {
-- 
2.31.1