From: Tang Yizhou <yizhou.tang@shopee.com>
Currently, when I/O is submitted to a partition, the per-CPU in_flight[]
counter is incremented only on the partition's block_device, not on the
underlying whole disk. This leads to a problem which can be shown by a
fio test:
lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
mydev 252:1 0 20G 0 disk
└─mydev1 259:0 0 10G 0 part
iostat -xp 1
Device r/s rkB/s ... aqu-sz %util
mydev 128153.00 512612.00 ... 13.22 72.20
mydev1 128154.00 512616.00 ... 13.22 100.00
%util is different between mydev and mydev1, which is unexpected.
This is the cumulative effect of a series of patches. The key step is
commit 10ec5e86f9b8 ("block: merge part_{inc,dev}_in_flight into their
only callers"), which folded the whole-disk in_flight accounting into
generic_start_io_acct() and generic_end_io_acct(). Those two helpers
were then removed by commit e722fff238bb ("block: remove
generic_{start,end}_io_acct"), and from that point on the whole disk's
in_flight is no longer accounted at all.
Fix it by restoring the whole-disk in_flight accounting.
Fixes: e722fff238bb ("block: remove generic_{start,end}_io_acct")
Suggested-by: Leon Hwang <leon.huangfu@shopee.com>
Signed-off-by: Tang Yizhou <yizhou.tang@shopee.com>
---
block/blk-core.c | 4 ++++
block/blk-mq.c | 6 ++++++
2 files changed, 10 insertions(+)
diff --git a/block/blk-core.c b/block/blk-core.c
index 17450058ea6d..03f4b7015e69 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1043,6 +1043,8 @@ unsigned long bdev_start_io_acct(struct block_device *bdev, enum req_op op,
part_stat_lock();
update_io_ticks(bdev, start_time, false);
part_stat_local_inc(bdev, in_flight[op_is_write(op)]);
+ if (bdev_is_partition(bdev))
+ part_stat_local_inc(bdev_whole(bdev), in_flight[op_is_write(op)]);
part_stat_unlock();
return start_time;
@@ -1074,6 +1076,8 @@ void bdev_end_io_acct(struct block_device *bdev, enum req_op op,
part_stat_add(bdev, sectors[sgrp], sectors);
part_stat_add(bdev, nsecs[sgrp], jiffies_to_nsecs(duration));
part_stat_local_dec(bdev, in_flight[op_is_write(op)]);
+ if (bdev_is_partition(bdev))
+ part_stat_local_dec(bdev_whole(bdev), in_flight[op_is_write(op)]);
part_stat_unlock();
}
EXPORT_SYMBOL(bdev_end_io_acct);
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d0c37daf568f..60ead16f1496 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1084,6 +1084,9 @@ static inline void blk_account_io_done(struct request *req, u64 now)
part_stat_add(req->part, nsecs[sgrp], now - req->start_time_ns);
part_stat_local_dec(req->part,
in_flight[op_is_write(req_op(req))]);
+ if (bdev_is_partition(req->part))
+ part_stat_local_dec(bdev_whole(req->part),
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
}
@@ -1144,6 +1147,9 @@ static inline void blk_account_io_start(struct request *req)
part_stat_lock();
update_io_ticks(req->part, jiffies, false);
part_stat_local_inc(req->part, in_flight[op_is_write(req_op(req))]);
+ if (bdev_is_partition(req->part))
+ part_stat_local_inc(bdev_whole(req->part),
+ in_flight[op_is_write(req_op(req))]);
part_stat_unlock();
}
--
2.43.0
On Fri, May 22, 2026 at 07:37:51PM +0800, Tang Yizhou wrote:
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -1043,6 +1043,8 @@ unsigned long bdev_start_io_acct(struct block_device *bdev, enum req_op op,
> part_stat_lock();
> update_io_ticks(bdev, start_time, false);
> part_stat_local_inc(bdev, in_flight[op_is_write(op)]);
> + if (bdev_is_partition(bdev))
> + part_stat_local_inc(bdev_whole(bdev), in_flight[op_is_write(op)]);

overly long line.

> + if (bdev_is_partition(bdev))
> + part_stat_local_dec(bdev_whole(bdev), in_flight[op_is_write(op)]);

Same.

> }
> @@ -1144,6 +1147,9 @@ static inline void blk_account_io_start(struct request *req)
> part_stat_lock();
> update_io_ticks(req->part, jiffies, false);
> part_stat_local_inc(req->part, in_flight[op_is_write(req_op(req))]);
> + if (bdev_is_partition(req->part))
> + part_stat_local_inc(bdev_whole(req->part),
> + in_flight[op_is_write(req_op(req))]);

and this duplicates the above logic. Maybe factor the common code into
two little helpers?
On Fri, May 22, 2026 at 8:12 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Fri, May 22, 2026 at 07:37:51PM +0800, Tang Yizhou wrote:
> > --- a/block/blk-core.c
> > +++ b/block/blk-core.c
> > @@ -1043,6 +1043,8 @@ unsigned long bdev_start_io_acct(struct block_device *bdev, enum req_op op,
> > part_stat_lock();
> > update_io_ticks(bdev, start_time, false);
> > part_stat_local_inc(bdev, in_flight[op_is_write(op)]);
> > + if (bdev_is_partition(bdev))
> > + part_stat_local_inc(bdev_whole(bdev), in_flight[op_is_write(op)]);
>
> overly long line.

OK. I will update in the next patch.

>
> > + if (bdev_is_partition(bdev))
> > + part_stat_local_dec(bdev_whole(bdev), in_flight[op_is_write(op)]);
>
> Same.
>
> > }
> > @@ -1144,6 +1147,9 @@ static inline void blk_account_io_start(struct request *req)
> > part_stat_lock();
> > update_io_ticks(req->part, jiffies, false);
> > part_stat_local_inc(req->part, in_flight[op_is_write(req_op(req))]);
> > + if (bdev_is_partition(req->part))
> > + part_stat_local_inc(bdev_whole(req->part),
> > + in_flight[op_is_write(req_op(req))]);
>
> and this duplicates the above logic. Maybe factor the common code
> into two little helpers?

Sure.
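
For illustration only, the "two little helpers" suggested above might look
roughly like the userspace model below. It is a sketch, not the follow-up
patch: struct model_bdev, model_is_partition(), model_inflight_inc(),
model_inflight_dec() and model_demo() are hypothetical names invented here,
plain long counters stand in for the kernel's per-CPU part_stat machinery,
and no locking is modelled. It only shows the accounting rule the patch
restores: an I/O on a partition is also counted on the whole disk.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace stand-in for struct block_device; NOT the kernel type. */
struct model_bdev {
	struct model_bdev *whole;	/* whole disk; a disk points to itself */
	long in_flight[2];		/* [0] = reads, [1] = writes */
};

/* Models bdev_is_partition(): a partition's whole-disk pointer is not itself. */
static inline bool model_is_partition(const struct model_bdev *bdev)
{
	return bdev->whole != bdev;
}

/* Hypothetical helper: count one started I/O on bdev and, if needed, its disk. */
static inline void model_inflight_inc(struct model_bdev *bdev, bool is_write)
{
	bdev->in_flight[is_write]++;
	if (model_is_partition(bdev))
		bdev->whole->in_flight[is_write]++;
}

/* Hypothetical helper: count one completed I/O on bdev and, if needed, its disk. */
static inline void model_inflight_dec(struct model_bdev *bdev, bool is_write)
{
	bdev->in_flight[is_write]--;
	if (model_is_partition(bdev))
		bdev->whole->in_flight[is_write]--;
}

/* Usage: a write to a partition is visible on the disk until it completes. */
int model_demo(void)
{
	struct model_bdev disk = { .whole = &disk };
	struct model_bdev part = { .whole = &disk };
	long seen_on_disk;

	model_inflight_inc(&part, true);
	seen_on_disk = disk.in_flight[1];	/* write in flight on the disk */
	model_inflight_dec(&part, true);
	assert(disk.in_flight[1] == 0 && part.in_flight[1] == 0);
	return (int)seen_on_disk;
}
```

With helpers of this shape, each of the four call sites touched by the patch
would shrink to a single call, which would also address the line-length
comments.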