The hweight_inuse calculation in transfer_surpluses() could potentially
result in a value of 0, which would lead to division by zero errors in
subsequent calculations that use this value as a divisor.
Signed-off-by: Kunhai Dai <daikunhai@didiglobal.com>
---
block/blk-iocost.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)
diff --git a/block/blk-iocost.c b/block/blk-iocost.c
index 384aa15e8260..65cdb55d30cc 100644
--- a/block/blk-iocost.c
+++ b/block/blk-iocost.c
@@ -1999,9 +1999,10 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
parent = iocg->ancestors[iocg->level - 1];
/* b' = gamma * b_f + b_t' */
- iocg->hweight_inuse = DIV64_U64_ROUND_UP(
- (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
- WEIGHT_ONE) + iocg->hweight_after_donation;
+ iocg->hweight_inuse = max_t(u64, 1,
+ DIV64_U64_ROUND_UP(
+ (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
+ WEIGHT_ONE) + iocg->hweight_after_donation);
/* w' = s' * b' / b'_p */
inuse = DIV64_U64_ROUND_UP(
--
2.27.0
In fact, we did hit this case: the kernel printed `iocg: invalid donation weights in /a/b/c: active=1 donating=1 after=0` and then panicked immediately. I analyzed the code but could not figure out how this state was reached; it might be a concurrency issue or some other hidden bug.
Our kernel is not the latest, but it includes the patch edaa26334c117a584add6053f48d63a988d25a6e (iocost: Fix divide-by-zero on donation from low hweight cgroup).
On 2024/11/22 15:26, "戴坤海 Tony Dai" <daikunhai@didiglobal.com> wrote:
> The hweight_inuse calculation in transfer_surpluses() could potentially
> result in a value of 0, which would lead to division by zero errors in
> subsequent calculations that use this value as a divisor.
>
> Signed-off-by: Kunhai Dai <daikunhai@didiglobal.com>
> ---
> block/blk-iocost.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-iocost.c b/block/blk-iocost.c
> index 384aa15e8260..65cdb55d30cc 100644
> --- a/block/blk-iocost.c
> +++ b/block/blk-iocost.c
> @@ -1999,9 +1999,10 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
> parent = iocg->ancestors[iocg->level - 1];
>
> /* b' = gamma * b_f + b_t' */
> - iocg->hweight_inuse = DIV64_U64_ROUND_UP(
> - (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> - WEIGHT_ONE) + iocg->hweight_after_donation;
> + iocg->hweight_inuse = max_t(u64, 1,
> + DIV64_U64_ROUND_UP(
> + (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> + WEIGHT_ONE) + iocg->hweight_after_donation);
>
> /* w' = s' * b' / b'_p */
> inuse = DIV64_U64_ROUND_UP(
> --
> 2.27.0
Hi,
On 2024/11/22 15:26, Kunhai Dai wrote:
> The hweight_inuse calculation in transfer_surpluses() could potentially
> result in a value of 0, which would lead to division by zero errors in
> subsequent calculations that use this value as a divisor.
>
> Signed-off-by: Kunhai Dai <daikunhai@didiglobal.com>
> ---
> block/blk-iocost.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-iocost.c b/block/blk-iocost.c
> index 384aa15e8260..65cdb55d30cc 100644
> --- a/block/blk-iocost.c
> +++ b/block/blk-iocost.c
> @@ -1999,9 +1999,10 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
> parent = iocg->ancestors[iocg->level - 1];
>
> /* b' = gamma * b_f + b_t' */
> - iocg->hweight_inuse = DIV64_U64_ROUND_UP(
> - (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> - WEIGHT_ONE) + iocg->hweight_after_donation;
> + iocg->hweight_inuse = max_t(u64, 1,
> + DIV64_U64_ROUND_UP(
> + (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> + WEIGHT_ONE) + iocg->hweight_after_donation);
I'm confused: how could DIV64_U64_ROUND_UP() end up less than 1?
#define DIV64_U64_ROUND_UP(ll, d) \
({ u64 _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })
AFAIK, the only way that can happen is if
iocg->hweight_active - iocg->hweight_donating is 0. But then I don't
see how an active iocg could donate all of its hweight. If this
really happens, perhaps the better solution is to avoid that case.
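To make the rounding behaviour concrete, here is a minimal userspace sketch of the macro, assuming plain 64-bit division can stand in for div64_u64(). The helper name div_round_up_u64 is hypothetical and not from the kernel. It suggests that the quotient is 0 only when the dividend itself is 0; any nonzero dividend rounds up to at least 1.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Userspace stand-in for the kernel's DIV64_U64_ROUND_UP().
 * Plain 64-bit division replaces div64_u64(); the name is hypothetical.
 */
static uint64_t div_round_up_u64(uint64_t ll, uint64_t d)
{
	assert(d != 0);			/* the divisor must be nonzero */
	return (ll + d - 1) / d;	/* round the quotient up */
}
```

With a divisor of 65536, for instance, a dividend of 0 yields 0 while a dividend of 1 already yields 1.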
Thanks,
Kuai
>
> /* w' = s' * b' / b'_p */
> inuse = DIV64_U64_ROUND_UP(
>
In fact, we did hit this case: the kernel printed `iocg: invalid donation weights in /a/b/c: active=1 donating=1 after=0` and then panicked immediately. I analyzed the code but could not figure out how this state was reached; it might be a concurrency issue or some other hidden bug.
Our kernel is not the latest, but it includes the patch edaa26334c117a584add6053f48d63a988d25a6e (iocost: Fix divide-by-zero on donation from low hweight cgroup).
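The reported weights can be plugged into the b' formula as a rough userspace sketch. WEIGHT_ONE is assumed to be 1 << 16 as in blk-iocost.c, and the helper names below are hypothetical. With active=1, donating=1, after=0 the unclamped result is 0 for any gamma, and a max-with-1 clamp like the one in the patch turns it into 1.

```c
#include <stdint.h>

#define WEIGHT_ONE (1u << 16)	/* assumed to match blk-iocost.c */

/* Round-up division, standing in for DIV64_U64_ROUND_UP(). */
static uint64_t ceil_div_u64(uint64_t ll, uint64_t d)
{
	return (ll + d - 1) / d;
}

/* b' = gamma * b_f + b_t', computed as before the patch. */
static uint64_t hweight_inuse_unclamped(uint32_t gamma, uint32_t active,
					uint32_t donating, uint32_t after)
{
	return ceil_div_u64((uint64_t)gamma * (active - donating),
			    WEIGHT_ONE) + after;
}

/* Same value with the patch's lower bound of 1 applied. */
static uint64_t hweight_inuse_clamped(uint32_t gamma, uint32_t active,
				      uint32_t donating, uint32_t after)
{
	uint64_t v = hweight_inuse_unclamped(gamma, active, donating, after);

	return v ? v : 1;	/* equivalent to max_t(u64, 1, v) */
}
```

The clamp only changes the result when both terms are 0; for other inputs the two helpers agree.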
On 2024/11/22 16:16, "Yu Kuai" <yukuai1@huaweicloud.com> wrote:
Hi,
On 2024/11/22 15:26, Kunhai Dai wrote:
> The hweight_inuse calculation in transfer_surpluses() could potentially
> result in a value of 0, which would lead to division by zero errors in
> subsequent calculations that use this value as a divisor.
>
> Signed-off-by: Kunhai Dai <daikunhai@didiglobal.com>
> ---
> block/blk-iocost.c | 7 ++++---
> 1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/block/blk-iocost.c b/block/blk-iocost.c
> index 384aa15e8260..65cdb55d30cc 100644
> --- a/block/blk-iocost.c
> +++ b/block/blk-iocost.c
> @@ -1999,9 +1999,10 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
> parent = iocg->ancestors[iocg->level - 1];
>
> /* b' = gamma * b_f + b_t' */
> - iocg->hweight_inuse = DIV64_U64_ROUND_UP(
> - (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> - WEIGHT_ONE) + iocg->hweight_after_donation;
> + iocg->hweight_inuse = max_t(u64, 1,
> + DIV64_U64_ROUND_UP(
> + (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
> + WEIGHT_ONE) + iocg->hweight_after_donation);
I'm confused: how could DIV64_U64_ROUND_UP() end up less than 1?
#define DIV64_U64_ROUND_UP(ll, d) \
({ u64 _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })
AFAIK, the only way that can happen is if
iocg->hweight_active - iocg->hweight_donating is 0. But then I don't
see how an active iocg could donate all of its hweight. If this
really happens, perhaps the better solution is to avoid that case.
Thanks,
Kuai
>
> /* w' = s' * b' / b'_p */
> inuse = DIV64_U64_ROUND_UP(
>
Hi
On 2024/11/24 21:44, 戴坤海 Tony Dai wrote:
> In fact, we did hit this case: the kernel printed `iocg: invalid donation weights in /a/b/c: active=1 donating=1 after=0` and then panicked immediately. I analyzed the code but could not figure out how this state was reached; it might be a concurrency issue or some other hidden bug.
Do you have a reproducer for this? I'd like to take a look at the
WARN first.
Thanks,
Kuai
>
> Our kernel is not the latest, but it includes the patch edaa26334c117a584add6053f48d63a988d25a6e (iocost: Fix divide-by-zero on donation from low hweight cgroup).
>
> On 2024/11/22 16:16, "Yu Kuai" <yukuai1@huaweicloud.com> wrote:
>
>
> Hi,
>
>
> On 2024/11/22 15:26, Kunhai Dai wrote:
>> The hweight_inuse calculation in transfer_surpluses() could potentially
>> result in a value of 0, which would lead to division by zero errors in
>> subsequent calculations that use this value as a divisor.
>>
>> Signed-off-by: Kunhai Dai <daikunhai@didiglobal.com>
>> ---
>> block/blk-iocost.c | 7 ++++---
>> 1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-iocost.c b/block/blk-iocost.c
>> index 384aa15e8260..65cdb55d30cc 100644
>> --- a/block/blk-iocost.c
>> +++ b/block/blk-iocost.c
>> @@ -1999,9 +1999,10 @@ static void transfer_surpluses(struct list_head *surpluses, struct ioc_now *now)
>> parent = iocg->ancestors[iocg->level - 1];
>>
>> /* b' = gamma * b_f + b_t' */
>> - iocg->hweight_inuse = DIV64_U64_ROUND_UP(
>> - (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
>> - WEIGHT_ONE) + iocg->hweight_after_donation;
>> + iocg->hweight_inuse = max_t(u64, 1,
>> + DIV64_U64_ROUND_UP(
>> + (u64)gamma * (iocg->hweight_active - iocg->hweight_donating),
>> + WEIGHT_ONE) + iocg->hweight_after_donation);
>
>
> I'm confused: how could DIV64_U64_ROUND_UP() end up less than 1?
>
>
> #define DIV64_U64_ROUND_UP(ll, d) \
> ({ u64 _tmp = (d); div64_u64((ll) + _tmp - 1, _tmp); })
>
>
> AFAIK, the only way that can happen is if
> iocg->hweight_active - iocg->hweight_donating is 0. But then I don't
> see how an active iocg could donate all of its hweight. If this
> really happens, perhaps the better solution is to avoid that case.
>
>
> Thanks,
> Kuai
>
>
>>
>> /* w' = s' * b' / b'_p */
>> inuse = DIV64_U64_ROUND_UP(
>>
>
>
>
>
>