blk_mq_hctx_next_cpu() implements a form of cpumask_next_and_wrap() using
cpumask_next_and() and blk_mq_first_mapped_cpu():

[ 5.398453] WARNING: CPU: 3 PID: 162 at include/linux/cpumask.h:110 __blk_mq_delay_run_hw_queue+0x16b/0x180
[ 5.399317] Modules linked in:
[ 5.399646] CPU: 3 PID: 162 Comm: ssh-keygen Tainted: G N 6.0.0-rc4-00004-g93003cb24006 #55
[ 5.400135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
[ 5.405430] Call Trace:
[ 5.406152] <TASK>
[ 5.406452] blk_mq_sched_insert_requests+0x67/0x150
[ 5.406759] blk_mq_flush_plug_list+0xd0/0x280
[ 5.406987] ? bit_wait+0x60/0x60
[ 5.407317] __blk_flush_plug+0xdb/0x120
[ 5.407561] ? bit_wait+0x60/0x60
[ 5.407765] io_schedule_prepare+0x38/0x40
[...]

This triggers a warning when next_cpu == nr_cpu_ids - 1, so rewrite it
using cpumask_next_and_wrap() directly. The backwards-going goto can be
removed, as the cpumask_next*() operation already ANDs hctx->cpumask and
cpu_online_mask, which implies checking for an online CPU.

No change in behaviour intended.

Suggested-by: Yury Norov <yury.norov@gmail.com>
Signed-off-by: Valentin Schneider <vschneid@redhat.com>
---
 block/blk-mq.c | 39 +++++++++++++--------------------------
 1 file changed, 13 insertions(+), 26 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c96c8c4f751b..1520794dd9ea 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2038,42 +2038,29 @@ static inline int blk_mq_first_mapped_cpu(struct blk_mq_hw_ctx *hctx)
  */
 static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
 {
-	bool tried = false;
 	int next_cpu = hctx->next_cpu;
 
 	if (hctx->queue->nr_hw_queues == 1)
 		return WORK_CPU_UNBOUND;
 
-	if (--hctx->next_cpu_batch <= 0) {
-select_cpu:
-		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
-				cpu_online_mask);
-		if (next_cpu >= nr_cpu_ids)
-			next_cpu = blk_mq_first_mapped_cpu(hctx);
+	if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu))
+		return next_cpu;
+
+	next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask, next_cpu, false);
+	if (next_cpu < nr_cpu_ids) {
 		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
+		hctx->next_cpu = next_cpu;
+		return next_cpu;
 	}
 
 	/*
-	 * Do unbound schedule if we can't find a online CPU for this hctx,
-	 * and it should only happen in the path of handling CPU DEAD.
+	 * No other online CPU in hctx->cpumask.
+	 *
+	 * Make sure to re-select CPU next time once after CPUs
+	 * in hctx->cpumask become online again.
 	 */
-	if (!cpu_online(next_cpu)) {
-		if (!tried) {
-			tried = true;
-			goto select_cpu;
-		}
-
-		/*
-		 * Make sure to re-select CPU next time once after CPUs
-		 * in hctx->cpumask become online again.
-		 */
-		hctx->next_cpu = next_cpu;
-		hctx->next_cpu_batch = 1;
-		return WORK_CPU_UNBOUND;
-	}
-
-	hctx->next_cpu = next_cpu;
-	return next_cpu;
+	hctx->next_cpu_batch = 1;
+	return WORK_CPU_UNBOUND;
 }
 
 /**
/**
--
2.31.1
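
For readers who want to trace the rewritten control flow outside the
kernel, here is a minimal user-space sketch. It is a model under stated
assumptions, not the kernel code: a single 64-bit word stands in for a
cpumask, MODEL_NR_CPUS, MODEL_WORK_BATCH and MODEL_UNBOUND are hypothetical
stand-ins for nr_cpu_ids, BLK_MQ_CPU_WORK_BATCH and WORK_CPU_UNBOUND, and
model_next_and_wrap() only assumes the behaviour suggested by the patch
comment ("No other online CPU"): it searches strictly after the current
CPU, wraps to CPU 0, and stops before reaching the current CPU again. The
nr_hw_queues == 1 shortcut is left out.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS    8u   /* hypothetical CPU count (must be <= 64) */
#define MODEL_WORK_BATCH 8    /* stand-in for BLK_MQ_CPU_WORK_BATCH */
#define MODEL_UNBOUND    (-1) /* stand-in for WORK_CPU_UNBOUND */

struct model_hctx {
	uint64_t cpumask;       /* CPUs mapped to this hardware queue */
	unsigned int next_cpu;  /* CPU chosen by the previous selection */
	int next_cpu_batch;     /* remaining uses before re-selecting */
};

/*
 * Next bit set in (a & b) strictly after n, wrapping past the last CPU
 * to CPU 0 and stopping before n is reached again.
 * Returns MODEL_NR_CPUS if no other bit is set.
 */
static unsigned int model_next_and_wrap(uint64_t a, uint64_t b, unsigned int n)
{
	uint64_t m = a & b;

	assert(n < MODEL_NR_CPUS);
	for (unsigned int step = 1; step < MODEL_NR_CPUS; step++) {
		unsigned int cpu = (n + step) % MODEL_NR_CPUS;

		if (m & (UINT64_C(1) << cpu))
			return cpu;
	}
	return MODEL_NR_CPUS;
}

/* Model of the selection logic in the patched function. */
static int model_next_cpu(struct model_hctx *h, uint64_t online)
{
	unsigned int cpu = h->next_cpu;

	/* Keep using the current CPU while the batch lasts and it is online. */
	if (--h->next_cpu_batch > 0 && (online & (UINT64_C(1) << cpu)))
		return (int)cpu;

	cpu = model_next_and_wrap(h->cpumask, online, cpu);
	if (cpu < MODEL_NR_CPUS) {
		h->next_cpu_batch = MODEL_WORK_BATCH;
		h->next_cpu = cpu;
		return (int)cpu;
	}

	/* No other online CPU in the mask: retry selection on the next call. */
	h->next_cpu_batch = 1;
	return MODEL_UNBOUND;
}
```

In this model, starting from the last CPU (next_cpu == MODEL_NR_CPUS - 1)
wraps to the lowest mapped online CPU without any out-of-range index being
formed, which is the case the commit message singles out.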
On Thu, Oct 06, 2022 at 01:21:12PM +0100, Valentin Schneider wrote:
> blk_mq_hctx_next_cpu() implements a form of cpumask_next_and_wrap() using
> cpumask_next_and() and blk_mq_first_mapped_cpu():
>
> [ 5.398453] WARNING: CPU: 3 PID: 162 at include/linux/cpumask.h:110 __blk_mq_delay_run_hw_queue+0x16b/0x180
> [ 5.399317] Modules linked in:
> [ 5.399646] CPU: 3 PID: 162 Comm: ssh-keygen Tainted: G N 6.0.0-rc4-00004-g93003cb24006 #55
> [ 5.400135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
> [ 5.405430] Call Trace:
> [ 5.406152] <TASK>
> [ 5.406452] blk_mq_sched_insert_requests+0x67/0x150
> [ 5.406759] blk_mq_flush_plug_list+0xd0/0x280
> [ 5.406987] ? bit_wait+0x60/0x60
> [ 5.407317] __blk_flush_plug+0xdb/0x120
> [ 5.407561] ? bit_wait+0x60/0x60
> [ 5.407765] io_schedule_prepare+0x38/0x40
> [...]
>
> This triggers a warning when next_cpu == nr_cpu_ids - 1, so rewrite it
> using cpumask_next_and_wrap() directly. The backwards-going goto can be
> removed, as the cpumask_next*() operation already ANDs hctx->cpumask and
> cpu_online_mask, which implies checking for an online CPU.
>
> No change in behaviour intended.
>
> Suggested-by: Yury Norov <yury.norov@gmail.com>
> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
> ---
> block/blk-mq.c | 39 +++++++++++++--------------------------
> 1 file changed, 13 insertions(+), 26 deletions(-)
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index c96c8c4f751b..1520794dd9ea 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2038,42 +2038,29 @@ static inline int blk_mq_first_mapped_cpu(struct blk_mq_hw_ctx *hctx)
>   */
>  static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>  {
> -	bool tried = false;
>  	int next_cpu = hctx->next_cpu;
>
>  	if (hctx->queue->nr_hw_queues == 1)
>  		return WORK_CPU_UNBOUND;
>
> -	if (--hctx->next_cpu_batch <= 0) {
> -select_cpu:
> -		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
> -				cpu_online_mask);
> -		if (next_cpu >= nr_cpu_ids)
> -			next_cpu = blk_mq_first_mapped_cpu(hctx);
> +	if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu))
> +		return next_cpu;
> +
> +	next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask, next_cpu, false);

Last two parameters are simply useless. In fact, in many cases they
are useless for cpumask_next_wrap(). I'm working on simplifying the
cpumask_next_wrap() so that it would take just 2 parameters - pivot
point and cpumask.

Regarding 'next' version - we already have find_next_and_bit_wrap(),
and I think cpumask_next_and_wrap() should use it.

For the context: those last parameters are needed to exclude part of
cpumask from traversing, and to implement for-loop. Now that we have
for_each_cpu_wrap() based on for_each_set_bit_wrap(), it's possible
to remove them. I'm working on it.

> +	if (next_cpu < nr_cpu_ids) {
>  		hctx->next_cpu_batch = BLK_MQ_CPU_WORK_BATCH;
> +		hctx->next_cpu = next_cpu;
> +		return next_cpu;
>  	}
>
>  	/*
> -	 * Do unbound schedule if we can't find a online CPU for this hctx,
> -	 * and it should only happen in the path of handling CPU DEAD.
> +	 * No other online CPU in hctx->cpumask.
> +	 *
> +	 * Make sure to re-select CPU next time once after CPUs
> +	 * in hctx->cpumask become online again.
>  	 */
> -	if (!cpu_online(next_cpu)) {
> -		if (!tried) {
> -			tried = true;
> -			goto select_cpu;
> -		}
> -
> -		/*
> -		 * Make sure to re-select CPU next time once after CPUs
> -		 * in hctx->cpumask become online again.
> -		 */
> -		hctx->next_cpu = next_cpu;
> -		hctx->next_cpu_batch = 1;
> -		return WORK_CPU_UNBOUND;
> -	}
> -
> -	hctx->next_cpu = next_cpu;
> -	return next_cpu;
> +	hctx->next_cpu_batch = 1;
> +	return WORK_CPU_UNBOUND;
>  }
>
>  /**
> --
> 2.31.1
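
The two-parameter shape discussed in this reply (a pivot point plus the
masks) can be illustrated with a small user-space sketch. It is only a
model under assumptions: a single 64-bit word stands in for a bitmap,
MODEL_NBITS and both function names are hypothetical, and the search is
assumed to start at the pivot itself, scan to the end, then wrap and scan
from bit 0 up to the pivot. It is not the kernel's find_next_and_bit_wrap()
implementation, whose exact signature is not shown in this thread.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NBITS 8u /* hypothetical bitmap size (must be <= 64) */

/* First bit set in (a & b) within [from, to), or MODEL_NBITS if none. */
static unsigned int model_find_and(uint64_t a, uint64_t b,
				   unsigned int from, unsigned int to)
{
	uint64_t m = a & b;

	assert(from <= to && to <= MODEL_NBITS);
	for (unsigned int i = from; i < to; i++)
		if (m & (UINT64_C(1) << i))
			return i;
	return MODEL_NBITS;
}

/*
 * Wrap-around search that needs only a pivot and the masks:
 * scan [pivot, MODEL_NBITS) first, then [0, pivot).
 * Returns MODEL_NBITS when (a & b) has no bit set.
 */
static unsigned int model_find_and_wrap(uint64_t a, uint64_t b,
					unsigned int pivot)
{
	unsigned int bit = model_find_and(a, b, pivot, MODEL_NBITS);

	if (bit < MODEL_NBITS)
		return bit;

	/* Nothing at or after the pivot: wrap and scan the part before it. */
	return model_find_and(a, b, 0, pivot);
}
```

Because the second scan stops at the pivot, every bit is visited at most
once, so no separate "start" or "wrap" argument is needed to terminate the
search in this model.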
On 06/10/22 06:50, Yury Norov wrote:
> On Thu, Oct 06, 2022 at 01:21:12PM +0100, Valentin Schneider wrote:
>> blk_mq_hctx_next_cpu() implements a form of cpumask_next_and_wrap() using
>> cpumask_next_and() and blk_mq_first_mapped_cpu():
>>
>> [ 5.398453] WARNING: CPU: 3 PID: 162 at include/linux/cpumask.h:110 __blk_mq_delay_run_hw_queue+0x16b/0x180
>> [ 5.399317] Modules linked in:
>> [ 5.399646] CPU: 3 PID: 162 Comm: ssh-keygen Tainted: G N 6.0.0-rc4-00004-g93003cb24006 #55
>> [ 5.400135] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.14.0-0-g155821a1990b-prebuilt.qemu.org 04/01/2014
>> [ 5.405430] Call Trace:
>> [ 5.406152] <TASK>
>> [ 5.406452] blk_mq_sched_insert_requests+0x67/0x150
>> [ 5.406759] blk_mq_flush_plug_list+0xd0/0x280
>> [ 5.406987] ? bit_wait+0x60/0x60
>> [ 5.407317] __blk_flush_plug+0xdb/0x120
>> [ 5.407561] ? bit_wait+0x60/0x60
>> [ 5.407765] io_schedule_prepare+0x38/0x40
>> [...]
>>
>> This triggers a warning when next_cpu == nr_cpu_ids - 1, so rewrite it
>> using cpumask_next_and_wrap() directly. The backwards-going goto can be
>> removed, as the cpumask_next*() operation already ANDs hctx->cpumask and
>> cpu_online_mask, which implies checking for an online CPU.
>>
>> No change in behaviour intended.
>>
>> Suggested-by: Yury Norov <yury.norov@gmail.com>
>> Signed-off-by: Valentin Schneider <vschneid@redhat.com>
>> ---
>> block/blk-mq.c | 39 +++++++++++++--------------------------
>> 1 file changed, 13 insertions(+), 26 deletions(-)
>>
>> diff --git a/block/blk-mq.c b/block/blk-mq.c
>> index c96c8c4f751b..1520794dd9ea 100644
>> --- a/block/blk-mq.c
>> +++ b/block/blk-mq.c
>> @@ -2038,42 +2038,29 @@ static inline int blk_mq_first_mapped_cpu(struct blk_mq_hw_ctx *hctx)
>>   */
>>  static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
>>  {
>> -	bool tried = false;
>>  	int next_cpu = hctx->next_cpu;
>>
>>  	if (hctx->queue->nr_hw_queues == 1)
>>  		return WORK_CPU_UNBOUND;
>>
>> -	if (--hctx->next_cpu_batch <= 0) {
>> -select_cpu:
>> -		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
>> -				cpu_online_mask);
>> -		if (next_cpu >= nr_cpu_ids)
>> -			next_cpu = blk_mq_first_mapped_cpu(hctx);
>> +	if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu))
>> +		return next_cpu;
>> +
>> +	next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask, next_cpu, false);
>
> Last two parameters are simply useless. In fact, in many cases they
> are useless for cpumask_next_wrap(). I'm working on simplifying the
> cpumask_next_wrap() so that it would take just 2 parameters - pivot
> point and cpumask.
>
> Regarding 'next' version - we already have find_next_and_bit_wrap(),
> and I think cpumask_next_and_wrap() should use it.
>
Oh, I had missed those, that makes more sense indeed.
> For the context: those last parameters are needed to exclude part of
> cpumask from traversing, and to implement for-loop. Now that we have
> for_each_cpu_wrap() based on for_each_set_bit_wrap(), it's possible
> to remove them. I'm working on it.
Sounds good.
> >>  static int blk_mq_hctx_next_cpu(struct blk_mq_hw_ctx *hctx)
> >>  {
> >> -	bool tried = false;
> >>  	int next_cpu = hctx->next_cpu;
> >>
> >>  	if (hctx->queue->nr_hw_queues == 1)
> >>  		return WORK_CPU_UNBOUND;
> >>
> >> -	if (--hctx->next_cpu_batch <= 0) {
> >> -select_cpu:
> >> -		next_cpu = cpumask_next_and(next_cpu, hctx->cpumask,
> >> -				cpu_online_mask);
> >> -		if (next_cpu >= nr_cpu_ids)
> >> -			next_cpu = blk_mq_first_mapped_cpu(hctx);
> >> +	if (--hctx->next_cpu_batch > 0 && cpu_online(next_cpu))
> >> +		return next_cpu;
> >> +
> >> +	next_cpu = cpumask_next_and_wrap(next_cpu, hctx->cpumask, cpu_online_mask, next_cpu, false);
> >
> > Last two parameters are simply useless. In fact, in many cases they
> > are useless for cpumask_next_wrap(). I'm working on simplifying the
> > cpumask_next_wrap() so that it would take just 2 parameters - pivot
> > point and cpumask.
> >
> > Regarding 'next' version - we already have find_next_and_bit_wrap(),
> > and I think cpumask_next_and_wrap() should use it.
> >
>
> Oh, I had missed those, that makes more sense indeed.
>
> > For the context: those last parameters are needed to exclude part of
> > cpumask from traversing, and to implement for-loop. Now that we have
> > for_each_cpu_wrap() based on for_each_set_bit_wrap(), it's possible
> > to remove them. I'm working on it.
>
> Sounds good.

Hi Valentin, all,

I'd like to share my work-in-progress for cpumask_next_wrap(). It's
now in testing (at least, it boots on an x86_64 VM).

I'd like to collect early comments on the rework. If you like it, please
align your 'and' version with this.

https://github.com/norov/linux/commits/__bitmap-for-next

Thanks,
Yury