This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
but woken_list_node still being hashed. This would happen when
bfq_init_rq() expects a brand new allocated queue to be returned from
bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
without resetting woken_list_node. Since we can always return oom_bfqq
when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
We must either reset woken_list_node, or avoid setting woken_list at all
for oom_bfqq - opt to do the former.
Crashes would have a stacktrace like:
[160595.656560] bfq_add_bfqq_busy+0x110/0x1ec
[160595.661142] bfq_add_request+0x6bc/0x980
[160595.666602] bfq_insert_request+0x8ec/0x1240
[160595.671762] bfq_insert_requests+0x58/0x9c
[160595.676420] blk_mq_sched_insert_request+0x11c/0x198
[160595.682107] blk_mq_submit_bio+0x270/0x62c
[160595.686759] __submit_bio_noacct_mq+0xec/0x178
[160595.691926] submit_bio+0x120/0x184
[160595.695990] ext4_mpage_readpages+0x77c/0x7c8
[160595.701026] ext4_readpage+0x60/0xb0
[160595.705158] filemap_read_page+0x54/0x114
[160595.711961] filemap_fault+0x228/0x5f4
[160595.716272] do_read_fault+0xe0/0x1f0
[160595.720487] do_fault+0x40/0x1c8
Tested by injecting random failures into bfq_get_queue, crashes go away
completely.
Fixes: 8ef3fc3a043c ("block, bfq: make shared queues inherit wakers")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
RFC mainly because it's not clear to me what the best policy is here - but
the patch is tested and fixes a real crash we started seeing in 5.15.
This is following up my ramble over at
https://lore.kernel.org/lkml/CACGdZYLMnfcqwbAXDx+x9vUOMn2cz55oc+8WySBS3J2Xd_q7Lg@mail.gmail.com/
block/bfq-iosched.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7ea427817f7f..5d2861119d20 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6793,7 +6793,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
* reset. So insert new_bfqq into the
* woken_list of the waker. See
* bfq_check_waker for details.
+ *
+ * Also, if we got oom_bfqq, we must check if
+ * it's already in a woken_list
*/
+ if (unlikely(!hlist_unhashed(&bfqq->woken_list_node)))
+ hlist_del_init(&bfqq->woken_list_node);
if (bfqq->waker_bfqq)
hlist_add_head(&bfqq->woken_list_node,
&bfqq->waker_bfqq->woken_list);
--
2.38.1.273.g43a17bfeac-goog
Hi,
On 2022/11/03 9:39, Khazhismel Kumykov wrote:
> This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
> but woken_list_node still being hashed. This would happen when
> bfq_init_rq() expects a brand new allocated queue to be returned from
From what I see, bfqq->waker_bfqq is updated in bfq_init_rq() only if
'new_queue' is false, but if 'new_queue' is false, the returned 'bfqq'
from bfq_get_bfqq_handle_split() will never be oom_bfqq, so I'm confused
here...
> bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
> without resetting woken_list_node. Since we can always return oom_bfqq
> when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
> We must either reset woken_list_node, or avoid setting woken_list at all
> for oom_bfqq - opt to do the former.
Once oom_bfqq is used, I think the io is treated as issued from root
group. Hence I don't think it's necessary to set woken_list or
waker_bfqq for oom_bfqq.
Thanks,
Kuai
>
> Crashes would have a stacktrace like:
> [160595.656560] bfq_add_bfqq_busy+0x110/0x1ec
> [160595.661142] bfq_add_request+0x6bc/0x980
> [160595.666602] bfq_insert_request+0x8ec/0x1240
> [160595.671762] bfq_insert_requests+0x58/0x9c
> [160595.676420] blk_mq_sched_insert_request+0x11c/0x198
> [160595.682107] blk_mq_submit_bio+0x270/0x62c
> [160595.686759] __submit_bio_noacct_mq+0xec/0x178
> [160595.691926] submit_bio+0x120/0x184
> [160595.695990] ext4_mpage_readpages+0x77c/0x7c8
> [160595.701026] ext4_readpage+0x60/0xb0
> [160595.705158] filemap_read_page+0x54/0x114
> [160595.711961] filemap_fault+0x228/0x5f4
> [160595.716272] do_read_fault+0xe0/0x1f0
> [160595.720487] do_fault+0x40/0x1c8
>
> Tested by injecting random failures into bfq_get_queue, crashes go away
> completely.
>
> Fixes: 8ef3fc3a043c ("block, bfq: make shared queues inherit wakers")
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
> ---
> RFC mainly because it's not clear to me the best policy here - but the
> patch is tested and fixes a real crash we started seeing in 5.15
>
> This is following up my ramble over at
> https://lore.kernel.org/lkml/CACGdZYLMnfcqwbAXDx+x9vUOMn2cz55oc+8WySBS3J2Xd_q7Lg@mail.gmail.com/
>
> block/bfq-iosched.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 7ea427817f7f..5d2861119d20 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -6793,7 +6793,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
> * reset. So insert new_bfqq into the
> * woken_list of the waker. See
> * bfq_check_waker for details.
> + *
> + * Also, if we got oom_bfqq, we must check if
> + * it's already in a woken_list
> */
> + if (unlikely(!hlist_unhashed(&bfqq->woken_list_node)))
> + hlist_del_init(&bfqq->woken_list_node);
> if (bfqq->waker_bfqq)
> hlist_add_head(&bfqq->woken_list_node,
> &bfqq->waker_bfqq->woken_list);
>
On Wed, Nov 2, 2022 at 7:56 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>
> Hi,
>
> On 2022/11/03 9:39, Khazhismel Kumykov wrote:
> > This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
> > but woken_list_node still being hashed. This would happen when
> > bfq_init_rq() expects a brand new allocated queue to be returned from
>
> From what I see, bfqq->waker_bfqq is updated in bfq_init_rq() only if
> 'new_queue' is false, but if 'new_queue' is false, the returned 'bfqq'
> from bfq_get_bfqq_handle_split() will never be oom_bfqq, so I'm confused
> here...

There's two calls for bfq_get_bfqq_handle_split in this function - the
second one is after the check you mentioned, and is the problematic
one.

>
> > bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
> > without resetting woken_list_node. Since we can always return oom_bfqq
> > when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
> > We must either reset woken_list_node, or avoid setting woken_list at all
> > for oom_bfqq - opt to do the former.
>
> Once oom_bfqq is used, I think the io is treated as issued from root
> group. Hence I don't think it's necessary to set woken_list or
> waker_bfqq for oom_bfqq.

Ack, I was wondering what's right here since, evidently, *someone* had
already set oom_bfqq->waker_bfqq to *something* (although... maybe it
was an earlier init_rq). But maybe it's better to do nothing if we
*know* it's oom_bfqq.

Is it a correct interpretation here that setting waker_bfqq won't
accomplish anything anyways on oom_bfqq, so better off not?

>
> Thanks,
> Kuai
Hi,

On 2022/11/03 11:05, Khazhy Kumykov wrote:
> On Wed, Nov 2, 2022 at 7:56 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
>>
>> Hi,
>>
>> On 2022/11/03 9:39, Khazhismel Kumykov wrote:
>>> This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
>>> but woken_list_node still being hashed. This would happen when
>>> bfq_init_rq() expects a brand new allocated queue to be returned from
>>
>> From what I see, bfqq->waker_bfqq is updated in bfq_init_rq() only if
>> 'new_queue' is false, but if 'new_queue' is false, the returned 'bfqq'
>> from bfq_get_bfqq_handle_split() will never be oom_bfqq, so I'm confused
>> here...
> There's two calls for bfq_get_bfqq_handle_split in this function - the
> second one is after the check you mentioned, and is the problematic
> one.

Yes, thanks for the explanation. Now I understand how the problem
triggers.

>>
>>> bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
>>> without resetting woken_list_node. Since we can always return oom_bfqq
>>> when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
>>> We must either reset woken_list_node, or avoid setting woken_list at all
>>> for oom_bfqq - opt to do the former.
>>
>> Once oom_bfqq is used, I think the io is treated as issued from root
>> group. Hence I don't think it's necessary to set woken_list or
>> waker_bfqq for oom_bfqq.
> Ack, I was wondering what's right here since, evidently, *someone* had
> already set oom_bfqq->waker_bfqq to *something* (although... maybe it
> was an earlier init_rq). But maybe it's better to do nothing if we
> *know* it's oom_bfqq.

I need to have a check how oom_bfqq get involved with waker_bfqq, and
then see if it's reasonable.

Probably Jan and Paolo will have better view on this.

Thanks,
Kuai

>
> Is it a correct interpretation here that setting waker_bfqq won't
> accomplish anything anyways on oom_bfqq, so better off not?
On Thu 03-11-22 11:51:15, Yu Kuai wrote:
> Hi,
>
> On 2022/11/03 11:05, Khazhy Kumykov wrote:
> > On Wed, Nov 2, 2022 at 7:56 PM Yu Kuai <yukuai1@huaweicloud.com> wrote:
> > >
> > > Hi,
> > >
> > > On 2022/11/03 9:39, Khazhismel Kumykov wrote:
> > > > This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
> > > > but woken_list_node still being hashed. This would happen when
> > > > bfq_init_rq() expects a brand new allocated queue to be returned from
> > >
> > > From what I see, bfqq->waker_bfqq is updated in bfq_init_rq() only if
> > > 'new_queue' is false, but if 'new_queue' is false, the returned 'bfqq'
> > > from bfq_get_bfqq_handle_split() will never be oom_bfqq, so I'm confused
> > > here...
> > There's two calls for bfq_get_bfqq_handle_split in this function - the
> > second one is after the check you mentioned, and is the problematic
> > one.
> Yes, thanks for the explanation. Now I understand how the problem
> triggers.
>
> > >
> > > > bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
> > > > without resetting woken_list_node. Since we can always return oom_bfqq
> > > > when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
> > > > We must either reset woken_list_node, or avoid setting woken_list at all
> > > > for oom_bfqq - opt to do the former.
> > >
> > > Once oom_bfqq is used, I think the io is treated as issued from root
> > > group. Hence I don't think it's necessary to set woken_list or
> > > waker_bfqq for oom_bfqq.
> > Ack, I was wondering what's right here since, evidently, *someone* had
> > already set oom_bfqq->waker_bfqq to *something* (although... maybe it
> > was an earlier init_rq). But maybe it's better to do nothing if we
> > *know* it's oom_bfqq.
>
> I need to have a check how oom_bfqq get involved with waker_bfqq, and
> then see if it's reasonable.
>
> Probably Jan and Paolo will have better view on this.

Thanks for the CC Kuai and thanks to Khazhy for spotting the bug. The
oom_bfqq is just a fallback bfqq and as such it should be exempted from
all special handling like waker detection etc. All this stuff is just for
optimizing performance and when we are OOM, we have far larger troubles
than to optimize performance.

So how I think we should really fix this is that we exempt oom_bfqq from
waker detection in bfq_check_waker() by adding:

	bfqq == bfqd->oom_bfqq ||
	bfqd->last_completed_rq_bfqq == bfqd->oom_bfqq)

to the initial check and then also if bfq_get_bfqq_handle_split() returns
oom_bfqq we should just skip carrying over the waker information.

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
On Thu, Nov 3, 2022 at 1:47 AM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 03-11-22 11:51:15, Yu Kuai wrote:
> > [...]
> > Probably Jan and Paolo will have better view on this.
>
> Thanks for the CC Kuai and thanks to Khazhy for spotting the bug. The
> oom_bfqq is just a fallback bfqq and as such it should be exempted from
> all special handling like waker detection etc. All this stuff is just for
> optimizing performance and when we are OOM, we have far larger troubles
> than to optimize performance.
>
> So how I think we should really fix this is that we exempt oom_bfqq from
> waker detection in bfq_check_waker() by adding:
>
> 	bfqq == bfqd->oom_bfqq ||
> 	bfqd->last_completed_rq_bfqq == bfqd->oom_bfqq)
>
> to the initial check and then also if bfq_get_bfqq_handle_split() returns
> oom_bfqq we should just skip carrying over the waker information.

Thanks for the tip! I'll send a followup, including your suggestions.

I do have some other questions in this area, if you could help me
understand. Looking again at bfq_init_rq, inside of the !new_queue
section - we call bfq_split_bfqq() to "split" our bfqq, then in the
next line bfq_get_bfqq_handle_split inspects bic_to_bfqq(bic, is_sync),
and if it's NULL, allocates a new queue. However, if we have an async
rq, this call will return the pre-existing async bfqq, as the call to
bfq_split_bfqq() only clears the sync bfqq, ever. The best
understanding I have now is: "bic->bfqq[async] is never NULL (and we
don't merge async queues) so we'll never reach this !new_queue section
anyways if it's async". Is that accurate?

Thanks,
Khazhy

>
> 								Honza
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
On Fri 04-11-22 14:25:32, Khazhy Kumykov wrote:
> On Thu, Nov 3, 2022 at 1:47 AM Jan Kara <jack@suse.cz> wrote:
> > [...]
> > So how I think we should really fix this is that we exempt oom_bfqq from
> > waker detection in bfq_check_waker() by adding:
> >
> > 	bfqq == bfqd->oom_bfqq ||
> > 	bfqd->last_completed_rq_bfqq == bfqd->oom_bfqq)
> >
> > to the initial check and then also if bfq_get_bfqq_handle_split() returns
> > oom_bfqq we should just skip carrying over the waker information.
> Thanks for the tip! I'll send a followup, including your suggestions.
>
> I do have some other questions in this area, if you could help me
> understand. Looking again at bfq_init_rq, inside of the !new_queue
> section - we call bfq_split_bfqq() to "split" our bfqq, then in the
> next line bfq_get_bfqq_handle_split inspects bic_to_bfqq(bic, is_sync),
> and if it's NULL, allocates a new queue. However, if we have an async
> rq, this call will return the pre-existing async bfqq, as the call to
> bfq_split_bfqq() only clears the sync bfqq, ever. The best
> understanding I have now is: "bic->bfqq[async] is never NULL (and we
> don't merge async queues) so we'll never reach this !new_queue section
> anyways if it's async". Is that accurate?

So you are right that async queues are never merged or split. In fact, if
you have a look at bfq_get_queue(), you'll notice that async queue is
common for all processes with the same ioprio & blkcg. So all these games
with splitting, merging, waker detection etc. impact only sync queues.

								Honza
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
but woken_list_node still being hashed. This would happen when
bfq_init_rq() expects a brand new allocated queue to be returned from
bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
without resetting woken_list_node. Since we can always return oom_bfqq
when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
Avoid setting waker_bfqq for oom_bfqq entirely, as it's not useful.
Crashes would have a stacktrace like:
[160595.656560] bfq_add_bfqq_busy+0x110/0x1ec
[160595.661142] bfq_add_request+0x6bc/0x980
[160595.666602] bfq_insert_request+0x8ec/0x1240
[160595.671762] bfq_insert_requests+0x58/0x9c
[160595.676420] blk_mq_sched_insert_request+0x11c/0x198
[160595.682107] blk_mq_submit_bio+0x270/0x62c
[160595.686759] __submit_bio_noacct_mq+0xec/0x178
[160595.691926] submit_bio+0x120/0x184
[160595.695990] ext4_mpage_readpages+0x77c/0x7c8
[160595.701026] ext4_readpage+0x60/0xb0
[160595.705158] filemap_read_page+0x54/0x114
[160595.711961] filemap_fault+0x228/0x5f4
[160595.716272] do_read_fault+0xe0/0x1f0
[160595.720487] do_fault+0x40/0x1c8
Tested by injecting random failures into bfq_get_queue, crashes go away
completely.
Fixes: 8ef3fc3a043c ("block, bfq: make shared queues inherit wakers")
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
block/bfq-iosched.c | 9 +++++++--
1 file changed, 7 insertions(+), 2 deletions(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index 7ea427817f7f..ca04ec868c40 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -6784,6 +6784,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
bfqq = bfq_get_bfqq_handle_split(bfqd, bic, bio,
true, is_sync,
NULL);
+ if (unlikely(bfqq == &bfqd->oom_bfqq))
+ bfqq_already_existing = true;
+ } else
+ bfqq_already_existing = true;
+
+ if (!bfqq_already_existing) {
bfqq->waker_bfqq = old_bfqq->waker_bfqq;
bfqq->tentative_waker_bfqq = NULL;
@@ -6797,8 +6803,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
if (bfqq->waker_bfqq)
hlist_add_head(&bfqq->woken_list_node,
&bfqq->waker_bfqq->woken_list);
- } else
- bfqq_already_existing = true;
+ }
}
}
--
2.38.1.431.g37b22c650d-goog
On Tue, 8 Nov 2022 10:10:29 -0800, Khazhismel Kumykov wrote:
> This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
> but woken_list_node still being hashed. This would happen when
> bfq_init_rq() expects a brand new allocated queue to be returned from
> bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
> without resetting woken_list_node. Since we can always return oom_bfqq
> when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
>
> [...]
Applied, thanks!
[1/2] bfq: fix waker_bfqq inconsistency crash
commit: a1795c2ccb1e4c49220d2a0d381540024d71647c
[2/2] bfq: ignore oom_bfqq in bfq_check_waker
commit: 99771d73ff4539f2337b84917f4792abf0d8931b
Best regards,
--
Jens Axboe
On Tue 08-11-22 10:10:29, Khazhismel Kumykov wrote:
> This fixes crashes in bfq_add_bfqq_busy due to waker_bfqq being NULL,
> but woken_list_node still being hashed. This would happen when
> bfq_init_rq() expects a brand new allocated queue to be returned from
> bfq_get_bfqq_handle_split() and unconditionally updates waker_bfqq
> without resetting woken_list_node. Since we can always return oom_bfqq
> when attempting to allocate, we cannot assume waker_bfqq starts as NULL.
>
> Avoid setting waker_bfqq for oom_bfqq entirely, as it's not useful.
>
> Crashes would have a stacktrace like:
> [160595.656560] bfq_add_bfqq_busy+0x110/0x1ec
> [160595.661142] bfq_add_request+0x6bc/0x980
> [160595.666602] bfq_insert_request+0x8ec/0x1240
> [160595.671762] bfq_insert_requests+0x58/0x9c
> [160595.676420] blk_mq_sched_insert_request+0x11c/0x198
> [160595.682107] blk_mq_submit_bio+0x270/0x62c
> [160595.686759] __submit_bio_noacct_mq+0xec/0x178
> [160595.691926] submit_bio+0x120/0x184
> [160595.695990] ext4_mpage_readpages+0x77c/0x7c8
> [160595.701026] ext4_readpage+0x60/0xb0
> [160595.705158] filemap_read_page+0x54/0x114
> [160595.711961] filemap_fault+0x228/0x5f4
> [160595.716272] do_read_fault+0xe0/0x1f0
> [160595.720487] do_fault+0x40/0x1c8
>
> Tested by injecting random failures into bfq_get_queue, crashes go away
> completely.
>
> Fixes: 8ef3fc3a043c ("block, bfq: make shared queues inherit wakers")
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
Looks good. Thanks! Feel free to add:
Reviewed-by: Jan Kara <jack@suse.cz>
Honza
> ---
> block/bfq-iosched.c | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index 7ea427817f7f..ca04ec868c40 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -6784,6 +6784,12 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
> bfqq = bfq_get_bfqq_handle_split(bfqd, bic, bio,
> true, is_sync,
> NULL);
> + if (unlikely(bfqq == &bfqd->oom_bfqq))
> + bfqq_already_existing = true;
> + } else
> + bfqq_already_existing = true;
> +
> + if (!bfqq_already_existing) {
> bfqq->waker_bfqq = old_bfqq->waker_bfqq;
> bfqq->tentative_waker_bfqq = NULL;
>
> @@ -6797,8 +6803,7 @@ static struct bfq_queue *bfq_init_rq(struct request *rq)
> if (bfqq->waker_bfqq)
> hlist_add_head(&bfqq->woken_list_node,
> &bfqq->waker_bfqq->woken_list);
> - } else
> - bfqq_already_existing = true;
> + }
> }
> }
>
> --
> 2.38.1.431.g37b22c650d-goog
>
--
Jan Kara <jack@suse.com>
SUSE Labs, CR
oom_bfqq is just a fallback bfqq, so shouldn't be used with waker
detection.
Suggested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Khazhismel Kumykov <khazhy@google.com>
---
block/bfq-iosched.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
index ca04ec868c40..267baf84870f 100644
--- a/block/bfq-iosched.c
+++ b/block/bfq-iosched.c
@@ -2135,7 +2135,9 @@ static void bfq_check_waker(struct bfq_data *bfqd, struct bfq_queue *bfqq,
if (!bfqd->last_completed_rq_bfqq ||
bfqd->last_completed_rq_bfqq == bfqq ||
bfq_bfqq_has_short_ttime(bfqq) ||
- now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC)
+ now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC ||
+ bfqd->last_completed_rq_bfqq == &bfqd->oom_bfqq ||
+ bfqq == &bfqd->oom_bfqq)
return;
/*
--
2.38.1.431.g37b22c650d-goog
On Tue 08-11-22 10:10:30, Khazhismel Kumykov wrote:
> oom_bfqq is just a fallback bfqq, so shouldn't be used with waker
> detection.
>
> Suggested-by: Jan Kara <jack@suse.cz>
> Signed-off-by: Khazhismel Kumykov <khazhy@google.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
> block/bfq-iosched.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/block/bfq-iosched.c b/block/bfq-iosched.c
> index ca04ec868c40..267baf84870f 100644
> --- a/block/bfq-iosched.c
> +++ b/block/bfq-iosched.c
> @@ -2135,7 +2135,9 @@ static void bfq_check_waker(struct bfq_data *bfqd, struct bfq_queue *bfqq,
> 	if (!bfqd->last_completed_rq_bfqq ||
> 	    bfqd->last_completed_rq_bfqq == bfqq ||
> 	    bfq_bfqq_has_short_ttime(bfqq) ||
> -	    now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC)
> +	    now_ns - bfqd->last_completion >= 4 * NSEC_PER_MSEC ||
> +	    bfqd->last_completed_rq_bfqq == &bfqd->oom_bfqq ||
> +	    bfqq == &bfqd->oom_bfqq)
> 		return;
>
> /*
> --
> 2.38.1.431.g37b22c650d-goog
--
Jan Kara <jack@suse.com>
SUSE Labs, CR