fs/bcachefs/io_misc.c | 2 -- 1 file changed, 2 deletions(-)
bch2_extent_fallocate should not return -BCH_ERR_transaction_restart_nested when there are no buckets available.
Returning -BCH_ERR_transaction_restart_nested causes the sector allocation to retry even if there's no space available.
Reported-by: syzbot+1e515cab343dbe5aa38a@syzkaller.appspotmail.com
Signed-off-by: Camila Alvarez <cam.alvarez.i@gmail.com>
---
fs/bcachefs/io_misc.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/fs/bcachefs/io_misc.c b/fs/bcachefs/io_misc.c
index 82f9170dab3f..b5b92ef030df 100644
--- a/fs/bcachefs/io_misc.c
+++ b/fs/bcachefs/io_misc.c
@@ -90,8 +90,6 @@ int bch2_extent_fallocate(struct btree_trans *trans,
opts.data_replicas,
opts.data_replicas,
BCH_WATERMARK_normal, 0, &cl, &wp);
- if (bch2_err_matches(ret, BCH_ERR_operation_blocked))
- ret = -BCH_ERR_transaction_restart_nested;
if (ret)
goto err;
--
2.34.1
On Tue, Jul 23, 2024 at 09:08:00PM GMT, Camila Alvarez wrote: > bch2_extent_fallocate should not return -BCH_ERR_transaction_restart_nested when there are no buckets available. > Returning -BCH_ERR_transaction_restart_nested causes the sector allocation to retry even if there's no space available. It seems like you're saying this fixes a livelock, but I don't see that: if we return operation_blocked, the closure_sync we do before we return will still wait. More explanation? > > Reported-by: syzbot+1e515cab343dbe5aa38a@syzkaller.appspotmail.com > Signed-off-by: Camila Alvarez <cam.alvarez.i@gmail.com> > --- > fs/bcachefs/io_misc.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/fs/bcachefs/io_misc.c b/fs/bcachefs/io_misc.c > index 82f9170dab3f..b5b92ef030df 100644 > --- a/fs/bcachefs/io_misc.c > +++ b/fs/bcachefs/io_misc.c > @@ -90,8 +90,6 @@ int bch2_extent_fallocate(struct btree_trans *trans, > opts.data_replicas, > opts.data_replicas, > BCH_WATERMARK_normal, 0, &cl, &wp); > - if (bch2_err_matches(ret, BCH_ERR_operation_blocked)) > - ret = -BCH_ERR_transaction_restart_nested; > if (ret) > goto err; > > -- > 2.34.1 >
On Fri, 2 Aug 2024, Kent Overstreet wrote: > On Tue, Jul 23, 2024 at 09:08:00PM GMT, Camila Alvarez wrote: >> bch2_extent_fallocate should not return -BCH_ERR_transaction_restart_nested when there are no buckets available. >> Returning -BCH_ERR_transaction_restart_nested causes the sector allocation to retry even if there's no space available. > > It seems like you're saying this fixes a livelock, but I don't see that: > if we return operation_blocked, the closure_sync we do before we return > will still wait. > > More explanation? > Yes, sorry for the delay. The closure_sync will still wait, but even after waiting there's no space left. I think the issue is that we're trying to allocate a bucket when none are available: closure_wait(&c->freelist_wait, cl); journal_write_done is called at some point: bch2_do_discard(c) closure_wake_up(&c->freelist_wait) ... That apparently doesn't free any buckets (should we add some sort of assertion to notify us when this happens?), so there is still no space left, so bch2_bucket_alloc_trans keeps returning -BCH_ERR_freelist_empty. With this bch2_alloc_sectors_start_trans return -BCH_err_bucket_alloc_blocked, which bch2_extent_fallocate turns into -BCH_ERR_transaction_restart_nested. Finally __bchfs_fallocate sees the error as BCH_ERR_transaction_restart, so it tries again. The issue is that the journal never discards any buckets so it just keeps retrying indefinitely. I've been thinking, and maybe we don't want to just keep __bchfs_fallocate from retrying, but would prefer to control how many times it does. Does all of this make any sense to you? >> >> Reported-by: syzbot+1e515cab343dbe5aa38a@syzkaller.appspotmail.com >> Signed-off-by: Camila Alvarez <cam.alvarez.i@gmail.com> >> --- >> fs/bcachefs/io_misc.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/fs/bcachefs/io_misc.c b/fs/bcachefs/io_misc.c >> index 82f9170dab3f..b5b92ef030df 100644 >> --- a/fs/bcachefs/io_misc.c >> +++ b/fs/bcachefs/io_misc.c >> @@ -90,8 +90,6 @@ int bch2_extent_fallocate(struct btree_trans *trans, >> opts.data_replicas, >> opts.data_replicas, >> BCH_WATERMARK_normal, 0, &cl, &wp); >> - if (bch2_err_matches(ret, BCH_ERR_operation_blocked)) >> - ret = -BCH_ERR_transaction_restart_nested; >> if (ret) >> goto err; >> >> -- >> 2.34.1 >> >
© 2016 - 2026 Red Hat, Inc.