fs/bcachefs/error.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)
In bch2_bkey_pick_read_device, we're in an RCU lock. So, we can't call
any potentially-sleeping functions. However, we call bch2_dev_rcu,
which calls bch2_fs_inconsistent in its error case. That then calls
bch2_prt_print on a non-atomic printbuf, as well as uses the blocking
variant of bch2_print_string_as_lines, both of which lead to calls to
potentially-sleeping functions, namely krealloc with GFP_KERNEL
and console_lock respectively.
Give a nonzero atomic to the printbuf, and use the nonblocking variant
of bch2_print_string_as_lines.
Reported-by: syzbot+c82cd2906e2f192410bb@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=c82cd2906e2f192410bb
Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
---
fs/bcachefs/error.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/fs/bcachefs/error.c b/fs/bcachefs/error.c
index d4dfd13a8076..6f0f2f12c004 100644
--- a/fs/bcachefs/error.c
+++ b/fs/bcachefs/error.c
@@ -60,6 +60,8 @@ static bool bch2_fs_trans_inconsistent(struct bch_fs *c, struct btree_trans *tra
{
struct printbuf buf = PRINTBUF;
+ buf.atomic++;
+
bch2_log_msg_start(c, &buf);
prt_vprintf(&buf, fmt, args);
@@ -68,7 +70,9 @@ static bool bch2_fs_trans_inconsistent(struct bch_fs *c, struct btree_trans *tra
if (trans)
bch2_trans_updates_to_text(&buf, trans);
bool ret = __bch2_inconsistent_error(c, &buf);
- bch2_print_string_as_lines(KERN_ERR, buf.buf);
+ bch2_print_string_as_lines_nonblocking(KERN_ERR, buf.buf);
+
+ buf.atomic--;
printbuf_exit(&buf);
return ret;
--
2.49.0
On Wed, Apr 02, 2025 at 09:40:40PM +0530, Bharadwaj Raju wrote:
> In bch2_bkey_pick_read_device, we're in an RCU lock. So, we can't call
> any potentially-sleeping functions. However, we call bch2_dev_rcu,
> which calls bch2_fs_inconsistent in its error case. That then calls
> bch2_prt_print on a non-atomic printbuf, as well as uses the blocking
> variant of bch2_print_string_as_lines, both of which lead to calls to
> potentially-sleeping functions, namely krealloc with GFP_KERNEL
> and console_lock respectively.
>
> Give a nonzero atomic to the printbuf, and use the nonblocking variant
> of bch2_print_string_as_lines.
Sorry, beat you to it :)
You also missed the one the syzbot report actually hit -
bch2_inconsistent_error().
commit fef0ac7dbdd3c2166462720a2c0c9b16ad0680a5
Author: Kent Overstreet <kent.overstreet@linux.dev>
Date: Wed Apr 2 11:02:12 2025 -0400
bcachefs: Fix scheduling while atomic
bch2_inconsistent(), bch2_fs_inconsistent() be called from interrupt
context, or with rcu_read_lock() held.
The one syzbot found is in
bch2_bkey_pick_read_device
bch2_dev_rcu
bch2_fs_inconsistent
We're starting to switch to lift the printbufs up to higher levels so we
can emit better log messages and print them all in one go (avoid
garbling), so that conversion will help with spotting these in the
future; when we declare a printbuf it must be flagged if we're in an
atomic context.
Reported-by: syzbot+c82cd2906e2f192410bb@syzkaller.appspotmail.com
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
diff --git a/fs/bcachefs/error.c b/fs/bcachefs/error.c
index d4dfd13a8076..b885bd92834c 100644
--- a/fs/bcachefs/error.c
+++ b/fs/bcachefs/error.c
@@ -45,6 +45,8 @@ bool __bch2_inconsistent_error(struct bch_fs *c, struct printbuf *out)
bool bch2_inconsistent_error(struct bch_fs *c)
{
struct printbuf buf = PRINTBUF;
+ buf.atomic++;
+
printbuf_indent_add_nextline(&buf, 2);
bool ret = __bch2_inconsistent_error(c, &buf);
@@ -59,6 +61,7 @@ static bool bch2_fs_trans_inconsistent(struct bch_fs *c, struct btree_trans *tra
const char *fmt, va_list args)
{
struct printbuf buf = PRINTBUF;
+ buf.atomic++;
bch2_log_msg_start(c, &buf);
>
> Reported-by: syzbot+c82cd2906e2f192410bb@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=c82cd2906e2f192410bb
> Signed-off-by: Bharadwaj Raju <bharadwaj.raju777@gmail.com>
> ---
> fs/bcachefs/error.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/fs/bcachefs/error.c b/fs/bcachefs/error.c
> index d4dfd13a8076..6f0f2f12c004 100644
> --- a/fs/bcachefs/error.c
> +++ b/fs/bcachefs/error.c
> @@ -60,6 +60,8 @@ static bool bch2_fs_trans_inconsistent(struct bch_fs *c, struct btree_trans *tra
> {
> struct printbuf buf = PRINTBUF;
>
> + buf.atomic++;
> +
> bch2_log_msg_start(c, &buf);
>
> prt_vprintf(&buf, fmt, args);
> @@ -68,7 +70,9 @@ static bool bch2_fs_trans_inconsistent(struct bch_fs *c, struct btree_trans *tra
> if (trans)
> bch2_trans_updates_to_text(&buf, trans);
> bool ret = __bch2_inconsistent_error(c, &buf);
> - bch2_print_string_as_lines(KERN_ERR, buf.buf);
> + bch2_print_string_as_lines_nonblocking(KERN_ERR, buf.buf);
> +
> + buf.atomic--;
>
> printbuf_exit(&buf);
> return ret;
> --
> 2.49.0
>
On Wed, Apr 2, 2025 at 9:47 PM Kent Overstreet <kent.overstreet@linux.dev> wrote: > > On Wed, Apr 02, 2025 at 09:40:40PM +0530, Bharadwaj Raju wrote: > > In bch2_bkey_pick_read_device, we're in an RCU lock. So, we can't call > > any potentially-sleeping functions. However, we call bch2_dev_rcu, > > which calls bch2_fs_inconsistent in its error case. That then calls > > bch2_prt_print on a non-atomic printbuf, as well as uses the blocking > > variant of bch2_print_string_as_lines, both of which lead to calls to > > potentially-sleeping functions, namely krealloc with GFP_KERNEL > > and console_lock respectively. > > > > Give a nonzero atomic to the printbuf, and use the nonblocking variant > > of bch2_print_string_as_lines. > > Sorry, beat you to it :) > > You also missed the one the syzbot report actually hit - > bch2_inconsistent_error(). Oops, thank you. If I'm not wrong, though, the bch2_print_string_as_lines still needs to be changed to bch2_print_string_as_lines_nonblocking? In my testing that also produces the same BUG warning. Should I make a patch for that?
On Wed, Apr 02, 2025 at 10:03:10PM +0530, Bharadwaj Raju wrote: > On Wed, Apr 2, 2025 at 9:47 PM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > > > On Wed, Apr 02, 2025 at 09:40:40PM +0530, Bharadwaj Raju wrote: > > > In bch2_bkey_pick_read_device, we're in an RCU lock. So, we can't call > > > any potentially-sleeping functions. However, we call bch2_dev_rcu, > > > which calls bch2_fs_inconsistent in its error case. That then calls > > > bch2_prt_print on a non-atomic printbuf, as well as uses the blocking > > > variant of bch2_print_string_as_lines, both of which lead to calls to > > > potentially-sleeping functions, namely krealloc with GFP_KERNEL > > > and console_lock respectively. > > > > > > Give a nonzero atomic to the printbuf, and use the nonblocking variant > > > of bch2_print_string_as_lines. > > > > Sorry, beat you to it :) > > > > You also missed the one the syzbot report actually hit - > > bch2_inconsistent_error(). > > Oops, thank you. > > If I'm not wrong, though, the bch2_print_string_as_lines > still needs to be changed to bch2_print_string_as_lines_nonblocking? > > In my testing that also produces the same BUG warning. > > Should I make a patch for that? Yeah, you're right - please do. If you're feeling particularly adventurous - print_string_as_lines() is a hack, I think we should be able to do something more robust by skipping printk (that's where the 1k limit comes from) and calling something lower level - that will require digging into the printk codepath and finding lower level we can call. I also just noticed that print_string_as_lines() needs to check for being passed a NULL pointer - in case the printbuf memory allocation fails. Want to get that one too?
On Wed, Apr 2, 2025 at 10:08 PM Kent Overstreet <kent.overstreet@linux.dev> wrote: > > On Wed, Apr 02, 2025 at 10:03:10PM +0530, Bharadwaj Raju wrote: > > On Wed, Apr 2, 2025 at 9:47 PM Kent Overstreet > > <kent.overstreet@linux.dev> wrote: > > If you're feeling particularly adventurous - print_string_as_lines() is > a hack, I think we should be able to do something more robust by > skipping printk (that's where the 1k limit comes from) and calling > something lower level - that will require digging into the printk > codepath and finding lower level we can call. I tried looking into the printk codepath, namely vprintk_emit -> vprintk_store. It doesn't seem like there's a convenient single lower-level entrypoint we could call which just avoids the 1k limit, rather there's a lot of internal code mixed with the truncation that we'd have to just copy if we want printk behavior. I don't think that's a reasonable option for us. > I also just noticed that print_string_as_lines() needs to check for > being passed a NULL pointer - in case the printbuf memory allocation > fails. Want to get that one too? From what I'm seeing __bch2_print_string_as_lines already checks for the lines pointer being NULL. The only unchecked pointer is prefix, which I don't think needs to be checked since it will be something constant from kern_levels.h, not something which could be NULL.
On Fri, Apr 04, 2025 at 12:22:45PM +0530, Bharadwaj Raju wrote: > On Wed, Apr 2, 2025 at 10:08 PM Kent Overstreet > <kent.overstreet@linux.dev> wrote: > > > > On Wed, Apr 02, 2025 at 10:03:10PM +0530, Bharadwaj Raju wrote: > > > On Wed, Apr 2, 2025 at 9:47 PM Kent Overstreet > > > <kent.overstreet@linux.dev> wrote: > > > > If you're feeling particularly adventurous - print_string_as_lines() is > > a hack, I think we should be able to do something more robust by > > skipping printk (that's where the 1k limit comes from) and calling > > something lower level - that will require digging into the printk > > codepath and finding lower level we can call. > > I tried looking into the printk codepath, namely vprintk_emit -> vprintk_store. > It doesn't seem like there's a convenient single lower-level > entrypoint we could > call which just avoids the 1k limit, rather there's a lot of internal > code mixed with > the truncation that we'd have to just copy if we want printk behavior. > I don't think that's a reasonable option for us. Yeah that does look a bit messy - and it doesn't exactly explain where the 1k limit comes in. > > I also just noticed that print_string_as_lines() needs to check for > > being passed a NULL pointer - in case the printbuf memory allocation > > fails. Want to get that one too? > > From what I'm seeing __bch2_print_string_as_lines already checks > for the lines pointer being NULL. The only unchecked pointer is prefix, > which I don't think needs to be checked since it will be something constant > from kern_levels.h, not something which could be NULL. Yep, I should've checked the code before I said that :)
© 2016 - 2026 Red Hat, Inc.