jbd2: gracefully abort on checkpointing state corruptions

[PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Milos Nikic 1 month ago

This patch replaces two assertions on internal state machine invariants in
checkpoint.c. Both sit inside functions that already return integer error
codes, so the failure can be reported to the caller instead of crashing.

- In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
and a graceful journal abort, returning -EUCLEAN.

- In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
an unexpected buffer_jwrite state. If the warning triggers, we
explicitly drop the just-taken get_bh() reference and call __flush_batch()
to safely clean up any previously queued buffers in the j_chkpt_bhs array,
preventing a memory leak before returning -EUCLEAN.
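
The pattern described in the two points above (warn once, abort the journal, return an error instead of asserting) can be modelled in userspace. This is only a rough sketch: struct mock_journal, warn_once(), mock_journal_abort() and mock_cleanup_tail() are hypothetical stand-ins invented for illustration, not kernel APIs, and the real WARN_ON_ONCE() keeps its "already warned" state per call site rather than in one shared flag.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* fallback for non-Linux hosts; Linux defines it in errno.h */
#endif

/* Hypothetical stand-in for journal_t; it only tracks the abort state. */
struct mock_journal {
	int abort_errno;	/* 0 while the journal is healthy */
};

/*
 * Rough userspace approximation of WARN_ON_ONCE(): print on the first
 * hit only, but always return the condition so it can drive an if ().
 * Simplification: one shared flag instead of one flag per call site.
 */
static bool warn_once(bool cond, const char *what)
{
	static bool warned;

	if (cond && !warned) {
		warned = true;
		fprintf(stderr, "WARNING: %s\n", what);
	}
	return cond;
}

/* Hypothetical model of a journal abort: remember the first error only. */
static void mock_journal_abort(struct mock_journal *j, int err)
{
	if (!j->abort_errno)
		j->abort_errno = err;
}

/*
 * Models the new check in jbd2_cleanup_journal_tail(): a tail block of 0
 * aborts the journal and returns an error instead of asserting.
 */
static int mock_cleanup_tail(struct mock_journal *j, unsigned long blocknr)
{
	if (warn_once(blocknr == 0, "log tail block is 0")) {
		mock_journal_abort(j, -EUCLEAN);
		return -EUCLEAN;
	}
	return 0;
}
```

Calling mock_cleanup_tail() with a non-zero block leaves the mock journal untouched and returns 0; calling it with 0 records -EUCLEAN in abort_errno and hands the same value back to the caller, which is the behaviour the patch aims for in place of a crash.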

Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
---
 fs/jbd2/checkpoint.c | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index de89c5bef607..cdfbfd27afae 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
 			 */
 			BUFFER_TRACE(bh, "queue");
 			get_bh(bh);
-			J_ASSERT_BH(bh, !buffer_jwrite(bh));
+			if (WARN_ON_ONCE(buffer_jwrite(bh))) {
+				put_bh(bh); /* drop the ref we just took */
+				spin_unlock(&journal->j_list_lock);
+				jbd2_journal_abort(journal, -EUCLEAN);
+
+				/* Clean up any previously batched buffers */
+				if (batch_count)
+					__flush_batch(journal, &batch_count);
+
+				return -EUCLEAN;
+			}
 			journal->j_chkpt_bhs[batch_count++] = bh;
 			transaction->t_chp_stats.cs_written++;
 			transaction->t_checkpoint_list = jh->b_cpnext;
@@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
 
 	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
 		return 1;
-	J_ASSERT(blocknr != 0);
+	if (WARN_ON_ONCE(blocknr == 0)) {
+		jbd2_journal_abort(journal, -EUCLEAN);
+		return -EUCLEAN;
+	}
 
 	/*
 	 * We need to make sure that any blocks that were recently written out
-- 
2.53.0

Re: [PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Jan Kara 3 weeks, 2 days ago

On Mon 09-03-26 16:08:38, Milos Nikic wrote:
> This patch targets two internal state machine invariants in checkpoint.c
> residing inside functions that natively return integer error codes.
> 
> - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
> corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
> and a graceful journal abort, returning -EUCLEAN.
> 
> - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
> an unexpected buffer_jwrite state. If the warning triggers, we
> explicitly drop the just-taken get_bh() reference and call __flush_batch()
> to safely clean up any previously queued buffers in the j_chkpt_bhs array,
> preventing a memory leak before returning -EUCLEAN.
> 
> Signed-off-by: Milos Nikic <nikic.milos@gmail.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/checkpoint.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index de89c5bef607..cdfbfd27afae 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
>  			 */
>  			BUFFER_TRACE(bh, "queue");
>  			get_bh(bh);
> -			J_ASSERT_BH(bh, !buffer_jwrite(bh));
> +			if (WARN_ON_ONCE(buffer_jwrite(bh))) {
> +				put_bh(bh); /* drop the ref we just took */
> +				spin_unlock(&journal->j_list_lock);
> +				jbd2_journal_abort(journal, -EUCLEAN);
> +
> +				/* Clean up any previously batched buffers */
> +				if (batch_count)
> +					__flush_batch(journal, &batch_count);
> +
> +				return -EUCLEAN;
> +			}
>  			journal->j_chkpt_bhs[batch_count++] = bh;
>  			transaction->t_chp_stats.cs_written++;
>  			transaction->t_checkpoint_list = jh->b_cpnext;
> @@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
>  
>  	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
>  		return 1;
> -	J_ASSERT(blocknr != 0);
> +	if (WARN_ON_ONCE(blocknr == 0)) {
> +		jbd2_journal_abort(journal, -EUCLEAN);
> +		return -EUCLEAN;
> +	}
>  
>  	/*
>  	 * We need to make sure that any blocks that were recently written out
> -- 
> 2.53.0
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR

Re: [PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Milos Nikic 3 weeks, 2 days ago

On Mon, Mar 16, 2026 at 10:34 AM Jan Kara <jack@suse.cz> wrote:
>
> On Mon 09-03-26 16:08:38, Milos Nikic wrote:
> > This patch targets two internal state machine invariants in checkpoint.c
> > residing inside functions that natively return integer error codes.
> >
> > - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
> > corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
> > and a graceful journal abort, returning -EUCLEAN.
> >
> > - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
> > an unexpected buffer_jwrite state. If the warning triggers, we
> > explicitly drop the just-taken get_bh() reference and call __flush_batch()
> > to safely clean up any previously queued buffers in the j_chkpt_bhs array,
> > preventing a memory leak before returning -EUCLEAN.
> >
> > Signed-off-by: Milos Nikic <nikic.milos@gmail.com>
>
> Looks good. Feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
>                                                                 Honza

Hi Jan,

Thank you for the review!
Just a quick heads-up: I recently sent a v2 of this patch to the list
to address some minor feedback from Baokun (specifically, changing
-EUCLEAN to -EFSCORRUPTED, and ensuring jbd2_journal_abort is called
after __flush_batch).
Does your Reviewed-by still apply to the v2? If so, I can spin up a
quick v3 just to formally collect your tag, or if you prefer, you can
just grab v2 from the list and append it there.

Thanks, Milos
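
The two v2 changes mentioned above can be illustrated with a small userspace model. The v2 diff itself is not part of this thread, so this is only a guess at its shape based on the description: every mock_* name is hypothetical, and the EFSCORRUPTED define assumes the kernel-internal convention of aliasing it to EUCLEAN.

```c
#include <errno.h>

#ifndef EUCLEAN
#define EUCLEAN 117	/* fallback for non-Linux hosts */
#endif
/* Assumption: mirrors a kernel-internal alias; not a userspace errno. */
#define EFSCORRUPTED EUCLEAN

/* Hypothetical stand-in for journal_t with just enough state to observe ordering. */
struct mock_journal {
	int abort_errno;		/* 0 while the journal is healthy */
	int flushed;			/* buffers released by mock_flush_batch() */
	int flushed_after_abort;	/* set if a flush ran on an aborted journal */
};

/* Hypothetical model of __flush_batch(): release all queued buffers. */
static void mock_flush_batch(struct mock_journal *j, int *batch_count)
{
	if (j->abort_errno)
		j->flushed_after_abort = 1;
	j->flushed += *batch_count;
	*batch_count = 0;
}

/* Hypothetical model of a journal abort: remember the first error only. */
static void mock_journal_abort(struct mock_journal *j, int err)
{
	if (!j->abort_errno)
		j->abort_errno = err;
}

/*
 * Error path with the suggested ordering: flush the queued batch first,
 * then abort, then return -EFSCORRUPTED. In the real function, put_bh()
 * and spin_unlock() would precede these steps.
 */
static int mock_checkpoint_error_path(struct mock_journal *j, int *batch_count)
{
	if (*batch_count)
		mock_flush_batch(j, batch_count);
	mock_journal_abort(j, -EFSCORRUPTED);
	return -EFSCORRUPTED;
}
```

With this ordering the batch is released while the mock journal is still marked healthy, so flushed_after_abort stays 0. Under the aliasing assumption above, the value returned to the caller is numerically the same as in v1, so the rename only changes how the code reads.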


>
> > ---
> >  fs/jbd2/checkpoint.c | 17 +++++++++++++++--
> >  1 file changed, 15 insertions(+), 2 deletions(-)
> >
> > diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> > index de89c5bef607..cdfbfd27afae 100644
> > --- a/fs/jbd2/checkpoint.c
> > +++ b/fs/jbd2/checkpoint.c
> > @@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
> >                        */
> >                       BUFFER_TRACE(bh, "queue");
> >                       get_bh(bh);
> > -                     J_ASSERT_BH(bh, !buffer_jwrite(bh));
> > +                     if (WARN_ON_ONCE(buffer_jwrite(bh))) {
> > +                             put_bh(bh); /* drop the ref we just took */
> > +                             spin_unlock(&journal->j_list_lock);
> > +                             jbd2_journal_abort(journal, -EUCLEAN);
> > +
> > +                             /* Clean up any previously batched buffers */
> > +                             if (batch_count)
> > +                                     __flush_batch(journal, &batch_count);
> > +
> > +                             return -EUCLEAN;
> > +                     }
> >                       journal->j_chkpt_bhs[batch_count++] = bh;
> >                       transaction->t_chp_stats.cs_written++;
> >                       transaction->t_checkpoint_list = jh->b_cpnext;
> > @@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
> >
> >       if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
> >               return 1;
> > -     J_ASSERT(blocknr != 0);
> > +     if (WARN_ON_ONCE(blocknr == 0)) {
> > +             jbd2_journal_abort(journal, -EUCLEAN);
> > +             return -EUCLEAN;
> > +     }
> >
> >       /*
> >        * We need to make sure that any blocks that were recently written out
> > --
> > 2.53.0
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR

Re: [PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Baokun Li 4 weeks, 1 day ago

On 3/10/26 7:08 AM, Milos Nikic wrote:
> This patch targets two internal state machine invariants in checkpoint.c
> residing inside functions that natively return integer error codes.
>
> - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
> corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
> and a graceful journal abort, returning -EUCLEAN.
>
> - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
> an unexpected buffer_jwrite state. If the warning triggers, we
> explicitly drop the just-taken get_bh() reference and call __flush_batch()
> to safely clean up any previously queued buffers in the j_chkpt_bhs array,
> preventing a memory leak before returning -EUCLEAN.
>
> Signed-off-by: Milos Nikic <nikic.milos@gmail.com>

Looks good to me, just two minor nits:

 * Replacing EUCLEAN with EFSCORRUPTED would make more sense.
 * Putting jbd2_journal_abort after __flush_batch reads more naturally.

Otherwise, feel free to add:

Reviewed-by: Baokun Li <libaokun@linux.alibaba.com>

> ---
>  fs/jbd2/checkpoint.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
>
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index de89c5bef607..cdfbfd27afae 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
>  			 */
>  			BUFFER_TRACE(bh, "queue");
>  			get_bh(bh);
> -			J_ASSERT_BH(bh, !buffer_jwrite(bh));
> +			if (WARN_ON_ONCE(buffer_jwrite(bh))) {
> +				put_bh(bh); /* drop the ref we just took */
> +				spin_unlock(&journal->j_list_lock);
> +				jbd2_journal_abort(journal, -EUCLEAN);
> +
> +				/* Clean up any previously batched buffers */
> +				if (batch_count)
> +					__flush_batch(journal, &batch_count);
> +
> +				return -EUCLEAN;
> +			}
>  			journal->j_chkpt_bhs[batch_count++] = bh;
>  			transaction->t_chp_stats.cs_written++;
>  			transaction->t_checkpoint_list = jh->b_cpnext;
> @@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
>  
>  	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
>  		return 1;
> -	J_ASSERT(blocknr != 0);
> +	if (WARN_ON_ONCE(blocknr == 0)) {
> +		jbd2_journal_abort(journal, -EUCLEAN);
> +		return -EUCLEAN;
> +	}
>  
>  	/*
>  	 * We need to make sure that any blocks that were recently written out

Re: [PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Zhang Yi 4 weeks, 1 day ago

On 3/10/2026 7:08 AM, Milos Nikic wrote:
> This patch targets two internal state machine invariants in checkpoint.c
> residing inside functions that natively return integer error codes.
> 
> - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
> corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
> and a graceful journal abort, returning -EUCLEAN.
> 
> - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
> an unexpected buffer_jwrite state. If the warning triggers, we
> explicitly drop the just-taken get_bh() reference and call __flush_batch()
> to safely clean up any previously queued buffers in the j_chkpt_bhs array,
> preventing a memory leak before returning -EUCLEAN.
> 
> Signed-off-by: Milos Nikic <nikic.milos@gmail.com>

Thank you for the patch. Looks good to me.

Reviewed-by: Zhang Yi <yi.zhang@huawei.com>

> ---
>  fs/jbd2/checkpoint.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index de89c5bef607..cdfbfd27afae 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
>  			 */
>  			BUFFER_TRACE(bh, "queue");
>  			get_bh(bh);
> -			J_ASSERT_BH(bh, !buffer_jwrite(bh));
> +			if (WARN_ON_ONCE(buffer_jwrite(bh))) {
> +				put_bh(bh); /* drop the ref we just took */
> +				spin_unlock(&journal->j_list_lock);
> +				jbd2_journal_abort(journal, -EUCLEAN);
> +
> +				/* Clean up any previously batched buffers */
> +				if (batch_count)
> +					__flush_batch(journal, &batch_count);
> +
> +				return -EUCLEAN;
> +			}
>  			journal->j_chkpt_bhs[batch_count++] = bh;
>  			transaction->t_chp_stats.cs_written++;
>  			transaction->t_checkpoint_list = jh->b_cpnext;
> @@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
>  
>  	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
>  		return 1;
> -	J_ASSERT(blocknr != 0);
> +	if (WARN_ON_ONCE(blocknr == 0)) {
> +		jbd2_journal_abort(journal, -EUCLEAN);
> +		return -EUCLEAN;
> +	}
>  
>  	/*
>  	 * We need to make sure that any blocks that were recently written out

Re: [PATCH] jbd2: gracefully abort on checkpointing state corruptions

Posted by Andreas Dilger 1 month ago

On Mar 9, 2026, at 17:08, Milos Nikic <nikic.milos@gmail.com> wrote:
> 
> This patch targets two internal state machine invariants in checkpoint.c
> residing inside functions that natively return integer error codes.
> 
> - In jbd2_cleanup_journal_tail(): A blocknr of 0 indicates a severely
> corrupted journal superblock. Replaced the J_ASSERT with a WARN_ON_ONCE
> and a graceful journal abort, returning -EUCLEAN.
> 
> - In jbd2_log_do_checkpoint(): Replaced the J_ASSERT_BH checking for
> an unexpected buffer_jwrite state. If the warning triggers, we
> explicitly drop the just-taken get_bh() reference and call __flush_batch()
> to safely clean up any previously queued buffers in the j_chkpt_bhs array,
> preventing a memory leak before returning -EUCLEAN.
> 
> Signed-off-by: Milos Nikic <nikic.milos@gmail.com>

Reviewed-by: Andreas Dilger <adilger@dilger.ca>

> ---
>  fs/jbd2/checkpoint.c | 17 +++++++++++++++--
>  1 file changed, 15 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index de89c5bef607..cdfbfd27afae 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -267,7 +267,17 @@ int jbd2_log_do_checkpoint(journal_t *journal)
>  			 */
>  			BUFFER_TRACE(bh, "queue");
>  			get_bh(bh);
> -			J_ASSERT_BH(bh, !buffer_jwrite(bh));
> +			if (WARN_ON_ONCE(buffer_jwrite(bh))) {
> +				put_bh(bh); /* drop the ref we just took */
> +				spin_unlock(&journal->j_list_lock);
> +				jbd2_journal_abort(journal, -EUCLEAN);
> +
> +				/* Clean up any previously batched buffers */
> +				if (batch_count)
> +					__flush_batch(journal, &batch_count);
> +
> +				return -EUCLEAN;
> +			}
>  			journal->j_chkpt_bhs[batch_count++] = bh;
>  			transaction->t_chp_stats.cs_written++;
>  			transaction->t_checkpoint_list = jh->b_cpnext;
> @@ -325,7 +335,10 @@ int jbd2_cleanup_journal_tail(journal_t *journal)
>  
>  	if (!jbd2_journal_get_log_tail(journal, &first_tid, &blocknr))
>  		return 1;
> -	J_ASSERT(blocknr != 0);
> +	if (WARN_ON_ONCE(blocknr == 0)) {
> +		jbd2_journal_abort(journal, -EUCLEAN);
> +		return -EUCLEAN;
> +	}
>  
>  	/*
>  	 * We need to make sure that any blocks that were recently written out
> -- 
> 2.53.0
> 