[v3] jbd2: audit and convert legacy J_ASSERT usage

[PATCH v3 0/2] jbd2: audit and convert legacy J_ASSERT usage

Posted by Milos Nikic 1 month, 1 week ago

Hello Jan and the ext4 team,

This patch series follows up on the previous discussion regarding
converting hard J_ASSERT panics into graceful journal aborts.

In v1, we addressed a specific panic on unlock. Per Jan's suggestion,
I have audited fs/jbd2/transaction.c for other low-hanging fruit
where state machine invariants are enforced by J_ASSERT inside
functions that natively support error returns.

Changes in v3:

    Patch 2: Added pr_err() statements inside the ambiguous WARN_ON_ONCE()
    blocks (where multiple conditions are checked via logical OR/AND) to
    explicitly dump the b_transaction, b_next_transaction, and
    j_committing_transaction pointers. This provides necessary context for
    debugging state machine corruptions from the dmesg stack trace.

Changes in v2:

    Patch 1: Unmodified from v1. Collected Reviewed-by tags.

    Patch 2: New patch resulting from the broader audit. Systematically
    replaces J_ASSERTs with WARN_ON_ONCE and graceful -EINVAL returns
    across 6 core transaction lifecycle functions. Careful attention was
    paid to ensuring spinlocks are safely dropped before triggering
    jbd2_journal_abort(), and no memory is leaked on the error paths.

Milos Nikic (2):
  jbd2: gracefully abort instead of panicking on unlocked buffer
  jbd2: gracefully abort on transaction state corruptions

 fs/jbd2/transaction.c | 115 +++++++++++++++++++++++++++++++++---------
 1 file changed, 91 insertions(+), 24 deletions(-)

-- 
2.53.0

Re: [PATCH v3 0/2] jbd2: audit and convert legacy J_ASSERT usage

Posted by yebin (H) 1 month, 1 week ago

The macro `J_ASSERT_JH` is a rather troublesome implementation. There
are numerous calls to `J_ASSERT_JH` within
`jbd2_journal_commit_transaction()`, and after compilation, these may
all jump to the same address for execution, making it difficult to
determine exactly where the assertion is being triggered. If there is a
functional issue in just a single file system, using `BUG_ON` to handle
it seems a bit too aggressive.
I wonder if you all have any good ideas or suggestions.

On 2026/3/3 8:55, Milos Nikic wrote:
> Hello Jan and the ext4 team,
>
> This patch series follows up on the previous discussion regarding
> converting hard J_ASSERT panics into graceful journal aborts.
>
> In v1, we addressed a specific panic on unlock. Per Jan's suggestion,
> I have audited fs/jbd2/transaction.c for other low-hanging fruit
> where state machine invariants are enforced by J_ASSERT inside
> functions that natively support error returns.
>
> Changes in v3:
>
>      Patch 2: Added pr_err() statements inside the ambiguous WARN_ON_ONCE()
>      blocks (where multiple conditions are checked via logical OR/AND) to
>      explicitly dump the b_transaction, b_next_transaction, and
>      j_committing_transaction pointers. This provides necessary context for
>      debugging state machine corruptions from the dmesg stack trace.
>
> Changes in v2:
>
>      Patch 1: Unmodified from v1. Collected Reviewed-by tags.
>
>      Patch 2: New patch resulting from the broader audit. Systematically
>      replaces J_ASSERTs with WARN_ON_ONCE and graceful -EINVAL returns
>      across 6 core transaction lifecycle functions. Careful attention was
>      paid to ensuring spinlocks are safely dropped before triggering
>      jbd2_journal_abort(), and no memory is leaked on the error paths.
>
> Milos Nikic (2):
>    jbd2: gracefully abort instead of panicking on unlocked buffer
>    jbd2: gracefully abort on transaction state corruptions
>
>   fs/jbd2/transaction.c | 115 +++++++++++++++++++++++++++++++++---------
>   1 file changed, 91 insertions(+), 24 deletions(-)
>