jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error

[PATCH] jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error

Posted by libaokun@huaweicloud.com 1 year, 6 months ago

From: Baokun Li <libaokun1@huawei.com>

In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
to recover some journal space. But if an error occurs while executing
jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
space right away, we try other branches, and if j_committing_transaction
is NULL (i.e., the tid is 0), we will get the following complain:

============================================
JBD2: I/O error when updating journal superblock for sdd-8.
__jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
__jbd2_log_wait_for_space: no way to get more journal space in sdd-8
------------[ cut here ]------------
WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
Modules linked in:
CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
Call Trace:
 <TASK>
 add_transaction_credits+0x5d1/0x5e0
 start_this_handle+0x1ef/0x6a0
 jbd2__journal_start+0x18b/0x340
 ext4_dirty_inode+0x5d/0xb0
 __mark_inode_dirty+0xe4/0x5d0
 generic_update_time+0x60/0x70
[...]
============================================

So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
clean up at the moment, continue to try to reclaim free space in other ways.

Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
when updating journal superblock fails") to make jbd2_cleanup_journal_tail
return the correct error code.

Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
Cc: stable@kernel.org
Signed-off-by: Baokun Li <libaokun1@huawei.com>
---
 fs/jbd2/checkpoint.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
index 951f78634adf..7b593591273a 100644
--- a/fs/jbd2/checkpoint.c
+++ b/fs/jbd2/checkpoint.c
@@ -86,8 +86,11 @@ __releases(&journal->j_state_lock)
 			write_unlock(&journal->j_state_lock);
 			if (chkpt) {
 				jbd2_log_do_checkpoint(journal);
-			} else if (jbd2_cleanup_journal_tail(journal) == 0) {
-				/* We were able to recover space; yay! */
+			} else if (jbd2_cleanup_journal_tail(journal) <= 0) {
+				/*
+				 * We were able to recover space or the
+				 * journal was aborted due to an error.
+				 */
 				;
 			} else if (tid) {
 				/*
-- 
2.39.2

Re: [PATCH] jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error

Posted by Theodore Ts'o 1 year, 5 months ago

On Thu, 18 Jul 2024 19:53:36 +0800, libaokun@huaweicloud.com wrote:
> In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
> to recover some journal space. But if an error occurs while executing
> jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
> space right away, we try other branches, and if j_committing_transaction
> is NULL (i.e., the tid is 0), we will get the following complain:
> 
> ============================================
> JBD2: I/O error when updating journal superblock for sdd-8.
> __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
> __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
> Modules linked in:
> CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
> RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
> Call Trace:
>  <TASK>
>  add_transaction_credits+0x5d1/0x5e0
>  start_this_handle+0x1ef/0x6a0
>  jbd2__journal_start+0x18b/0x340
>  ext4_dirty_inode+0x5d/0xb0
>  __mark_inode_dirty+0xe4/0x5d0
>  generic_update_time+0x60/0x70
> [...]
> ============================================
> 
> [...]

Applied, thanks!

[1/1] jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error
      commit: f5cacdc6f2bb2a9bf214469dd7112b43dd2dd68a

Best regards,
-- 
Theodore Ts'o <tytso@mit.edu>

Re: [PATCH] jbd2: stop waiting for space when jbd2_cleanup_journal_tail() returns error

Posted by Jan Kara 1 year, 6 months ago

On Thu 18-07-24 19:53:36, libaokun@huaweicloud.com wrote:
> From: Baokun Li <libaokun1@huawei.com>
> 
> In __jbd2_log_wait_for_space(), we might call jbd2_cleanup_journal_tail()
> to recover some journal space. But if an error occurs while executing
> jbd2_cleanup_journal_tail() (e.g., an EIO), we don't stop waiting for free
> space right away, we try other branches, and if j_committing_transaction
> is NULL (i.e., the tid is 0), we will get the following complain:
> 
> ============================================
> JBD2: I/O error when updating journal superblock for sdd-8.
> __jbd2_log_wait_for_space: needed 256 blocks and only had 217 space available
> __jbd2_log_wait_for_space: no way to get more journal space in sdd-8
> ------------[ cut here ]------------
> WARNING: CPU: 2 PID: 139804 at fs/jbd2/checkpoint.c:109 __jbd2_log_wait_for_space+0x251/0x2e0
> Modules linked in:
> CPU: 2 PID: 139804 Comm: kworker/u8:3 Not tainted 6.6.0+ #1
> RIP: 0010:__jbd2_log_wait_for_space+0x251/0x2e0
> Call Trace:
>  <TASK>
>  add_transaction_credits+0x5d1/0x5e0
>  start_this_handle+0x1ef/0x6a0
>  jbd2__journal_start+0x18b/0x340
>  ext4_dirty_inode+0x5d/0xb0
>  __mark_inode_dirty+0xe4/0x5d0
>  generic_update_time+0x60/0x70
> [...]
> ============================================
> 
> So only if jbd2_cleanup_journal_tail() returns 1, i.e., there is nothing to
> clean up at the moment, continue to try to reclaim free space in other ways.
> 
> Note that this fix relies on commit 6f6a6fda2945 ("jbd2: fix ocfs2 corrupt
> when updating journal superblock fails") to make jbd2_cleanup_journal_tail
> return the correct error code.
> 
> Fixes: 8c3f25d8950c ("jbd2: don't give up looking for space so easily in __jbd2_log_wait_for_space")
> Cc: stable@kernel.org
> Signed-off-by: Baokun Li <libaokun1@huawei.com>

Looks good. Feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

								Honza

> ---
>  fs/jbd2/checkpoint.c | 7 +++++--
>  1 file changed, 5 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/jbd2/checkpoint.c b/fs/jbd2/checkpoint.c
> index 951f78634adf..7b593591273a 100644
> --- a/fs/jbd2/checkpoint.c
> +++ b/fs/jbd2/checkpoint.c
> @@ -86,8 +86,11 @@ __releases(&journal->j_state_lock)
>  			write_unlock(&journal->j_state_lock);
>  			if (chkpt) {
>  				jbd2_log_do_checkpoint(journal);
> -			} else if (jbd2_cleanup_journal_tail(journal) == 0) {
> -				/* We were able to recover space; yay! */
> +			} else if (jbd2_cleanup_journal_tail(journal) <= 0) {
> +				/*
> +				 * We were able to recover space or the
> +				 * journal was aborted due to an error.
> +				 */
>  				;
>  			} else if (tid) {
>  				/*
> -- 
> 2.39.2
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR