From: Ye Bin <yebin10@huawei.com>
Now, 'es->s_state' maybe covered by recover journal. And journal errno
maybe not recorded in journal sb as IO error. ext4_update_super() only
update error information when 'sbi->s_add_error_count' large than zero.
Then 'EXT4_ERROR_FS' flag maybe lost.
To solve above issue commit error information after recover journal.
Signed-off-by: Ye Bin <yebin10@huawei.com>
---
fs/ext4/super.c | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index dc3907dff13a..b94754ba8556 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb,
goto err_out;
}
+ if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) &&
+ !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) {
+ EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS;
+ es->s_state |= cpu_to_le16(EXT4_ERROR_FS);
+ err = ext4_commit_super(sb);
+ if (err) {
+ ext4_msg(sb, KERN_ERR,
+ "Failed to commit error information, please repair fs force!");
+ goto err_out;
+ }
+ }
+
EXT4_SB(sb)->s_journal = journal;
err = ext4_clear_journal_err(sb, es);
if (err) {
--
2.31.1
On Tue 14-02-23 10:29:04, Ye Bin wrote: > From: Ye Bin <yebin10@huawei.com> > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > maybe not recorded in journal sb as IO error. ext4_update_super() only > update error information when 'sbi->s_add_error_count' large than zero. > Then 'EXT4_ERROR_FS' flag maybe lost. > To solve above issue commit error information after recover journal. > > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/ext4/super.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index dc3907dff13a..b94754ba8556 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > goto err_out; > } > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > + err = ext4_commit_super(sb); > + if (err) { > + ext4_msg(sb, KERN_ERR, > + "Failed to commit error information, please repair fs force!"); > + goto err_out; > + } > + } > + Hum, I'm not sure I follow here. If journal replay has overwritten the superblock (and thus the stored error info), then I'd expect es->s_error_count got overwritten (possibly to 0) as well. And this is actually relatively realistic scenario with errors=remount-ro behavior when the first fs error happens. What I intended in my original suggestion was to save es->s_error_count, es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before doing journal replay in ext4_load_journal() and then after journal replay merge this info back to the superblock - if EXT4_ERROR_FS was set, set it now as well, take max of old and new s_error_count, set s_first_error_* if it is now unset, set s_last_error_* if stored timestamp is newer than current timestamp. Or am I overengineering it now? :) Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR
On 2023/2/17 1:31, Jan Kara wrote: > On Tue 14-02-23 10:29:04, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index dc3907dff13a..b94754ba8556 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >> goto err_out; >> } >> >> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >> + err = ext4_commit_super(sb); >> + if (err) { >> + ext4_msg(sb, KERN_ERR, >> + "Failed to commit error information, please repair fs force!"); >> + goto err_out; >> + } >> + } >> + > Hum, I'm not sure I follow here. If journal replay has overwritten the > superblock (and thus the stored error info), then I'd expect > es->s_error_count got overwritten (possibly to 0) as well. And this is > actually relatively realistic scenario with errors=remount-ro behavior when > the first fs error happens. > > What I intended in my original suggestion was to save es->s_error_count, > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > doing journal replay in ext4_load_journal() and then after journal replay > merge this info back to the superblock Actually,commit 1c13d5c08728 ("ext4: Save error information to the superblock for analysis") already merged error info back to the superblock after journal replay except 'es->s_state'. The problem I have now is that the error flag in the journal superblock was not recorded, but the error message was recorded in the superblock. So it leads to ext4_clear_journal_err() does not detect errors and marks the file system as an error. Because ext4_update_super() is only set error flag when 'sbi->s_add_error_count > 0'. Although 'sbi->s_mount_state' is written to the super block when umount, but it is also conditional. So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) && !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i prefer to mark fs as error if it contain detail error info without EXT4_ERROR_FS. > - if EXT4_ERROR_FS was set, set it > now as well, take max of old and new s_error_count, set s_first_error_* if > it is now unset, set s_last_error_* if stored timestamp is newer than > current timestamp. > > Or am I overengineering it now? :) > > Honza
On Fri 17-02-23 09:44:57, yebin (H) wrote: > On 2023/2/17 1:31, Jan Kara wrote: > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > From: Ye Bin <yebin10@huawei.com> > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > update error information when 'sbi->s_add_error_count' large than zero. > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > To solve above issue commit error information after recover journal. > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > --- > > > fs/ext4/super.c | 12 ++++++++++++ > > > 1 file changed, 12 insertions(+) > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > index dc3907dff13a..b94754ba8556 100644 > > > --- a/fs/ext4/super.c > > > +++ b/fs/ext4/super.c > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > goto err_out; > > > } > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > + err = ext4_commit_super(sb); > > > + if (err) { > > > + ext4_msg(sb, KERN_ERR, > > > + "Failed to commit error information, please repair fs force!"); > > > + goto err_out; > > > + } > > > + } > > > + > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > superblock (and thus the stored error info), then I'd expect > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > actually relatively realistic scenario with errors=remount-ro behavior when > > the first fs error happens. > > > > What I intended in my original suggestion was to save es->s_error_count, > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > doing journal replay in ext4_load_journal() and then after journal replay > > merge this info back to the superblock > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > superblock for analysis") > already merged error info back to the superblock after journal replay except > 'es->s_state'. > The problem I have now is that the error flag in the journal superblock was > not recorded, > but the error message was recorded in the superblock. So it leads to > ext4_clear_journal_err() > does not detect errors and marks the file system as an error. Because > ext4_update_super() is > only set error flag when 'sbi->s_add_error_count > 0'. Although > 'sbi->s_mount_state' is > written to the super block when umount, but it is also conditional. > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > && > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > prefer to mark fs as error if it contain detail error info without > EXT4_ERROR_FS. Aha, thanks for explanation! So now I finally understand what the problem exactly is. I'm not sure relying on es->s_error_count is too good. Probably it works but I'd be calmer if when saving error info we also did: bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); copy other info err = jbd2_journal_load(journal); restore other info if (error_fs) es->s_state |= cpu_to_le16(EXT4_ERROR_FS); /* Write out restored error information to the superblock */ err2 = ext4_commit_super(sb); and be done with this. I don't think trying to optimize away the committing of the superblock when we had to replay the journal is really worth it. Does this solve your concerns? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR
On 2023/2/17 18:56, Jan Kara wrote: > On Fri 17-02-23 09:44:57, yebin (H) wrote: >> On 2023/2/17 1:31, Jan Kara wrote: >>> On Tue 14-02-23 10:29:04, Ye Bin wrote: >>>> From: Ye Bin <yebin10@huawei.com> >>>> >>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>> update error information when 'sbi->s_add_error_count' large than zero. >>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>> To solve above issue commit error information after recover journal. >>>> >>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>> --- >>>> fs/ext4/super.c | 12 ++++++++++++ >>>> 1 file changed, 12 insertions(+) >>>> >>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>> index dc3907dff13a..b94754ba8556 100644 >>>> --- a/fs/ext4/super.c >>>> +++ b/fs/ext4/super.c >>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >>>> goto err_out; >>>> } >>>> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>> + err = ext4_commit_super(sb); >>>> + if (err) { >>>> + ext4_msg(sb, KERN_ERR, >>>> + "Failed to commit error information, please repair fs force!"); >>>> + goto err_out; >>>> + } >>>> + } >>>> + >>> Hum, I'm not sure I follow here. If journal replay has overwritten the >>> superblock (and thus the stored error info), then I'd expect >>> es->s_error_count got overwritten (possibly to 0) as well. And this is >>> actually relatively realistic scenario with errors=remount-ro behavior when >>> the first fs error happens. >>> >>> What I intended in my original suggestion was to save es->s_error_count, >>> es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before >>> doing journal replay in ext4_load_journal() and then after journal replay >>> merge this info back to the superblock >> Actually,commit 1c13d5c08728 ("ext4: Save error information to the >> superblock for analysis") >> already merged error info back to the superblock after journal replay except >> 'es->s_state'. >> The problem I have now is that the error flag in the journal superblock was >> not recorded, >> but the error message was recorded in the superblock. So it leads to >> ext4_clear_journal_err() >> does not detect errors and marks the file system as an error. Because >> ext4_update_super() is >> only set error flag when 'sbi->s_add_error_count > 0'. Although >> 'sbi->s_mount_state' is >> written to the super block when umount, but it is also conditional. >> So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) >> && >> !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store >> 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i >> prefer to mark fs as error if it contain detail error info without >> EXT4_ERROR_FS. > Aha, thanks for explanation! So now I finally understand what the problem > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > it works but I'd be calmer if when saving error info we also did: > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > copy other info > err = jbd2_journal_load(journal); > restore other info > if (error_fs) > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > /* Write out restored error information to the superblock */ > err2 = ext4_commit_super(sb); > > and be done with this. I don't think trying to optimize away the committing > of the superblock when we had to replay the journal is really worth it. > > Does this solve your concerns? > > Honza Thanks for your suggestion. I think if journal super block has 'j_errno' ext4_clear_journal_err() will commit error info. The scenario we need to deal with is:(1) journal super block has no 'j_errno'; (2) super block has detail error info, but 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has no error flag and the newest super block has error flag. so we need to store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' has 'EXT4_ERROR_FS', it means super block in journal has error flag, so we do not need to store error flag in super block. I don't know if I can explain my idea of repair.
On Sat 18-02-23 10:18:42, yebin (H) wrote: > On 2023/2/17 18:56, Jan Kara wrote: > > On Fri 17-02-23 09:44:57, yebin (H) wrote: > > > On 2023/2/17 1:31, Jan Kara wrote: > > > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > > > From: Ye Bin <yebin10@huawei.com> > > > > > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > > > update error information when 'sbi->s_add_error_count' large than zero. > > > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > > > To solve above issue commit error information after recover journal. > > > > > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > > > --- > > > > > fs/ext4/super.c | 12 ++++++++++++ > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > > > index dc3907dff13a..b94754ba8556 100644 > > > > > --- a/fs/ext4/super.c > > > > > +++ b/fs/ext4/super.c > > > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > > > goto err_out; > > > > > } > > > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > > + err = ext4_commit_super(sb); > > > > > + if (err) { > > > > > + ext4_msg(sb, KERN_ERR, > > > > > + "Failed to commit error information, please repair fs force!"); > > > > > + goto err_out; > > > > > + } > > > > > + } > > > > > + > > > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > > > superblock (and thus the stored error info), then I'd expect > > > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > > > actually relatively realistic scenario with errors=remount-ro behavior when > > > > the first fs error happens. > > > > > > > > What I intended in my original suggestion was to save es->s_error_count, > > > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > > > doing journal replay in ext4_load_journal() and then after journal replay > > > > merge this info back to the superblock > > > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > > > superblock for analysis") > > > already merged error info back to the superblock after journal replay except > > > 'es->s_state'. > > > The problem I have now is that the error flag in the journal superblock was > > > not recorded, > > > but the error message was recorded in the superblock. So it leads to > > > ext4_clear_journal_err() > > > does not detect errors and marks the file system as an error. Because > > > ext4_update_super() is > > > only set error flag when 'sbi->s_add_error_count > 0'. Although > > > 'sbi->s_mount_state' is > > > written to the super block when umount, but it is also conditional. > > > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > > > && > > > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > > > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > > > prefer to mark fs as error if it contain detail error info without > > > EXT4_ERROR_FS. > > Aha, thanks for explanation! So now I finally understand what the problem > > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > > it works but I'd be calmer if when saving error info we also did: > > > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > > > copy other info > > err = jbd2_journal_load(journal); > > restore other info > > if (error_fs) > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > /* Write out restored error information to the superblock */ > > err2 = ext4_commit_super(sb); > > > > and be done with this. I don't think trying to optimize away the committing > > of the superblock when we had to replay the journal is really worth it. > > > > Does this solve your concerns? > Thanks for your suggestion. > > I think if journal super block has 'j_errno' ext4_clear_journal_err() > will commit error info. The scenario we need to deal with is:(1) journal > super block has no 'j_errno'; (2) super block has detail error info, but > 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has > no error flag and the newest super block has error flag. But my code above should be handling this situation you describe - the check: error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); will check the newest state in the superblock before journal replay. Then we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the superblock version in the journal didn't have it. So we do: if (error_fs) es->s_state |= cpu_to_le16(EXT4_ERROR_FS); which makes sure EXT4_ERROR_FS is set either if it was set in the journal or in the newest superblock version before journal replay. > so we need to > store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' > has 'EXT4_ERROR_FS', it means super block in journal has error flag, so > we do not need to store error flag in super block. Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in the journal has that flag? During mount, we load the superblock directly from the first block in the filesystem and until we call jbd2_journal_load(), es->s_state contains this newest value of the superblock state. What am I missing? Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR
On 2023/2/27 19:20, Jan Kara wrote: > On Sat 18-02-23 10:18:42, yebin (H) wrote: >> On 2023/2/17 18:56, Jan Kara wrote: >>> On Fri 17-02-23 09:44:57, yebin (H) wrote: >>>> On 2023/2/17 1:31, Jan Kara wrote: >>>>> On Tue 14-02-23 10:29:04, Ye Bin wrote: >>>>>> From: Ye Bin <yebin10@huawei.com> >>>>>> >>>>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>>>> update error information when 'sbi->s_add_error_count' large than zero. >>>>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>>>> To solve above issue commit error information after recover journal. >>>>>> >>>>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>>>> --- >>>>>> fs/ext4/super.c | 12 ++++++++++++ >>>>>> 1 file changed, 12 insertions(+) >>>>>> >>>>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>>>> index dc3907dff13a..b94754ba8556 100644 >>>>>> --- a/fs/ext4/super.c >>>>>> +++ b/fs/ext4/super.c >>>>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, >>>>>> goto err_out; >>>>>> } >>>>>> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >>>>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>>>> + err = ext4_commit_super(sb); >>>>>> + if (err) { >>>>>> + ext4_msg(sb, KERN_ERR, >>>>>> + "Failed to commit error information, please repair fs force!"); >>>>>> + goto err_out; >>>>>> + } >>>>>> + } >>>>>> + >>>>> Hum, I'm not sure I follow here. If journal replay has overwritten the >>>>> superblock (and thus the stored error info), then I'd expect >>>>> es->s_error_count got overwritten (possibly to 0) as well. And this is >>>>> actually relatively realistic scenario with errors=remount-ro behavior when >>>>> the first fs error happens. >>>>> >>>>> What I intended in my original suggestion was to save es->s_error_count, >>>>> es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before >>>>> doing journal replay in ext4_load_journal() and then after journal replay >>>>> merge this info back to the superblock >>>> Actually,commit 1c13d5c08728 ("ext4: Save error information to the >>>> superblock for analysis") >>>> already merged error info back to the superblock after journal replay except >>>> 'es->s_state'. >>>> The problem I have now is that the error flag in the journal superblock was >>>> not recorded, >>>> but the error message was recorded in the superblock. So it leads to >>>> ext4_clear_journal_err() >>>> does not detect errors and marks the file system as an error. Because >>>> ext4_update_super() is >>>> only set error flag when 'sbi->s_add_error_count > 0'. Although >>>> 'sbi->s_mount_state' is >>>> written to the super block when umount, but it is also conditional. >>>> So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) >>>> && >>>> !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store >>>> 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i >>>> prefer to mark fs as error if it contain detail error info without >>>> EXT4_ERROR_FS. >>> Aha, thanks for explanation! So now I finally understand what the problem >>> exactly is. I'm not sure relying on es->s_error_count is too good. Probably >>> it works but I'd be calmer if when saving error info we also did: >>> >>> bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); >>> >>> copy other info >>> err = jbd2_journal_load(journal); >>> restore other info >>> if (error_fs) >>> es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>> /* Write out restored error information to the superblock */ >>> err2 = ext4_commit_super(sb); >>> >>> and be done with this. I don't think trying to optimize away the committing >>> of the superblock when we had to replay the journal is really worth it. >>> >>> Does this solve your concerns? >> Thanks for your suggestion. >> >> I think if journal super block has 'j_errno' ext4_clear_journal_err() >> will commit error info. The scenario we need to deal with is:(1) journal >> super block has no 'j_errno'; (2) super block has detail error info, but >> 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has >> no error flag and the newest super block has error flag. > But my code above should be handling this situation you describe - the > check: > > error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); Here, we do not need to backup 'error_fs', as 'EXT4_SB(sb)->s_mount_state' already record this flag when fs has error flag before do journal replay. > will check the newest state in the superblock before journal replay. Then > we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the > superblock version in the journal didn't have it. So we do: > > if (error_fs) > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > which makes sure EXT4_ERROR_FS is set either if it was set in the journal > or in the newest superblock version before journal replay. My modification is to deal with the situation we missed, and I don't want to introduce an invalid super block submission. If you think my judgment is too obscure, I can send another version according to your suggestion. >> so we need to >> store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' >> has 'EXT4_ERROR_FS', it means super block in journal has error flag, so >> we do not need to store error flag in super block. > Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in > the journal has that flag? During mount, we load the superblock directly > from the first block in the filesystem and until we call > jbd2_journal_load(), es->s_state contains this newest value of the > superblock state. What am I missing? After jbd2_journal_load() 'es->s_state' is already covered by the super block in journal. If there is super block in journal and 'es->s_state' has error flag this means super block in journal has error flag. If there is no super block in journal and 'es->s_state' has error flag, this means super block already has error flag.In both cases we can do nothing. > > Honza
On Tue 28-02-23 10:24:26, yebin (H) wrote: > On 2023/2/27 19:20, Jan Kara wrote: > > On Sat 18-02-23 10:18:42, yebin (H) wrote: > > > On 2023/2/17 18:56, Jan Kara wrote: > > > > On Fri 17-02-23 09:44:57, yebin (H) wrote: > > > > > On 2023/2/17 1:31, Jan Kara wrote: > > > > > > On Tue 14-02-23 10:29:04, Ye Bin wrote: > > > > > > > From: Ye Bin <yebin10@huawei.com> > > > > > > > > > > > > > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > > > > > > > maybe not recorded in journal sb as IO error. ext4_update_super() only > > > > > > > update error information when 'sbi->s_add_error_count' large than zero. > > > > > > > Then 'EXT4_ERROR_FS' flag maybe lost. > > > > > > > To solve above issue commit error information after recover journal. > > > > > > > > > > > > > > Signed-off-by: Ye Bin <yebin10@huawei.com> > > > > > > > --- > > > > > > > fs/ext4/super.c | 12 ++++++++++++ > > > > > > > 1 file changed, 12 insertions(+) > > > > > > > > > > > > > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > > > > > > > index dc3907dff13a..b94754ba8556 100644 > > > > > > > --- a/fs/ext4/super.c > > > > > > > +++ b/fs/ext4/super.c > > > > > > > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > > > > > > > goto err_out; > > > > > > > } > > > > > > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > > > > > > > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > > > > > > > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > > > > > > > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > > > > + err = ext4_commit_super(sb); > > > > > > > + if (err) { > > > > > > > + ext4_msg(sb, KERN_ERR, > > > > > > > + "Failed to commit error information, please repair fs force!"); > > > > > > > + goto err_out; > > > > > > > + } > > > > > > > + } > > > > > > > + > > > > > > Hum, I'm not sure I follow here. If journal replay has overwritten the > > > > > > superblock (and thus the stored error info), then I'd expect > > > > > > es->s_error_count got overwritten (possibly to 0) as well. And this is > > > > > > actually relatively realistic scenario with errors=remount-ro behavior when > > > > > > the first fs error happens. > > > > > > > > > > > > What I intended in my original suggestion was to save es->s_error_count, > > > > > > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > > > > > > doing journal replay in ext4_load_journal() and then after journal replay > > > > > > merge this info back to the superblock > > > > > Actually,commit 1c13d5c08728 ("ext4: Save error information to the > > > > > superblock for analysis") > > > > > already merged error info back to the superblock after journal replay except > > > > > 'es->s_state'. > > > > > The problem I have now is that the error flag in the journal superblock was > > > > > not recorded, > > > > > but the error message was recorded in the superblock. So it leads to > > > > > ext4_clear_journal_err() > > > > > does not detect errors and marks the file system as an error. Because > > > > > ext4_update_super() is > > > > > only set error flag when 'sbi->s_add_error_count > 0'. Although > > > > > 'sbi->s_mount_state' is > > > > > written to the super block when umount, but it is also conditional. > > > > > So I handle the scenario "es->s_error_count && !jbd2_journal_errno(journal) > > > > > && > > > > > !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS)". Maybe we can just store > > > > > 'EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS' back to the superblock. But i > > > > > prefer to mark fs as error if it contain detail error info without > > > > > EXT4_ERROR_FS. > > > > Aha, thanks for explanation! So now I finally understand what the problem > > > > exactly is. I'm not sure relying on es->s_error_count is too good. Probably > > > > it works but I'd be calmer if when saving error info we also did: > > > > > > > > bool error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > > > > > > > copy other info > > > > err = jbd2_journal_load(journal); > > > > restore other info > > > > if (error_fs) > > > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > /* Write out restored error information to the superblock */ > > > > err2 = ext4_commit_super(sb); > > > > > > > > and be done with this. I don't think trying to optimize away the committing > > > > of the superblock when we had to replay the journal is really worth it. > > > > > > > > Does this solve your concerns? > > > Thanks for your suggestion. > > > > > > I think if journal super block has 'j_errno' ext4_clear_journal_err() > > > will commit error info. The scenario we need to deal with is:(1) journal > > > super block has no 'j_errno'; (2) super block has detail error info, but > > > 'es->s_state' has no 'EXT4_ERROR_FS', It means super block in journal has > > > no error flag and the newest super block has error flag. > > But my code above should be handling this situation you describe - the > > check: > > > > error_fs = es->s_state & cpu_to_le16(EXT4_ERROR_FS); > > Here, we do not need to backup 'error_fs', as > 'EXT4_SB(sb)->s_mount_state' already record this flag when fs has error > flag before do journal replay. Yeah, right. We don't need error_fs variable and can just look at EXT4_SB(sb)->s_mount_state. > > will check the newest state in the superblock before journal replay. Then > > we replay the journal - es->s_state may loose the EXT4_ERROR_FS flag if the > > superblock version in the journal didn't have it. So we do: > > > > if (error_fs) > > es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > > > > which makes sure EXT4_ERROR_FS is set either if it was set in the journal > > or in the newest superblock version before journal replay. > > My modification is to deal with the situation we missed, and I don't want > to introduce an invalid super block submission. If you think my judgment > is too obscure, I can send another version according to your suggestion. So is the extra superblock write your only concern? Honestly, I prefer code simplicity over saved one superblock write in case we had to replay the journal (which should be rare anyway). If you look at the code, we can writeout superblock several times in that mount path anyway. > > > so we need to > > > store error flag to 'es->s_state', and commit it to disk.If 'es->s_state' > > > has 'EXT4_ERROR_FS', it means super block in journal has error flag, so > > > we do not need to store error flag in super block. > > Why do you think that if es->s_state has EXT4_ERROR_FS, the super block in > > the journal has that flag? During mount, we load the superblock directly > > from the first block in the filesystem and until we call > > jbd2_journal_load(), es->s_state contains this newest value of the > > superblock state. What am I missing? > After jbd2_journal_load() 'es->s_state' is already covered by the super > block in journal. If there is super block in journal and 'es->s_state' > has error flag this means super block in journal has error flag. If there > is no super block in journal and 'es->s_state' has error flag, this means > super block already has error flag.In both cases we can do nothing. If what you wanted to say: "It is not necessary to write the superblock if EXT4_ERROR_FS is already set." I tend to agree although not 100% because journal replay could result in rewinding 's_last_error_*' fields and so writing superblock would still make sense even if EXT4_ERROR_FS is set in es->s_state after journal reply. That's why I wouldn't complicate things and just always write the superblock after journal replay. Honza -- Jan Kara <jack@suse.com> SUSE Labs, CR
On 2023/2/17 1:31, Jan Kara wrote: > On Tue 14-02-23 10:29:04, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) > Hum, I'm not sure I follow here. If journal replay has overwritten the > superblock (and thus the stored error info), then I'd expect > es->s_error_count got overwritten (possibly to 0) as well. And this is > actually relatively realistic scenario with errors=remount-ro behavior when > the first fs error happens. > > What I intended in my original suggestion was to save es->s_error_count, > es->s_state & EXT4_ERROR_FS, es->s_first_error_*, es->s_last_error_* before > doing journal replay in ext4_load_journal() and then after journal replay > merge this info back to the superblock - if EXT4_ERROR_FS was set, set it > now as well, take max of old and new s_error_count, set s_first_error_* if > it is now unset, set s_last_error_* if stored timestamp is newer than > current timestamp. > > Or am I overengineering it now? :) > > Honza This is exactly how the code is designed now! The code has now saved all the above information except EXT4_ERROR_FS by the following two pieces of logic, as follows: ---------------- In struct ext4_super_block ---------------- 1412 #define EXT4_S_ERR_START offsetof(struct ext4_super_block, s_error_count) 1413 __le32 s_error_count; /* number of fs errors */ 1414 __le32 s_first_error_time; /* first time an error happened */ 1415 __le32 s_first_error_ino; /* inode involved in first error */ 1416 __le64 s_first_error_block; /* block involved of first error */ 1417 __u8 s_first_error_func[32] __nonstring; /* function where the error happened */ 1418 __le32 s_first_error_line; /* line number where error happened */ 1419 __le32 s_last_error_time; /* most recent time of an error */ 1420 __le32 s_last_error_ino; /* inode involved in last error */ 1421 __le32 s_last_error_line; /* line number where error happened */ 1422 __le64 s_last_error_block; /* block involved of last error */ 1423 __u8 s_last_error_func[32] __nonstring; /* function where the error happened */ 1424 #define EXT4_S_ERR_END offsetof(struct ext4_super_block, s_mount_opts) ----------------------------------------------------------- ---------------- In ext4_load_journal() ---------------- 5929 char *save = kmalloc(EXT4_S_ERR_LEN, GFP_KERNEL); 5930 if (save) 5931 memcpy(save, ((char *) es) + 5932 EXT4_S_ERR_START, EXT4_S_ERR_LEN); 5933 err = jbd2_journal_load(journal); 5934 if (save) 5935 memcpy(((char *) es) + EXT4_S_ERR_START, 5936 save, EXT4_S_ERR_LEN); 5937 kfree(save); -------------------------------------------------------- As you said, we should also save EXT4_ERROR_FS to es->s_state. But we are not saving this now, so I think we just need to add `es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS);` to save the possible EXT4_ERROR_FS flag after copying the error area data to es. -- With Best Regards, Baokun Li .
On 2023/2/14 10:29, Ye Bin wrote: > From: Ye Bin <yebin10@huawei.com> > > Now, 'es->s_state' maybe covered by recover journal. And journal errno > maybe not recorded in journal sb as IO error. ext4_update_super() only > update error information when 'sbi->s_add_error_count' large than zero. > Then 'EXT4_ERROR_FS' flag maybe lost. > To solve above issue commit error information after recover journal. > > Signed-off-by: Ye Bin <yebin10@huawei.com> > --- > fs/ext4/super.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index dc3907dff13a..b94754ba8556 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct super_block *sb, > goto err_out; > } > > + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && > + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { > + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; > + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); > + err = ext4_commit_super(sb); > + if (err) { > + ext4_msg(sb, KERN_ERR, > + "Failed to commit error information, please repair fs force!"); > + goto err_out; > + } > + } > + > EXT4_SB(sb)->s_journal = journal; > err = ext4_clear_journal_err(sb, es); > if (err) { I think we don't need such a complicated judgment, after the journal replay and saving the error info, if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just add this flag directly to es->s_state. This way the EXT4_ERROR_FS flag and the error message will be written to disk the next time ext4_commit_super() is executed. The code change is as follows: diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 260c1b3e3ef2..341b11c589b3 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block *sb, memcpy(((char *) es) + EXT4_S_ERR_START, save, EXT4_S_ERR_LEN); kfree(save); + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS); } if (err) { -- With Best Regards, Baokun Li .
On 2023/2/16 15:17, Baokun Li wrote: > On 2023/2/14 10:29, Ye Bin wrote: >> From: Ye Bin <yebin10@huawei.com> >> >> Now, 'es->s_state' maybe covered by recover journal. And journal errno >> maybe not recorded in journal sb as IO error. ext4_update_super() only >> update error information when 'sbi->s_add_error_count' large than zero. >> Then 'EXT4_ERROR_FS' flag maybe lost. >> To solve above issue commit error information after recover journal. >> >> Signed-off-by: Ye Bin <yebin10@huawei.com> >> --- >> fs/ext4/super.c | 12 ++++++++++++ >> 1 file changed, 12 insertions(+) >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index dc3907dff13a..b94754ba8556 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >> super_block *sb, >> goto err_out; >> } >> + if (unlikely(es->s_error_count && !jbd2_journal_errno(journal) && >> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >> + err = ext4_commit_super(sb); >> + if (err) { >> + ext4_msg(sb, KERN_ERR, >> + "Failed to commit error information, please repair >> fs force!"); >> + goto err_out; >> + } >> + } >> + >> EXT4_SB(sb)->s_journal = journal; >> err = ext4_clear_journal_err(sb, es); >> if (err) { > I think we don't need such a complicated judgment, after the journal > replay and saving the error info, > if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just > add this flag directly to es->s_state. > This way the EXT4_ERROR_FS flag and the error message will be written > to disk the next time Thanks for your suggestion. There are two reasons for this: 1. We want to write the error mark to the disk as soon as possible. 2. Here we deal with the case where there is no error mark bit but there is an error record. In this case, the file system should be marked with an error and the user should be prompted. > ext4_commit_super() is executed. The code change is as follows: > > diff --git a/fs/ext4/super.c b/fs/ext4/super.c > index 260c1b3e3ef2..341b11c589b3 100644 > --- a/fs/ext4/super.c > +++ b/fs/ext4/super.c > @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block > *sb, > memcpy(((char *) es) + EXT4_S_ERR_START, > save, EXT4_S_ERR_LEN); > kfree(save); > + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state > & EXT4_ERROR_FS); > } > > if (err) { >
On 2023/2/16 15:44, yebin (H) wrote: > > > On 2023/2/16 15:17, Baokun Li wrote: >> On 2023/2/14 10:29, Ye Bin wrote: >>> From: Ye Bin <yebin10@huawei.com> >>> >>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>> update error information when 'sbi->s_add_error_count' large than zero. >>> Then 'EXT4_ERROR_FS' flag maybe lost. >>> To solve above issue commit error information after recover journal. >>> >>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>> --- >>> fs/ext4/super.c | 12 ++++++++++++ >>> 1 file changed, 12 insertions(+) >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index dc3907dff13a..b94754ba8556 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >>> super_block *sb, >>> goto err_out; >>> } >>> + if (unlikely(es->s_error_count && >>> !jbd2_journal_errno(journal) && >>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>> + err = ext4_commit_super(sb); >>> + if (err) { >>> + ext4_msg(sb, KERN_ERR, >>> + "Failed to commit error information, please repair >>> fs force!"); >>> + goto err_out; >>> + } >>> + } >>> + >>> EXT4_SB(sb)->s_journal = journal; >>> err = ext4_clear_journal_err(sb, es); >>> if (err) { >> I think we don't need such a complicated judgment, after the journal >> replay and saving the error info, >> if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just >> add this flag directly to es->s_state. >> This way the EXT4_ERROR_FS flag and the error message will be written >> to disk the next time > > Thanks for your suggestion. There are two reasons for this: > 1. We want to write the error mark to the disk as soon as possible. > 2. Here we deal with the case where there is no error mark bit but > there is an error record. > In this case, the file system should be marked with an error and the > user should be prompted. The EXT4_ERROR_FS flag is always written to disk at the same time as the error info, except when the journal is replayed. During journal replay the error info is additionally copied to disk first, and the EXT4_ERROR_FS flag in the sbi is not written to disk until the ext4_put_super() is called. It is only when a failure occurs during this time that there is an error info but no EXT4_ERROR_FS flag. So we just need to make sure that the EXT4_ERROR_FS flag is also written to disk at the same time as the error info after the journal replay. >> ext4_commit_super() is executed. The code change is as follows: >> >> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >> index 260c1b3e3ef2..341b11c589b3 100644 >> --- a/fs/ext4/super.c >> +++ b/fs/ext4/super.c >> @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct super_block >> *sb, >> memcpy(((char *) es) + EXT4_S_ERR_START, >> save, EXT4_S_ERR_LEN); >> kfree(save); >> + es->s_state |= cpu_to_le16(EXT4_SB(sb)->s_mount_state >> & EXT4_ERROR_FS); >> } >> >> if (err) { >> > > > -- With Best Regards, Baokun Li .
On 2023/2/16 17:17, Baokun Li wrote: > On 2023/2/16 15:44, yebin (H) wrote: >> >> >> On 2023/2/16 15:17, Baokun Li wrote: >>> On 2023/2/14 10:29, Ye Bin wrote: >>>> From: Ye Bin <yebin10@huawei.com> >>>> >>>> Now, 'es->s_state' maybe covered by recover journal. And journal errno >>>> maybe not recorded in journal sb as IO error. ext4_update_super() only >>>> update error information when 'sbi->s_add_error_count' large than >>>> zero. >>>> Then 'EXT4_ERROR_FS' flag maybe lost. >>>> To solve above issue commit error information after recover journal. >>>> >>>> Signed-off-by: Ye Bin <yebin10@huawei.com> >>>> --- >>>> fs/ext4/super.c | 12 ++++++++++++ >>>> 1 file changed, 12 insertions(+) >>>> >>>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>>> index dc3907dff13a..b94754ba8556 100644 >>>> --- a/fs/ext4/super.c >>>> +++ b/fs/ext4/super.c >>>> @@ -5932,6 +5932,18 @@ static int ext4_load_journal(struct >>>> super_block *sb, >>>> goto err_out; >>>> } >>>> + if (unlikely(es->s_error_count && >>>> !jbd2_journal_errno(journal) && >>>> + !(le16_to_cpu(es->s_state) & EXT4_ERROR_FS))) { >>>> + EXT4_SB(sb)->s_mount_state |= EXT4_ERROR_FS; >>>> + es->s_state |= cpu_to_le16(EXT4_ERROR_FS); >>>> + err = ext4_commit_super(sb); >>>> + if (err) { >>>> + ext4_msg(sb, KERN_ERR, >>>> + "Failed to commit error information, please >>>> repair fs force!"); >>>> + goto err_out; >>>> + } >>>> + } >>>> + >>>> EXT4_SB(sb)->s_journal = journal; >>>> err = ext4_clear_journal_err(sb, es); >>>> if (err) { >>> I think we don't need such a complicated judgment, after the journal >>> replay and saving the error info, >>> if there is EXT4_ERROR_FS flag in ext4_sb_info->s_mount_state, just >>> add this flag directly to es->s_state. >>> This way the EXT4_ERROR_FS flag and the error message will be >>> written to disk the next time >> >> Thanks for your suggestion. There are two reasons for this: >> 1. We want to write the error mark to the disk as soon as possible. >> 2. Here we deal with the case where there is no error mark bit but >> there is an error record. >> In this case, the file system should be marked with an error and the >> user should be prompted. > The EXT4_ERROR_FS flag is always written to disk at the same time as > the error info, > except when the journal is replayed. During journal replay the error > info is additionally > copied to disk first, and the EXT4_ERROR_FS flag in the sbi is not > written to disk until > the ext4_put_super() is called. It is only when a failure occurs > during this time that > there is an error info but no EXT4_ERROR_FS flag. So we just need to > make sure that > the EXT4_ERROR_FS flag is also written to disk at the same time as the > error info > after the journal replay. The situation you said is based on the situation after the repair. What about the existing image with such inconsistency? >>> ext4_commit_super() is executed. The code change is as follows: >>> >>> diff --git a/fs/ext4/super.c b/fs/ext4/super.c >>> index 260c1b3e3ef2..341b11c589b3 100644 >>> --- a/fs/ext4/super.c >>> +++ b/fs/ext4/super.c >>> @@ -5935,6 +5935,7 @@ static int ext4_load_journal(struct >>> super_block *sb, >>> memcpy(((char *) es) + EXT4_S_ERR_START, >>> save, EXT4_S_ERR_LEN); >>> kfree(save); >>> + es->s_state |= >>> cpu_to_le16(EXT4_SB(sb)->s_mount_state & EXT4_ERROR_FS); >>> } >>> >>> if (err) { >>> >> >> >>
© 2016 - 2025 Red Hat, Inc.