[PATCH] writeback: Avoid skipping inode writeback

Jing Xia posted 1 patch 4 years ago
There is a newer version of this series
fs/fs-writeback.c | 3 +++
1 file changed, 3 insertions(+)
Posted by Jing Xia 4 years ago
We have run into an issue where a task gets stuck in
balance_dirty_pages_ratelimited() when performing I/O stress testing.
The reason we observed is that an I_DIRTY_PAGES inode with lots
of dirty pages is in the b_dirty_time list and standard background
writeback cannot write back the inode.
After studying the relevant code, the following scenario may lead
to the issue:

task1                                   task2
-----                                   -----
fuse_flush
 write_inode_now //in b_dirty_time
  writeback_single_inode
   __writeback_single_inode
                                 fuse_write_end
                                  filemap_dirty_folio
                                   __xa_set_mark:PAGECACHE_TAG_DIRTY
    lock inode->i_lock
    if mapping tagged PAGECACHE_TAG_DIRTY
    inode->i_state |= I_DIRTY_PAGES
    unlock inode->i_lock
                                   __mark_inode_dirty:I_DIRTY_PAGES
                                      lock inode->i_lock
                                      -was dirty,inode stays in
                                      -b_dirty_time
                                      unlock inode->i_lock

   if (!(inode->i_state & I_DIRTY_ALL))
      -not true, so nothing done

This patch moves the dirty inode to the b_dirty list at the end of
writeback_single_inode() when the inode is not currently queued in
the b_io or b_more_io list.

Signed-off-by: Jing Xia <jing.xia@unisoc.com>
---
 fs/fs-writeback.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
index 591fe9cf1659..d7763feaf14a 100644
--- a/fs/fs-writeback.c
+++ b/fs/fs-writeback.c
@@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
 	 */
 	if (!(inode->i_state & I_DIRTY_ALL))
 		inode_cgwb_move_to_attached(inode, wb);
+	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
+		redirty_tail_locked(inode, wb);
+
 	spin_unlock(&wb->list_lock);
 	inode_sync_complete(inode);
 out:
-- 
2.17.1
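
For readers following the thread, the interleaving in the commit message
can be replayed in a small user-space C model. This is a sketch only: the
M_* flags, struct model_inode, and the model_* functions are hypothetical
stand-ins rather than kernel code, and each kernel step is reduced to the
single state change shown in the diagram.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for inode state bits; the values are assumptions. */
enum {
	M_DIRTY_PAGES = 1u << 0,	/* models I_DIRTY_PAGES */
	M_DIRTY_TIME  = 1u << 1,	/* models I_DIRTY_TIME */
	M_SYNC_QUEUED = 1u << 2,	/* models I_SYNC_QUEUED */
	M_DIRTY       = M_DIRTY_PAGES,	/* simplified: other I_DIRTY bits omitted */
	M_DIRTY_ALL   = M_DIRTY | M_DIRTY_TIME,
};

/* Which writeback list the modeled inode currently sits on. */
enum wb_list { LIST_NONE, LIST_B_DIRTY, LIST_B_DIRTY_TIME };

struct model_inode {
	unsigned state;
	enum wb_list list;
	bool mapping_tagged_dirty;	/* models PAGECACHE_TAG_DIRTY on the mapping */
};

/* task1: models the i_state update inside __writeback_single_inode(). */
static void model_writeback_update_state(struct model_inode *inode)
{
	inode->state &= ~M_DIRTY;		/* dirty bits were just written out */
	if (inode->mapping_tagged_dirty)	/* pages were redirtied meanwhile */
		inode->state |= M_DIRTY_PAGES;
}

/* task2: models __mark_inode_dirty(inode, I_DIRTY_PAGES). */
static void model_mark_dirty_pages(struct model_inode *inode)
{
	bool was_dirty = inode->state & M_DIRTY;

	inode->state |= M_DIRTY_PAGES;
	if (!was_dirty)				/* only clean -> dirty requeues */
		inode->list = LIST_B_DIRTY;
}

/* task1: models the end of writeback_single_inode(), with or without the patch. */
static void model_writeback_finish(struct model_inode *inode, bool patched)
{
	if (!(inode->state & M_DIRTY_ALL))
		inode->list = LIST_NONE;
	else if (patched && (inode->state & M_DIRTY) &&
		 !(inode->state & M_SYNC_QUEUED))
		inode->list = LIST_B_DIRTY;	/* models redirty_tail_locked() */
}

/* Replays the interleaving from the diagram; returns the final list. */
static enum wb_list model_replay_race(bool patched)
{
	struct model_inode inode = {
		.state = M_DIRTY_TIME,
		.list = LIST_B_DIRTY_TIME,
		.mapping_tagged_dirty = false,
	};

	inode.mapping_tagged_dirty = true;	/* task2: filemap_dirty_folio() */
	model_writeback_update_state(&inode);	/* task1 */
	model_mark_dirty_pages(&inode);		/* task2 */
	model_writeback_finish(&inode, patched);/* task1 */
	return inode.list;
}
```

Under these assumptions, model_replay_race(false) leaves the inode on the
modeled b_dirty_time list even though it carries the dirty-pages bit, while
model_replay_race(true) ends with it on the modeled b_dirty list.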
Re: [PATCH] writeback: Avoid skipping inode writeback
Posted by Christoph Hellwig 4 years ago
On Thu, May 05, 2022 at 09:47:31PM +0800, Jing Xia wrote:
>  	if (!(inode->i_state & I_DIRTY_ALL))
>  		inode_cgwb_move_to_attached(inode, wb);
> +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))

Please turn this into

	else if ((inode->i_state & I_DIRTY) &&
	         !(inode->i_state & I_SYNC_QUEUED))

to keep it a little more readable.

Otherwise looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>
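
The requested reordering is purely stylistic: both operands are
side-effect-free bit tests, so swapping them around && does not change the
result. A minimal check of that claim, assuming illustrative flag values
(M_DIRTY and M_SYNC_QUEUED are hypothetical stand-ins for I_DIRTY and
I_SYNC_QUEUED, not the kernel's definitions):

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative bit values; the kernel's actual I_* values differ. */
#define M_DIRTY		0x1u
#define M_SYNC_QUEUED	0x2u

/* Condition in the order used by the posted patch. */
static bool cond_as_posted(unsigned state)
{
	return !(state & M_SYNC_QUEUED) && (state & M_DIRTY);
}

/* Condition in the order suggested in review. */
static bool cond_as_suggested(unsigned state)
{
	return (state & M_DIRTY) && !(state & M_SYNC_QUEUED);
}

/* Compares both forms over every combination of the two bits. */
static bool conds_agree(void)
{
	for (unsigned state = 0; state < 4; state++)
		if (cond_as_posted(state) != cond_as_suggested(state))
			return false;
	return true;
}
```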
Re: [PATCH] writeback: Avoid skipping inode writeback
Posted by jing xia 4 years ago
On Mon, May 9, 2022 at 2:46 PM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Thu, May 05, 2022 at 09:47:31PM +0800, Jing Xia wrote:
> >       if (!(inode->i_state & I_DIRTY_ALL))
> >               inode_cgwb_move_to_attached(inode, wb);
> > +     else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
>
> Please turn this into
>
>         else if ((inode->i_state & I_DIRTY) &&
>                  !(inode->i_state & I_SYNC_QUEUED))
>
> to keep it a little more readable.
>
> Otherwise looks good:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>

Ok. And thanks for the review.
Re: [PATCH] writeback: Avoid skipping inode writeback
Posted by Jan Kara 4 years ago
On Thu 05-05-22 21:47:31, Jing Xia wrote:
> We have run into an issue where a task gets stuck in
> balance_dirty_pages_ratelimited() when performing I/O stress testing.
> The reason we observed is that an I_DIRTY_PAGES inode with lots
> of dirty pages is in the b_dirty_time list and standard background
> writeback cannot write back the inode.
> After studying the relevant code, the following scenario may lead
> to the issue:
> 
> task1                                   task2
> -----                                   -----
> fuse_flush
>  write_inode_now //in b_dirty_time
>   writeback_single_inode
>    __writeback_single_inode
>                                  fuse_write_end
>                                   filemap_dirty_folio
>                                    __xa_set_mark:PAGECACHE_TAG_DIRTY
>     lock inode->i_lock
>     if mapping tagged PAGECACHE_TAG_DIRTY
>     inode->i_state |= I_DIRTY_PAGES
>     unlock inode->i_lock
>                                    __mark_inode_dirty:I_DIRTY_PAGES
>                                       lock inode->i_lock
>                                       -was dirty,inode stays in
>                                       -b_dirty_time
>                                       unlock inode->i_lock
> 
>    if (!(inode->i_state & I_DIRTY_ALL))
>       -not true, so nothing done
>
> This patch moves the dirty inode to the b_dirty list at the end of
> writeback_single_inode() when the inode is not currently queued in
> the b_io or b_more_io list.
> 
> Signed-off-by: Jing Xia <jing.xia@unisoc.com>

Thanks for the report and the fix! The patch looks good so feel free to add:

Reviewed-by: Jan Kara <jack@suse.cz>

Also please add tags:

CC: stable@vger.kernel.org
Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")

Thanks.
								Honza

> diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> index 591fe9cf1659..d7763feaf14a 100644
> --- a/fs/fs-writeback.c
> +++ b/fs/fs-writeback.c
> @@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
>  	 */
>  	if (!(inode->i_state & I_DIRTY_ALL))
>  		inode_cgwb_move_to_attached(inode, wb);
> +	else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
> +		redirty_tail_locked(inode, wb);
> +
>  	spin_unlock(&wb->list_lock);
>  	inode_sync_complete(inode);
>  out:
> -- 
> 2.17.1
> 
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
Re: [PATCH] writeback: Avoid skipping inode writeback
Posted by jing xia 4 years ago
Thanks, I'll update the patch.

On Thu, May 5, 2022 at 11:40 PM Jan Kara <jack@suse.cz> wrote:
>
> On Thu 05-05-22 21:47:31, Jing Xia wrote:
> > We have run into an issue where a task gets stuck in
> > balance_dirty_pages_ratelimited() when performing I/O stress testing.
> > The reason we observed is that an I_DIRTY_PAGES inode with lots
> > of dirty pages is in the b_dirty_time list and standard background
> > writeback cannot write back the inode.
> > After studying the relevant code, the following scenario may lead
> > to the issue:
> >
> > task1                                   task2
> > -----                                   -----
> > fuse_flush
> >  write_inode_now //in b_dirty_time
> >   writeback_single_inode
> >    __writeback_single_inode
> >                                  fuse_write_end
> >                                   filemap_dirty_folio
> >                                    __xa_set_mark:PAGECACHE_TAG_DIRTY
> >     lock inode->i_lock
> >     if mapping tagged PAGECACHE_TAG_DIRTY
> >     inode->i_state |= I_DIRTY_PAGES
> >     unlock inode->i_lock
> >                                    __mark_inode_dirty:I_DIRTY_PAGES
> >                                       lock inode->i_lock
> >                                       -was dirty,inode stays in
> >                                       -b_dirty_time
> >                                       unlock inode->i_lock
> >
> >    if (!(inode->i_state & I_DIRTY_ALL))
> >       -not true, so nothing done
> >
> > This patch moves the dirty inode to the b_dirty list at the end of
> > writeback_single_inode() when the inode is not currently queued in
> > the b_io or b_more_io list.
> >
> > Signed-off-by: Jing Xia <jing.xia@unisoc.com>
>
> Thanks for the report and the fix! The patch looks good so feel free to add:
>
> Reviewed-by: Jan Kara <jack@suse.cz>
>
> Also please add tags:
>
> CC: stable@vger.kernel.org
> Fixes: 0ae45f63d4ef ("vfs: add support for a lazytime mount option")
>
> Thanks.
>                                                                 Honza
>
> > diff --git a/fs/fs-writeback.c b/fs/fs-writeback.c
> > index 591fe9cf1659..d7763feaf14a 100644
> > --- a/fs/fs-writeback.c
> > +++ b/fs/fs-writeback.c
> > @@ -1712,6 +1712,9 @@ static int writeback_single_inode(struct inode *inode,
> >        */
> >       if (!(inode->i_state & I_DIRTY_ALL))
> >               inode_cgwb_move_to_attached(inode, wb);
> > +     else if (!(inode->i_state & I_SYNC_QUEUED) && (inode->i_state & I_DIRTY))
> > +             redirty_tail_locked(inode, wb);
> > +
> >       spin_unlock(&wb->list_lock);
> >       inode_sync_complete(inode);
> >  out:
> > --
> > 2.17.1
> >
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR