afs: Fix dangling folio ref counts in writeback | Patchew

[PATCH] afs: Fix dangling folio ref counts in writeback

Vishal Moola (Oracle) posted 1 patch 2 years, 8 months ago

Download mbox

fs/afs/write.c | 1 +
1 file changed, 1 insertion(+)

Expand all Fold all

[PATCH] afs: Fix dangling folio ref counts in writeback

Posted by Vishal Moola (Oracle) 2 years, 8 months ago

Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. If writeback needs rescheduling, the function exits without
dropping the references to the folios in fbatch. This patch fixes that.

This has only been compile tested.

Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 fs/afs/write.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index c822d6006033..a724228e4d94 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -764,6 +764,7 @@ static int afs_writepages_region(struct address_space *mapping,
 					if (skips >= 5 || need_resched()) {
 						*_next = start;
 						_leave(" = 0 [%llx]", *_next);
+						folio_batch_release(&fbatch);
 						return 0;
 					}
 					skips++;
-- 
2.40.1

Re: [PATCH] afs: Fix dangling folio ref counts in writeback

Posted by David Howells 2 years, 7 months ago

Vishal Moola (Oracle) <vishal.moola@gmail.com> wrote:

>  					if (skips >= 5 || need_resched()) {
>  						*_next = start;
>  						_leave(" = 0 [%llx]", *_next);
> +						folio_batch_release(&fbatch);

This should go before the _leave().

>  						return 0;
>  					}

Looks okay otherwise.

David

[PATCH] afs: Fix waiting for writeback then skipping folio

Posted by Vishal Moola (Oracle) 2 years, 8 months ago

Commit acc8d8588cb7 converted afs_writepages_region() to write back a
folio batch. The function waits for writeback to a folio, but then
proceeds to the rest of the batch without trying to write that folio
again. This patch fixes has it attempt to write the folio again.

This has only been compile tested.

Fixes: acc8d8588cb7 ("afs: convert afs_writepages_region() to use filemap_get_folios_tag()")
Signed-off-by: Vishal Moola (Oracle) <vishal.moola@gmail.com>
---
 fs/afs/write.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/afs/write.c b/fs/afs/write.c
index a724228e4d94..18ccb613dff8 100644
--- a/fs/afs/write.c
+++ b/fs/afs/write.c
@@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
 			 * (changing page->mapping to NULL), or even swizzled
 			 * back from swapper_space to tmpfs file mapping
 			 */
+try_again:
 			if (wbc->sync_mode != WB_SYNC_NONE) {
 				ret = folio_lock_killable(folio);
 				if (ret < 0) {
@@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
 #ifdef CONFIG_AFS_FSCACHE
 					folio_wait_fscache(folio);
 #endif
+					goto try_again;
 				} else {
 					start += folio_size(folio);
 				}
-- 
2.40.1

Re: [PATCH] afs: Fix waiting for writeback then skipping folio

Posted by David Howells 2 years, 7 months ago

Vishal Moola (Oracle) <vishal.moola@gmail.com> wrote:

> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);

The "else" is then redundant.

David

Re: [PATCH] afs: Fix waiting for writeback then skipping folio

Posted by Andrew Morton 2 years, 8 months ago

On Wed,  7 Jun 2023 13:41:20 -0700 "Vishal Moola (Oracle)" <vishal.moola@gmail.com> wrote:

> Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> folio batch. The function waits for writeback to a folio, but then
> proceeds to the rest of the batch without trying to write that folio
> again. This patch fixes has it attempt to write the folio again.
> 
> This has only been compile tested.

This seems fairly serious?

> --- a/fs/afs/write.c
> +++ b/fs/afs/write.c
> @@ -731,6 +731,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  			 * (changing page->mapping to NULL), or even swizzled
>  			 * back from swapper_space to tmpfs file mapping
>  			 */
> +try_again:
>  			if (wbc->sync_mode != WB_SYNC_NONE) {
>  				ret = folio_lock_killable(folio);
>  				if (ret < 0) {
> @@ -757,6 +758,7 @@ static int afs_writepages_region(struct address_space *mapping,
>  #ifdef CONFIG_AFS_FSCACHE
>  					folio_wait_fscache(folio);
>  #endif
> +					goto try_again;
>  				} else {
>  					start += folio_size(folio);
>  				}

From my reading, we'll fail to write out the dirty data.  Presumably
not easily observable, as it will get written out again later on.  But
we're also calling afs_write_back_from_locked_folio() with an unlocked
folio, which might cause mayhem.

So I'm suspecting that a cc:stable is needed.  David, could you please
take a look and perhaps retest?

Thanks.

Re: [PATCH] afs: Fix waiting for writeback then skipping folio

Posted by David Howells 2 years, 7 months ago

Andrew Morton <akpm@linux-foundation.org> wrote:

> > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > folio batch. The function waits for writeback to a folio, but then
> > proceeds to the rest of the batch without trying to write that folio
> > again. This patch fixes has it attempt to write the folio again.
> > 
> > This has only been compile tested.
> 
> This seems fairly serious?

We will try to write the again later, but sync()/fsync() might now have
skipped it.

> From my reading, we'll fail to write out the dirty data.  Presumably
> not easily observable, as it will get written out again later on.

As it's a network filesystem, interactions with third parties could cause
apparent corruption.  Closing a file will flush it - but if there's a
simultaneous op of some other kind, a bit of a flush or a sync may get missed
and the copy visible to another user be temporarily missing that bit.

> But we're also calling afs_write_back_from_locked_folio() with an unlocked
> folio, which might cause mayhem.

Without this patch, you mean?  There's a "continue" statement that should send
us back to the top of the loop before we get as far as
afs_write_back_from_locked_folio() - and then the folio_unlock() there would
go bang.

David

Re: [PATCH] afs: Fix waiting for writeback then skipping folio

Posted by Andrew Morton 2 years, 7 months ago

On Fri, 16 Jun 2023 23:43:02 +0100 David Howells <dhowells@redhat.com> wrote:

> Andrew Morton <akpm@linux-foundation.org> wrote:
> 
> > > Commit acc8d8588cb7 converted afs_writepages_region() to write back a
> > > folio batch. The function waits for writeback to a folio, but then
> > > proceeds to the rest of the batch without trying to write that folio
> > > again. This patch fixes has it attempt to write the folio again.
> > > 
> > > This has only been compile tested.
> > 
> > This seems fairly serious?
> 
> We will try to write the again later, but sync()/fsync() might now have
> skipped it.
> 
> > From my reading, we'll fail to write out the dirty data.  Presumably
> > not easily observable, as it will get written out again later on.
> 
> As it's a network filesystem, interactions with third parties could cause
> apparent corruption.  Closing a file will flush it - but if there's a
> simultaneous op of some other kind, a bit of a flush or a sync may get missed
> and the copy visible to another user be temporarily missing that bit.
> 
> > But we're also calling afs_write_back_from_locked_folio() with an unlocked
> > folio, which might cause mayhem.
> 
> Without this patch, you mean?  There's a "continue" statement that should send
> us back to the top of the loop before we get as far as
> afs_write_back_from_locked_folio() - and then the folio_unlock() there would
> go bang.
> 

Well, what I'm really asking is the thing I ask seven times a day:

- what are the end-user visible effects of the bug

- should be fix be backported into earlier kernels

Re: [PATCH] afs: Fix waiting for writeback then skipping folio

Posted by David Howells 2 years, 7 months ago

Andrew Morton <akpm@linux-foundation.org> wrote:

> Well, what I'm really asking is the thing I ask seven times a day:
> 
> - what are the end-user visible effects of the bug

A third party might see an incomplete flush after they've done a sync - which
amounts to temporary file corruption.

> - should be fix be backported into earlier kernels

Yes.

David