If the NFS client is doing writeback from a workqueue context, avoid using
__GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
PF_MEMALLOC_NOFS. The combination of these flags makes memory allocation
failures much more likely.
We've seen those allocation failures show up when the loopback driver is
doing writeback from a workqueue to a file on NFS, where memory allocation
failure results in errors or corruption within the loopback device's
filesystem.
Suggested-by: Trond Myklebust <trondmy@kernel.org>
Fixes: 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Laurence Oberman <loberman@redhat.com>
Tested-by: Laurence Oberman <loberman@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
---
On V3: fix ugly return (Thanks Paulo), add Jeff's R-b
On V2: add missing 'Fixes' and Laurence's R-b T-b
fs/nfs/internal.h | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)
diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
index 69c2c10ee658..d8f768254f16 100644
--- a/fs/nfs/internal.h
+++ b/fs/nfs/internal.h
@@ -671,9 +671,12 @@ nfs_write_match_verf(const struct nfs_writeverf *verf,
static inline gfp_t nfs_io_gfp_mask(void)
{
- if (current->flags & PF_WQ_WORKER)
- return GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
- return GFP_KERNEL;
+ gfp_t ret = current_gfp_context(GFP_KERNEL);
+
+ /* For workers __GFP_NORETRY only with __GFP_IO or __GFP_FS */
+ if ((current->flags & PF_WQ_WORKER) && ret == GFP_KERNEL)
+ ret |= __GFP_NORETRY | __GFP_NOWARN;
+ return ret;
}
/*
--
2.47.0
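For readers following the flag logic, here is a minimal userspace sketch of the behaviour the patch aims for. It is not the kernel code: every SIM_* constant and sim_* function is a hypothetical stand-in with made-up bit values, and sim_current_gfp_context() only models the NOIO/NOFS masking of current_gfp_context(), nothing else it may do.

```c
/* Userspace model only: bit values are invented for illustration. */
typedef unsigned int gfp_t;

#define SIM_GFP_IO       0x01u
#define SIM_GFP_FS       0x02u
#define SIM_GFP_RECLAIM  0x04u
#define SIM_GFP_NORETRY  0x08u
#define SIM_GFP_NOWARN   0x10u
#define SIM_GFP_KERNEL   (SIM_GFP_RECLAIM | SIM_GFP_IO | SIM_GFP_FS)

#define SIM_PF_WQ_WORKER      0x1u
#define SIM_PF_MEMALLOC_NOIO  0x2u
#define SIM_PF_MEMALLOC_NOFS  0x4u

/* Model of the scoped-allocation masking: NOIO clears IO and FS, NOFS clears FS. */
static gfp_t sim_current_gfp_context(unsigned int task_flags, gfp_t flags)
{
	if (task_flags & SIM_PF_MEMALLOC_NOIO)
		flags &= ~(SIM_GFP_IO | SIM_GFP_FS);
	else if (task_flags & SIM_PF_MEMALLOC_NOFS)
		flags &= ~SIM_GFP_FS;
	return flags;
}

/* Model of the patched helper, taking the task flags as a parameter. */
static gfp_t sim_nfs_io_gfp_mask(unsigned int task_flags)
{
	gfp_t ret = sim_current_gfp_context(task_flags, SIM_GFP_KERNEL);

	/* Only an unrestricted workqueue worker gets the fail-fast flags. */
	if ((task_flags & SIM_PF_WQ_WORKER) && ret == SIM_GFP_KERNEL)
		ret |= SIM_GFP_NORETRY | SIM_GFP_NOWARN;
	return ret;
}
```

In this model a plain worker gets SIM_GFP_KERNEL plus the NORETRY and NOWARN bits, while a worker that has set NOFS or NOIO gets only the reduced mask without NORETRY, which is the case the loopback writeback path is assumed to hit.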
Gentle ping on this one - let me know if anyone has further review or
critique.
Ben
On 9 Jul 2025, at 21:47, Benjamin Coddington wrote:
> If the NFS client is doing writeback from a workqueue context, avoid using
> __GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
> PF_MEMALLOC_NOFS. The combination of these flags makes memory allocation
> failures much more likely.
>
> We've seen those allocation failures show up when the loopback driver is
> doing writeback from a workqueue to a file on NFS, where memory allocation
> failure results in errors or corruption within the loopback device's
> filesystem.
>
> Suggested-by: Trond Myklebust <trondmy@kernel.org>
> Fixes: 0bae835b63c5 ("NFS: Avoid writeback threads getting stuck in mempool_alloc()")
> Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
> Reviewed-by: Laurence Oberman <loberman@redhat.com>
> Tested-by: Laurence Oberman <loberman@redhat.com>
> Reviewed-by: Jeff Layton <jlayton@kernel.org>
> ---
>
> On V3: fix ugly return (Thanks Paulo), add Jeff's R-b
> On V2: add missing 'Fixes' and Laurence's R-b T-b
>
> fs/nfs/internal.h | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/fs/nfs/internal.h b/fs/nfs/internal.h
> index 69c2c10ee658..d8f768254f16 100644
> --- a/fs/nfs/internal.h
> +++ b/fs/nfs/internal.h
> @@ -671,9 +671,12 @@ nfs_write_match_verf(const struct nfs_writeverf *verf,
>
> static inline gfp_t nfs_io_gfp_mask(void)
> {
> - if (current->flags & PF_WQ_WORKER)
> - return GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
> - return GFP_KERNEL;
> + gfp_t ret = current_gfp_context(GFP_KERNEL);
> +
> + /* For workers __GFP_NORETRY only with __GFP_IO or __GFP_FS */
> + if ((current->flags & PF_WQ_WORKER) && ret == GFP_KERNEL)
> + ret |= __GFP_NORETRY | __GFP_NOWARN;
> + return ret;
> }
>
> /*
> --
> 2.47.0
On Wed, Jul 09, 2025 at 09:47:43PM -0400, Benjamin Coddington wrote:
> If the NFS client is doing writeback from a workqueue context, avoid using
> __GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
> PF_MEMALLOC_NOFS. The combination of these flags makes memory allocation
> failures much more likely.
Can we take a step back and figure out why this blanket usage of
__GFP_NORETRY exists at all?
On 10 Jul 2025, at 3:21, Christoph Hellwig wrote:
> On Wed, Jul 09, 2025 at 09:47:43PM -0400, Benjamin Coddington wrote:
>> If the NFS client is doing writeback from a workqueue context, avoid using
>> __GFP_NORETRY for allocations if the task has set PF_MEMALLOC_NOIO or
>> PF_MEMALLOC_NOFS. The combination of these flags makes memory allocation
>> failures much more likely.
>
> Can we take a step back and figure out why this blanket usage of
> __GFP_NORETRY exists at all?
It was added in 515dcdcd48736, which has a decent explanation that boils
down to: it's usually OK for nfsiod to have an allocation failure; we want
it to fail quickly and not get hung up waiting for an allocation.
Ben