[PATCH RFC 0/2] nfs: free leftover lsegs before freeing a layout in pnfs_put_layout_hdr

Jeff Layton posted 2 patches 9 months, 2 weeks ago
fs/nfs/pnfs.c | 23 +++++++++++++++++++++--
1 file changed, 21 insertions(+), 2 deletions(-)
Posted by Jeff Layton 9 months, 2 weeks ago
Sending this as an RFC as I don't have a reliable reproducer for the
problem that Omar reported. I'm also not sure this is the best fix for
the problem. There is probably a case to be made that the real bug is in
the error handling for pnfs_layoutreturn_before_put_layout_hdr().

My guess is that the issue is that we end up with entries on the
plh_return_segs list just before the network goes down. That causes the
LAYOUTRETURN to fail with something that looks retryable, and the lsegs
on the list aren't freed.

It's possible that we just need to catch ENETUNREACH in the LAYOUTRETURN
error handling, but I'm not sure I correctly understand the problem. If
entries are racing onto the list just before the refcount decrement,
then that wouldn't fix it.

The first patch should fix the issue of the leaked lsegs, and the second
should let us know if it ever crops up again.
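
As a rough userspace illustration of the idea behind the two patches
(this is not the code in fs/nfs/pnfs.c; struct seg, struct layout_hdr and
every helper below are hypothetical stand-ins), the final put warns if
either list is still populated and then frees whatever is left before
freeing the header. The ordering of the warning relative to the cleanup
is an assumption made for the sketch.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for a layout segment (not the kernel struct). */
struct seg {
	struct seg *next;
};

/* Hypothetical stand-in for a layout header with its two segment lists. */
struct layout_hdr {
	int refcount;
	struct seg *segs;        /* models plh_segs */
	struct seg *return_segs; /* models plh_return_segs */
};

/* Allocate a header holding a single reference and two empty lists. */
static struct layout_hdr *alloc_layout(void)
{
	struct layout_hdr *lo = calloc(1, sizeof(*lo));

	if (lo)
		lo->refcount = 1;
	return lo;
}

/* Push a new segment onto a list; returns 0 on success, -1 on failure. */
static int add_seg(struct seg **head)
{
	struct seg *s = malloc(sizeof(*s));

	if (!s)
		return -1;
	s->next = *head;
	*head = s;
	return 0;
}

/* Free every segment still on a list; returns how many were freed. */
static int drain_list(struct seg **head)
{
	int n = 0;

	while (*head) {
		struct seg *s = *head;

		*head = s->next;
		free(s);
		n++;
	}
	return n;
}

/*
 * Drop one reference. Returns -1 while other references remain.
 * On the final put, warn if anything is left on either list, free
 * the leftovers, free the header, and return the leftover count.
 */
static int put_layout_hdr(struct layout_hdr *lo)
{
	int leftover;

	if (--lo->refcount > 0)
		return -1;

	if (lo->segs || lo->return_segs)
		fprintf(stderr, "warning: freeing layout with leftover segments\n");

	leftover = drain_list(&lo->segs) + drain_list(&lo->return_segs);
	free(lo);
	return leftover;
}
```

In this model, a header that reaches its last put with entries still
queued for return no longer leaks them, and the warning makes the
situation visible. It does not model the locking or the LAYOUTRETURN
error path discussed above.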

Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
Jeff Layton (2):
      nfs: free leftover lsegs before freeing a layout in pnfs_put_layout_hdr
      nfs: pr_warn if plh_segs or plh_return_segs are non-empty when freeing

 fs/nfs/pnfs.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)
---
base-commit: 5bc1018675ec28a8a60d83b378d8c3991faa5a27
change-id: 20250428-nfs-6-16-87062aa2989d

Best regards,
-- 
Jeff Layton <jlayton@kernel.org>
Re: [PATCH RFC 0/2] nfs: free leftover lsegs before freeing a layout in pnfs_put_layout_hdr
Posted by Jeff Layton 9 months, 1 week ago
On Mon, 2025-04-28 at 13:24 -0700, Jeff Layton wrote:
> Sending this as an RFC as I don't have a reliable reproducer for the
> problem that Omar reported. I'm also not sure this is the best fix for
> the problem. There is probably a case to be made that the real bug is in
> the error handling for pnfs_layoutreturn_before_put_layout_hdr().
> 
> My guess is that the issue is that we end up with entries on the
> plh_return_segs list just before the network goes down. That causes the
> LAYOUTRETURN to fail with something that looks retryable, and the lsegs
> on the list aren't freed.
> 
> It's possible that we just need to catch ENETUNREACH in the LAYOUTRETURN
> error handling, but I'm not sure I correctly understand the problem. If
> entries are racing onto the list just before the refcount decrement,
> then that wouldn't fix it.
> 
> The first patch should fix the issue of the leaked lsegs, and the second
> should let us know if it ever crops up again.
> 
> Signed-off-by: Jeff Layton <jlayton@kernel.org>
> ---
> Jeff Layton (2):
>       nfs: free leftover lsegs before freeing a layout in pnfs_put_layout_hdr
>       nfs: pr_warn if plh_segs or plh_return_segs are non-empty when freeing
> 
>  fs/nfs/pnfs.c | 23 +++++++++++++++++++++--
>  1 file changed, 21 insertions(+), 2 deletions(-)
> ---
> base-commit: 5bc1018675ec28a8a60d83b378d8c3991faa5a27
> change-id: 20250428-nfs-6-16-87062aa2989d
> 
> Best regards,

Trond, Anna, ping? This seems like the right thing to do, but I'd
appreciate a second (and third) set of eyes on this.

Thanks,
-- 
Jeff Layton <jlayton@kernel.org>