Fixup NLM and kNFSD file lock callbacks

[PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Benjamin Coddington 1 year, 4 months ago

Last year both GFS2 and OCFS2 had some work done to make their locking more
robust when exported over NFS.  Unfortunately, part of that work caused both
NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
lock notifications to clients.

This in itself is not a huge problem because most NFS clients will still
poll the server in order to acquire a conflicted lock, but now that I've
noticed it I can't help but try to fix it because there are big advantages
for setups that might depend on timely lock notifications, and we've
supported that as a feature for a long time.

Its important for NLM and kNFSD that they do not block their kernel threads
inside filesystem's file_lock implementations because that can produce
deadlocks.  We used to make sure of this by only trusting that
posix_lock_file() can correctly handle blocking lock calls asynchronously,
so the lock managers would only setup their file_lock requests for async
callbacks if the filesystem did not define its own lock() file operation.

However, when GFS2 and OCFS2 grew the capability to correctly
handle blocking lock requests asynchronously, they started signalling this
behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
posix_lock_file() was inadvertently dropped, so now most filesystems no
longer produce lock notifications when exported over NFS.

I tried to fix this by simply including the old check for lock(), but the
resulting include mess and layering violations was more than I could accept.
There's a much cleaner way presented here using an fop_flag, which while
potentially flag-greedy, greatly simplifies the problem and grooms the
way for future uses by both filesystems and lock managers alike.

Criticism welcomed,
Ben

Benjamin Coddington (4):
  fs: Introduce FOP_ASYNC_LOCK
  gfs2/ocfs2: set FOP_ASYNC_LOCK
  NLM/NFSD: Fix lock notifications for async-capable filesystems
  exportfs: Remove EXPORT_OP_ASYNC_LOCK

 Documentation/filesystems/nfs/exporting.rst |  7 -------
 fs/gfs2/export.c                            |  1 -
 fs/gfs2/file.c                              |  2 ++
 fs/lockd/svclock.c                          |  5 ++---
 fs/nfsd/nfs4state.c                         | 19 ++++---------------
 fs/ocfs2/export.c                           |  1 -
 fs/ocfs2/file.c                             |  2 ++
 include/linux/exportfs.h                    | 13 -------------
 include/linux/filelock.h                    |  5 +++++
 include/linux/fs.h                          |  2 ++
 10 files changed, 17 insertions(+), 40 deletions(-)

-- 
2.44.0

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Chuck Lever III 1 year, 4 months ago


> On Sep 11, 2024, at 3:42 PM, Benjamin Coddington <bcodding@redhat.com> wrote:
> 
> Last year both GFS2 and OCFS2 had some work done to make their locking more
> robust when exported over NFS.  Unfortunately, part of that work caused both
> NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> lock notifications to clients.
> 
> This in itself is not a huge problem because most NFS clients will still
> poll the server in order to acquire a conflicted lock, but now that I've
> noticed it I can't help but try to fix it because there are big advantages
> for setups that might depend on timely lock notifications, and we've
> supported that as a feature for a long time.
> 
> Its important for NLM and kNFSD that they do not block their kernel threads
> inside filesystem's file_lock implementations because that can produce
> deadlocks.  We used to make sure of this by only trusting that
> posix_lock_file() can correctly handle blocking lock calls asynchronously,
> so the lock managers would only setup their file_lock requests for async
> callbacks if the filesystem did not define its own lock() file operation.
> 
> However, when GFS2 and OCFS2 grew the capability to correctly
> handle blocking lock requests asynchronously, they started signalling this
> behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
> posix_lock_file() was inadvertently dropped, so now most filesystems no
> longer produce lock notifications when exported over NFS.
> 
> I tried to fix this by simply including the old check for lock(), but the
> resulting include mess and layering violations was more than I could accept.
> There's a much cleaner way presented here using an fop_flag, which while
> potentially flag-greedy, greatly simplifies the problem and grooms the
> way for future uses by both filesystems and lock managers alike.
> 
> Criticism welcomed,
> Ben
> 
> Benjamin Coddington (4):
>  fs: Introduce FOP_ASYNC_LOCK
>  gfs2/ocfs2: set FOP_ASYNC_LOCK
>  NLM/NFSD: Fix lock notifications for async-capable filesystems
>  exportfs: Remove EXPORT_OP_ASYNC_LOCK
> 
> Documentation/filesystems/nfs/exporting.rst |  7 -------
> fs/gfs2/export.c                            |  1 -
> fs/gfs2/file.c                              |  2 ++
> fs/lockd/svclock.c                          |  5 ++---
> fs/nfsd/nfs4state.c                         | 19 ++++---------------
> fs/ocfs2/export.c                           |  1 -
> fs/ocfs2/file.c                             |  2 ++
> include/linux/exportfs.h                    | 13 -------------
> include/linux/filelock.h                    |  5 +++++
> include/linux/fs.h                          |  2 ++
> 10 files changed, 17 insertions(+), 40 deletions(-)
> 
> -- 
> 2.44.0
> 

For the NFSD and exportfs hunks:

Acked-by: Chuck Lever <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>>

"lockd: introduce safe async lock op" is in v6.10. Does this
series need to be backported to v6.10.y ? Should the series
have "Fixes: 2dd10de8e6bc ("lockd: introduce safe async lock
 op")" ?


--
Chuck Lever

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Benjamin Coddington 1 year, 4 months ago

On 12 Sep 2024, at 10:01, Chuck Lever III wrote:

> For the NFSD and exportfs hunks:
>
> Acked-by: Chuck Lever <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>>
>
> "lockd: introduce safe async lock op" is in v6.10. Does this
> series need to be backported to v6.10.y ? Should the series
> have "Fixes: 2dd10de8e6bc ("lockd: introduce safe async lock
>  op")" ?

Thanks Chuck! Probably yes, if we want notifications fixed up there.  It
should be sufficient to add this to the signoff area for at least the first
three (and fourth for cleanup):

Cc: <stable@vger.kernel.org> # 6.10.x

No problem for me to send a v2 with these if needed.

Ben

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Chuck Lever III 1 year, 4 months ago

> On Sep 12, 2024, at 11:06 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
> 
> On 12 Sep 2024, at 10:01, Chuck Lever III wrote:
> 
>> For the NFSD and exportfs hunks:
>> 
>> Acked-by: Chuck Lever <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>>
>> 
>> "lockd: introduce safe async lock op" is in v6.10. Does this
>> series need to be backported to v6.10.y ? Should the series
>> have "Fixes: 2dd10de8e6bc ("lockd: introduce safe async lock
>> op")" ?
> 
> Thanks Chuck! Probably yes, if we want notifications fixed up there.  It
> should be sufficient to add this to the signoff area for at least the first
> three (and fourth for cleanup):
> 
> Cc: <stable@vger.kernel.org> # 6.10.x

2dd10de8e6bc landed in v6.7.

I suppose that since v6.10.y is likely to be closed by
the time this series is applied upstream, this tag might
be confusing.

Thus Fixes: 2dd10de8e6bc and a plain Cc: stable should
work best. Then whichever stable kernel is open when your
fixes are merged upstream will automatically get fixed.

None of the current LTS kernels have 2dd10de8e6bc so they
aren't relevant at this point.

--
Chuck Lever

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Benjamin Coddington 1 year, 4 months ago

On 12 Sep 2024, at 14:17, Chuck Lever III wrote:

>> On Sep 12, 2024, at 11:06 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
>>
>> On 12 Sep 2024, at 10:01, Chuck Lever III wrote:
>>
>>> For the NFSD and exportfs hunks:
>>>
>>> Acked-by: Chuck Lever <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>>
>>>
>>> "lockd: introduce safe async lock op" is in v6.10. Does this
>>> series need to be backported to v6.10.y ? Should the series
>>> have "Fixes: 2dd10de8e6bc ("lockd: introduce safe async lock
>>> op")" ?
>>
>> Thanks Chuck! Probably yes, if we want notifications fixed up there.  It
>> should be sufficient to add this to the signoff area for at least the first
>> three (and fourth for cleanup):
>>
>> Cc: <stable@vger.kernel.org> # 6.10.x
>
> 2dd10de8e6bc landed in v6.7.
>
> I suppose that since v6.10.y is likely to be closed by
> the time this series is applied upstream, this tag might
> be confusing.
>
> Thus Fixes: 2dd10de8e6bc and a plain Cc: stable should
> work best. Then whichever stable kernel is open when your
> fixes are merged upstream will automatically get fixed.

So you want "Fixes: 2dd10de8e6bc" on all these patches?  Fixing the problem
requires all of the first three patches together.  My worry is that a
"Fixes" on each implies a complete fix within that patch, so its really not
appropriate.

The stable-kernel-rules.rst documentation says for a series, the Cc: stable
tag should be suffient to request dependencies within the series, so that's
why I suggested it for the version you requested.

What exactly would you like to see?  I am happy to send a 2nd version.

Ben

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Chuck Lever III 1 year, 4 months ago


> On Sep 12, 2024, at 3:11 PM, Benjamin Coddington <bcodding@redhat.com> wrote:
> 
> On 12 Sep 2024, at 14:17, Chuck Lever III wrote:
> 
>>> On Sep 12, 2024, at 11:06 AM, Benjamin Coddington <bcodding@redhat.com> wrote:
>>> 
>>> On 12 Sep 2024, at 10:01, Chuck Lever III wrote:
>>> 
>>>> For the NFSD and exportfs hunks:
>>>> 
>>>> Acked-by: Chuck Lever <chuck.lever@oracle.com <mailto:chuck.lever@oracle.com>>
>>>> 
>>>> "lockd: introduce safe async lock op" is in v6.10. Does this
>>>> series need to be backported to v6.10.y ? Should the series
>>>> have "Fixes: 2dd10de8e6bc ("lockd: introduce safe async lock
>>>> op")" ?
>>> 
>>> Thanks Chuck! Probably yes, if we want notifications fixed up there.  It
>>> should be sufficient to add this to the signoff area for at least the first
>>> three (and fourth for cleanup):
>>> 
>>> Cc: <stable@vger.kernel.org> # 6.10.x
>> 
>> 2dd10de8e6bc landed in v6.7.
>> 
>> I suppose that since v6.10.y is likely to be closed by
>> the time this series is applied upstream, this tag might
>> be confusing.
>> 
>> Thus Fixes: 2dd10de8e6bc and a plain Cc: stable should
>> work best. Then whichever stable kernel is open when your
>> fixes are merged upstream will automatically get fixed.
> 
> So you want "Fixes: 2dd10de8e6bc" on all these patches?  Fixing the problem
> requires all of the first three patches together.

I didn't indicate which patches to add the tags to, sorry.
3/4 sounds like the right place.

If 4/4 is a clean-up only, no new tags apply to that.


> My worry is that a
> "Fixes" on each implies a complete fix within that patch, so its really not
> appropriate.

Fixes seems to mean different things to different people. It's
OK to drop that tag, but I prefer to see a pointer to the broken
commit. That helps downstream consumers of the commit log to
identify which patches they should be pulling in.


> The stable-kernel-rules.rst documentation says for a series, the Cc: stable
> tag should be suffient to request dependencies within the series, so that's
> why I suggested it for the version you requested.
> 
> What exactly would you like to see?  I am happy to send a 2nd version.

You don't need to send again. Christian can add tags in his repo.

My objection is to the "# 6.10.x" comment -- that doesn't make sense
because for sure, the stable tree will have moved on by the time that
v6.13-rc opens.


--
Chuck Lever

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Christian Brauner 1 year, 4 months ago

On Wed, 11 Sep 2024 15:42:56 -0400, Benjamin Coddington wrote:
> Last year both GFS2 and OCFS2 had some work done to make their locking more
> robust when exported over NFS.  Unfortunately, part of that work caused both
> NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> lock notifications to clients.
> 
> This in itself is not a huge problem because most NFS clients will still
> poll the server in order to acquire a conflicted lock, but now that I've
> noticed it I can't help but try to fix it because there are big advantages
> for setups that might depend on timely lock notifications, and we've
> supported that as a feature for a long time.
> 
> [...]

Applied to the vfs.misc.v6.13 branch of the vfs/vfs.git tree.
Patches in the vfs.misc.v6.13 branch should appear in linux-next soon.

Please report any outstanding bugs that were missed during review in a
new review to the original patch series allowing us to drop it.

It's encouraged to provide Acked-bys and Reviewed-bys even though the
patch has now been applied. If possible patch trailers will be updated.

Note that commit hashes shown below are subject to change due to rebase,
trailer updates or similar. If in doubt, please check the listed branch.

tree:   https://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs.git
branch: vfs.misc.v6.13

[1/4] fs: Introduce FOP_ASYNC_LOCK
      https://git.kernel.org/vfs/vfs/c/8cf9a01edc21
[2/4] gfs2/ocfs2: set FOP_ASYNC_LOCK
      https://git.kernel.org/vfs/vfs/c/2253ab99f2e9
[3/4] NLM/NFSD: Fix lock notifications for async-capable filesystems
      https://git.kernel.org/vfs/vfs/c/81be05940ccc
[4/4] exportfs: Remove EXPORT_OP_ASYNC_LOCK
      https://git.kernel.org/vfs/vfs/c/bb06326008c3

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Jeff Layton 1 year, 4 months ago

On Wed, 2024-09-11 at 15:42 -0400, Benjamin Coddington wrote:
> Last year both GFS2 and OCFS2 had some work done to make their locking more
> robust when exported over NFS.  Unfortunately, part of that work caused both
> NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> lock notifications to clients.
> 
> This in itself is not a huge problem because most NFS clients will still
> poll the server in order to acquire a conflicted lock, but now that I've
> noticed it I can't help but try to fix it because there are big advantages
> for setups that might depend on timely lock notifications, and we've
> supported that as a feature for a long time.
> 
> Its important for NLM and kNFSD that they do not block their kernel threads
> inside filesystem's file_lock implementations because that can produce
> deadlocks.  We used to make sure of this by only trusting that
> posix_lock_file() can correctly handle blocking lock calls asynchronously,
> so the lock managers would only setup their file_lock requests for async
> callbacks if the filesystem did not define its own lock() file operation.
> 
> However, when GFS2 and OCFS2 grew the capability to correctly
> handle blocking lock requests asynchronously, they started signalling this
> behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
> posix_lock_file() was inadvertently dropped, so now most filesystems no
> longer produce lock notifications when exported over NFS.
> 
> I tried to fix this by simply including the old check for lock(), but the
> resulting include mess and layering violations was more than I could accept.
> There's a much cleaner way presented here using an fop_flag, which while
> potentially flag-greedy, greatly simplifies the problem and grooms the
> way for future uses by both filesystems and lock managers alike.
> 
> Criticism welcomed,
> Ben
> 
> Benjamin Coddington (4):
>   fs: Introduce FOP_ASYNC_LOCK
>   gfs2/ocfs2: set FOP_ASYNC_LOCK
>   NLM/NFSD: Fix lock notifications for async-capable filesystems
>   exportfs: Remove EXPORT_OP_ASYNC_LOCK
> 
>  Documentation/filesystems/nfs/exporting.rst |  7 -------
>  fs/gfs2/export.c                            |  1 -
>  fs/gfs2/file.c                              |  2 ++
>  fs/lockd/svclock.c                          |  5 ++---
>  fs/nfsd/nfs4state.c                         | 19 ++++---------------
>  fs/ocfs2/export.c                           |  1 -
>  fs/ocfs2/file.c                             |  2 ++
>  include/linux/exportfs.h                    | 13 -------------
>  include/linux/filelock.h                    |  5 +++++
>  include/linux/fs.h                          |  2 ++
>  10 files changed, 17 insertions(+), 40 deletions(-)
> 

Thanks for fixing this up, Ben!

Reviewed-by: Jeff Layton <jlayton@kernel.org>

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Christian Brauner 1 year, 4 months ago

On Thu, Sep 12, 2024 at 07:08:07AM GMT, Jeff Layton wrote:
> On Wed, 2024-09-11 at 15:42 -0400, Benjamin Coddington wrote:
> > Last year both GFS2 and OCFS2 had some work done to make their locking more
> > robust when exported over NFS.  Unfortunately, part of that work caused both
> > NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> > lock notifications to clients.
> > 
> > This in itself is not a huge problem because most NFS clients will still
> > poll the server in order to acquire a conflicted lock, but now that I've
> > noticed it I can't help but try to fix it because there are big advantages
> > for setups that might depend on timely lock notifications, and we've
> > supported that as a feature for a long time.
> > 
> > Its important for NLM and kNFSD that they do not block their kernel threads
> > inside filesystem's file_lock implementations because that can produce
> > deadlocks.  We used to make sure of this by only trusting that
> > posix_lock_file() can correctly handle blocking lock calls asynchronously,
> > so the lock managers would only setup their file_lock requests for async
> > callbacks if the filesystem did not define its own lock() file operation.
> > 
> > However, when GFS2 and OCFS2 grew the capability to correctly
> > handle blocking lock requests asynchronously, they started signalling this
> > behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
> > posix_lock_file() was inadvertently dropped, so now most filesystems no
> > longer produce lock notifications when exported over NFS.
> > 
> > I tried to fix this by simply including the old check for lock(), but the
> > resulting include mess and layering violations was more than I could accept.
> > There's a much cleaner way presented here using an fop_flag, which while
> > potentially flag-greedy, greatly simplifies the problem and grooms the
> > way for future uses by both filesystems and lock managers alike.
> > 
> > Criticism welcomed,
> > Ben
> > 
> > Benjamin Coddington (4):
> >   fs: Introduce FOP_ASYNC_LOCK
> >   gfs2/ocfs2: set FOP_ASYNC_LOCK
> >   NLM/NFSD: Fix lock notifications for async-capable filesystems
> >   exportfs: Remove EXPORT_OP_ASYNC_LOCK
> > 
> >  Documentation/filesystems/nfs/exporting.rst |  7 -------
> >  fs/gfs2/export.c                            |  1 -
> >  fs/gfs2/file.c                              |  2 ++
> >  fs/lockd/svclock.c                          |  5 ++---
> >  fs/nfsd/nfs4state.c                         | 19 ++++---------------
> >  fs/ocfs2/export.c                           |  1 -
> >  fs/ocfs2/file.c                             |  2 ++
> >  include/linux/exportfs.h                    | 13 -------------
> >  include/linux/filelock.h                    |  5 +++++
> >  include/linux/fs.h                          |  2 ++
> >  10 files changed, 17 insertions(+), 40 deletions(-)
> > 
> 
> Thanks for fixing this up, Ben!
> 
> Reviewed-by: Jeff Layton <jlayton@kernel.org>

It might be a bit late for v6.12 so I would stuff this into a branch for
v6.13. Sound ok?

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Jeff Layton 1 year, 4 months ago

On Thu, 2024-09-12 at 13:32 +0200, Christian Brauner wrote:
> On Thu, Sep 12, 2024 at 07:08:07AM GMT, Jeff Layton wrote:
> > On Wed, 2024-09-11 at 15:42 -0400, Benjamin Coddington wrote:
> > > Last year both GFS2 and OCFS2 had some work done to make their locking more
> > > robust when exported over NFS.  Unfortunately, part of that work caused both
> > > NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> > > lock notifications to clients.
> > > 
> > > This in itself is not a huge problem because most NFS clients will still
> > > poll the server in order to acquire a conflicted lock, but now that I've
> > > noticed it I can't help but try to fix it because there are big advantages
> > > for setups that might depend on timely lock notifications, and we've
> > > supported that as a feature for a long time.
> > > 
> > > Its important for NLM and kNFSD that they do not block their kernel threads
> > > inside filesystem's file_lock implementations because that can produce
> > > deadlocks.  We used to make sure of this by only trusting that
> > > posix_lock_file() can correctly handle blocking lock calls asynchronously,
> > > so the lock managers would only setup their file_lock requests for async
> > > callbacks if the filesystem did not define its own lock() file operation.
> > > 
> > > However, when GFS2 and OCFS2 grew the capability to correctly
> > > handle blocking lock requests asynchronously, they started signalling this
> > > behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
> > > posix_lock_file() was inadvertently dropped, so now most filesystems no
> > > longer produce lock notifications when exported over NFS.
> > > 
> > > I tried to fix this by simply including the old check for lock(), but the
> > > resulting include mess and layering violations was more than I could accept.
> > > There's a much cleaner way presented here using an fop_flag, which while
> > > potentially flag-greedy, greatly simplifies the problem and grooms the
> > > way for future uses by both filesystems and lock managers alike.
> > > 
> > > Criticism welcomed,
> > > Ben
> > > 
> > > Benjamin Coddington (4):
> > >   fs: Introduce FOP_ASYNC_LOCK
> > >   gfs2/ocfs2: set FOP_ASYNC_LOCK
> > >   NLM/NFSD: Fix lock notifications for async-capable filesystems
> > >   exportfs: Remove EXPORT_OP_ASYNC_LOCK
> > > 
> > >  Documentation/filesystems/nfs/exporting.rst |  7 -------
> > >  fs/gfs2/export.c                            |  1 -
> > >  fs/gfs2/file.c                              |  2 ++
> > >  fs/lockd/svclock.c                          |  5 ++---
> > >  fs/nfsd/nfs4state.c                         | 19 ++++---------------
> > >  fs/ocfs2/export.c                           |  1 -
> > >  fs/ocfs2/file.c                             |  2 ++
> > >  include/linux/exportfs.h                    | 13 -------------
> > >  include/linux/filelock.h                    |  5 +++++
> > >  include/linux/fs.h                          |  2 ++
> > >  10 files changed, 17 insertions(+), 40 deletions(-)
> > > 
> > 
> > Thanks for fixing this up, Ben!
> > 
> > Reviewed-by: Jeff Layton <jlayton@kernel.org>
> 
> It might be a bit late for v6.12 so I would stuff this into a branch for
> v6.13. Sound ok?

Ok. I figured Chuck would take this set, but I guess it is more VFS-y.

I think this is reasonably safe though, so if Ben needs it before then,
we could pull it in sooner.
-- 
Jeff Layton <jlayton@kernel.org>

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Benjamin Coddington 1 year, 4 months ago

On 12 Sep 2024, at 7:51, Jeff Layton wrote:

> On Thu, 2024-09-12 at 13:32 +0200, Christian Brauner wrote:
>>
>> It might be a bit late for v6.12 so I would stuff this into a branch for
>> v6.13. Sound ok?
>
> Ok. I figured Chuck would take this set, but I guess it is more VFS-y.
>
> I think this is reasonably safe though, so if Ben needs it before then,
> we could pull it in sooner.

Absolutely no rush here, v6.13 is not a problem.

Ben

Re: [PATCH v1 0/4] Fixup NLM and kNFSD file lock callbacks

Posted by Christian Brauner 1 year, 4 months ago

On Wed, Sep 11, 2024 at 03:42:56PM GMT, Benjamin Coddington wrote:
> Last year both GFS2 and OCFS2 had some work done to make their locking more
> robust when exported over NFS.  Unfortunately, part of that work caused both
> NLM (for NFS v3 exports) and kNFSD (for NFSv4.1+ exports) to no longer send
> lock notifications to clients.
> 
> This in itself is not a huge problem because most NFS clients will still
> poll the server in order to acquire a conflicted lock, but now that I've
> noticed it I can't help but try to fix it because there are big advantages
> for setups that might depend on timely lock notifications, and we've
> supported that as a feature for a long time.
> 
> Its important for NLM and kNFSD that they do not block their kernel threads
> inside filesystem's file_lock implementations because that can produce
> deadlocks.  We used to make sure of this by only trusting that
> posix_lock_file() can correctly handle blocking lock calls asynchronously,
> so the lock managers would only setup their file_lock requests for async
> callbacks if the filesystem did not define its own lock() file operation.
> 
> However, when GFS2 and OCFS2 grew the capability to correctly
> handle blocking lock requests asynchronously, they started signalling this
> behavior with EXPORT_OP_ASYNC_LOCK, and the check for also trusting
> posix_lock_file() was inadvertently dropped, so now most filesystems no
> longer produce lock notifications when exported over NFS.
> 
> I tried to fix this by simply including the old check for lock(), but the
> resulting include mess and layering violations was more than I could accept.
> There's a much cleaner way presented here using an fop_flag, which while
> potentially flag-greedy, greatly simplifies the problem and grooms the

It's fine. I've explicitly added the fop_flags so that stuff like this
we would not want to put into f->f_mode can live there.

> way for future uses by both filesystems and lock managers alike.