When the server is recovering from a reboot and is in a grace period,
any operation that may result in deletion or reallocation of block
extents should not be allowed. See RFC 8881, section 18.43.3.

If multiple clients write data to the same file, rebooting the server
during writing may result in file corruption. In the worst case, the
exported XFS may also become corrupted. Observed this behavior while
testing pNFS block volume setup.

Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
---
fs/nfsd/nfs4proc.c | 5 +++++
1 file changed, 5 insertions(+)
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index bfebe6e25638a..3000b43be9221 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -2435,6 +2435,7 @@ static __be32
 nfsd4_layoutget(struct svc_rqst *rqstp,
 		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
 {
+	struct net *net = SVC_NET(rqstp);
 	struct nfsd4_layoutget *lgp = &u->layoutget;
 	struct svc_fh *current_fh = &cstate->current_fh;
 	const struct nfsd4_layout_ops *ops;
@@ -2486,6 +2487,10 @@ nfsd4_layoutget(struct svc_rqst *rqstp,
 	if (lgp->lg_seg.length == 0)
 		goto out;
 
+	nfserr = nfserr_grace;
+	if (locks_in_grace(net))
+		goto out;
+
 	nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lgp->lg_sid,
 						true, lgp->lg_layout_type, &ls);
 	if (nfserr) {
--
2.43.0

From: Chuck Lever <chuck.lever@oracle.com>

On Mon, 25 Aug 2025 16:11:02 +0300, Sergey Bashirov wrote:
> When the server is recovering from a reboot and is in a grace period,
> any operation that may result in deletion or reallocation of block
> extents should not be allowed. See RFC 8881, section 18.43.3.
>
> If multiple clients write data to the same file, rebooting the server
> during writing may result in file corruption. In the worst case, the
> exported XFS may also become corrupted. Observed this behavior while
> testing pNFS block volume setup.
>
> [...]

Applied to nfsd-testing, thanks!

[1/1] NFSD: Disallow layoutget during grace period
      commit: c4df20612a34b4713e81e0b3612a84481f6ae82e

--
Chuck Lever

On Mon, 2025-08-25 at 16:11 +0300, Sergey Bashirov wrote:
> When the server is recovering from a reboot and is in a grace period,
> any operation that may result in deletion or reallocation of block
> extents should not be allowed. See RFC 8881, section 18.43.3.
>
> If multiple clients write data to the same file, rebooting the server
> during writing may result in file corruption. In the worst case, the
> exported XFS may also become corrupted. Observed this behavior while
> testing pNFS block volume setup.
>
> Co-developed-by: Konstantin Evtushenko <koevtushenko@yandex.com>
> Signed-off-by: Konstantin Evtushenko <koevtushenko@yandex.com>
> Signed-off-by: Sergey Bashirov <sergeybashirov@gmail.com>
> ---
>  fs/nfsd/nfs4proc.c | 5 +++++
>  1 file changed, 5 insertions(+)
>
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index bfebe6e25638a..3000b43be9221 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -2435,6 +2435,7 @@ static __be32
>  nfsd4_layoutget(struct svc_rqst *rqstp,
>  		struct nfsd4_compound_state *cstate, union nfsd4_op_u *u)
>  {
> +	struct net *net = SVC_NET(rqstp);
>  	struct nfsd4_layoutget *lgp = &u->layoutget;
>  	struct svc_fh *current_fh = &cstate->current_fh;
>  	const struct nfsd4_layout_ops *ops;
> @@ -2486,6 +2487,10 @@ nfsd4_layoutget(struct svc_rqst *rqstp,
>  	if (lgp->lg_seg.length == 0)
>  		goto out;
>
> +	nfserr = nfserr_grace;
> +	if (locks_in_grace(net))
> +		goto out;
> +
>  	nfserr = nfsd4_preprocess_layout_stateid(rqstp, cstate, &lgp->lg_sid,
>  						true, lgp->lg_layout_type, &ls);
>  	if (nfserr) {

This seems like a reasonable thing to do, but I wonder if it makes
sense across all different pNFS layout types? This restriction is
definitely not needed for the (trivial) in-kernel flexfiles server, for
instance.

Maybe it'd be best to push this down into the individual layout drivers
and let them make the decision?

--
Jeff Layton <jlayton@kernel.org>

Hi Jeff,

On Mon, Aug 25, 2025 at 12:33:46PM -0400, Jeff Layton wrote:
> This seems like a reasonable thing to do, but I wonder if it makes
> sense across all different pNFS layout types? This restriction is
> definitely not needed for the (trivial) in-kernel flexfiles server, for
> instance.
>
> Maybe it'd be best to push this down into the individual layout drivers
> and let them make the decision?

Good point. The spec says: "If the metadata server is in a grace period,
and does not persist layouts and device ID to device address mappings,
then it MUST return NFS4ERR_GRACE". As far as I understand, this is a
requirement for a specific implementation option. So moving this logic
to the layout driver level seems reasonable to me. Will submit new patch.

--
Sergey Bashirov
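
For illustration only, the per-driver approach discussed in this thread
might be modelled along the lines of the userspace sketch below. It is
not the follow-up patch and not kernel code. Every identifier in it
(demo_net, demo_layout_ops, disallow_in_grace, demo_layoutget, the
DEMO_* status codes) is hypothetical, and the boolean flag in the ops
table is only one possible design; a per-driver callback would serve
equally well. The sketch assumes that the block layout needs the
restriction (per the commit message) and that flexfiles does not (per
Jeff's comment).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical status codes standing in for nfs_ok and nfserr_grace. */
enum demo_status { DEMO_OK = 0, DEMO_ERR_GRACE = 1 };

/* Hypothetical stand-in for the per-namespace grace-period state. */
struct demo_net {
	bool in_grace;
};

/* Hypothetical per-layout-type operations table. */
struct demo_layout_ops {
	const char *name;
	/* True if LAYOUTGET must be refused while in the grace period. */
	bool disallow_in_grace;
};

/* Assumption: block layouts can delete or reallocate extents. */
static const struct demo_layout_ops demo_block_ops = {
	.name = "block",
	.disallow_in_grace = true,
};

/* Assumption: the trivial flexfiles server does not need the check. */
static const struct demo_layout_ops demo_flexfiles_ops = {
	.name = "flexfiles",
	.disallow_in_grace = false,
};

/* Models the role locks_in_grace(net) plays in the patch. */
static bool demo_in_grace(const struct demo_net *net)
{
	return net->in_grace;
}

/* The grace check is applied only when the layout driver asks for it. */
static enum demo_status demo_layoutget(const struct demo_net *net,
				       const struct demo_layout_ops *ops)
{
	assert(net != NULL && ops != NULL);

	if (ops->disallow_in_grace && demo_in_grace(net))
		return DEMO_ERR_GRACE;

	/* A real server would validate the stateid and build the layout here. */
	return DEMO_OK;
}
```

With a demo_net whose in_grace is true, demo_layoutget() returns
DEMO_ERR_GRACE for demo_block_ops and DEMO_OK for demo_flexfiles_ops;
once in_grace is false, both layout types succeed.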