When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the
notify_change() logic takes that to mean that the request should set
those values explicitly, and not override them with "now".
With the advent of delegated timestamps, similar functionality is needed
for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that
the ctime should be accepted as-is. Also, clean up the if statements to
eliminate the extra negatives.
Signed-off-by: Jeff Layton <jlayton@kernel.org>
---
fs/attr.c | 15 +++++++++------
include/linux/fs.h | 1 +
2 files changed, 10 insertions(+), 6 deletions(-)
diff --git a/fs/attr.c b/fs/attr.c
index 9caf63d20d03e86c535e9c8c91d49c2a34d34b7a..f0dabd2985989d283a931536a5fc53eda366b373 100644
--- a/fs/attr.c
+++ b/fs/attr.c
@@ -463,15 +463,18 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry,
now = current_time(inode);
- attr->ia_ctime = now;
- if (!(ia_valid & ATTR_ATIME_SET))
- attr->ia_atime = now;
- else
+ if (ia_valid & ATTR_ATIME_SET)
attr->ia_atime = timestamp_truncate(attr->ia_atime, inode);
- if (!(ia_valid & ATTR_MTIME_SET))
- attr->ia_mtime = now;
else
+ attr->ia_atime = now;
+ if (ia_valid & ATTR_CTIME_SET)
+ attr->ia_ctime = timestamp_truncate(attr->ia_ctime, inode);
+ else
+ attr->ia_ctime = now;
+ if (ia_valid & ATTR_MTIME_SET)
attr->ia_mtime = timestamp_truncate(attr->ia_mtime, inode);
+ else
+ attr->ia_mtime = now;
if (ia_valid & ATTR_KILL_PRIV) {
error = security_inode_need_killpriv(dentry);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index 040c0036320fdf87a2379d494ab408a7991875bd..f18f45e88545c39716b917b1378fb7248367b41d 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -237,6 +237,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset,
#define ATTR_ATIME_SET (1 << 7)
#define ATTR_MTIME_SET (1 << 8)
#define ATTR_FORCE (1 << 9) /* Not a change, but a change it */
+#define ATTR_CTIME_SET (1 << 10)
#define ATTR_KILL_SUID (1 << 11)
#define ATTR_KILL_SGID (1 << 12)
#define ATTR_FILE (1 << 13)
--
2.50.1
On Mon, 28 Jul 2025, Jeff Layton wrote: > When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the > notify_change() logic takes that to mean that the request should set > those values explicitly, and not override them with "now". > > With the advent of delegated timestamps, similar functionality is needed > for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that > the ctime should be accepted as-is. Also, clean up the if statements to > eliminate the extra negatives. I don't feel entirely comfortable with this. ctime is a fallback for "has anything changed" - mtime can be changed but ctime is always reliable, controlled by VFS and FS. Until now. I know you aren't exposing this to user-space, but then not doing so blocks user-space file servers from using this functionality. I see that you also move vetting of the value out of vfs code and into nfsd code. I don't really understand why you did that. Maybe nfsd has more information about previous timestamps than the vfs has? Anyway I would much prefer that ATTR_CTIME_SET could only change the ctime value to something between the old ctime value and the current time (inclusive). Certainly nfsd might impose extra restrictions, but I think that basic restriction should by in the VFS close to what ATTR_CTIME_SET is honoured. What way if someone else finds another use for it some day they will have to work within the same restriction (or change it explicitly and try to justify that change). Lustre has the equivalent of ATTR_CTIME_SET (MFS_ATTR_CTIME_SET and LA_CTIME) and would want to use it if the server-side code ever landed upstream. It appears to just assume the client sent a valid timestamp. I would rather it were vetted by the VFS. Thanks, NeilBrown > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > --- > fs/attr.c | 15 +++++++++------ > include/linux/fs.h | 1 + > 2 files changed, 10 insertions(+), 6 deletions(-) > > diff --git a/fs/attr.c b/fs/attr.c > index 9caf63d20d03e86c535e9c8c91d49c2a34d34b7a..f0dabd2985989d283a931536a5fc53eda366b373 100644 > --- a/fs/attr.c > +++ b/fs/attr.c > @@ -463,15 +463,18 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry, > > now = current_time(inode); > > - attr->ia_ctime = now; > - if (!(ia_valid & ATTR_ATIME_SET)) > - attr->ia_atime = now; > - else > + if (ia_valid & ATTR_ATIME_SET) > attr->ia_atime = timestamp_truncate(attr->ia_atime, inode); > - if (!(ia_valid & ATTR_MTIME_SET)) > - attr->ia_mtime = now; > else > + attr->ia_atime = now; > + if (ia_valid & ATTR_CTIME_SET) > + attr->ia_ctime = timestamp_truncate(attr->ia_ctime, inode); > + else > + attr->ia_ctime = now; > + if (ia_valid & ATTR_MTIME_SET) > attr->ia_mtime = timestamp_truncate(attr->ia_mtime, inode); > + else > + attr->ia_mtime = now; > > if (ia_valid & ATTR_KILL_PRIV) { > error = security_inode_need_killpriv(dentry); > diff --git a/include/linux/fs.h b/include/linux/fs.h > index 040c0036320fdf87a2379d494ab408a7991875bd..f18f45e88545c39716b917b1378fb7248367b41d 100644 > --- a/include/linux/fs.h > +++ b/include/linux/fs.h > @@ -237,6 +237,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, > #define ATTR_ATIME_SET (1 << 7) > #define ATTR_MTIME_SET (1 << 8) > #define ATTR_FORCE (1 << 9) /* Not a change, but a change it */ > +#define ATTR_CTIME_SET (1 << 10) > #define ATTR_KILL_SUID (1 << 11) > #define ATTR_KILL_SGID (1 << 12) > #define ATTR_FILE (1 << 13) > > -- > 2.50.1 > >
On Mon, 2025-07-28 at 10:04 +1000, NeilBrown wrote: > On Mon, 28 Jul 2025, Jeff Layton wrote: > > When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the > > notify_change() logic takes that to mean that the request should set > > those values explicitly, and not override them with "now". > > > > With the advent of delegated timestamps, similar functionality is needed > > for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that > > the ctime should be accepted as-is. Also, clean up the if statements to > > eliminate the extra negatives. > > I don't feel entirely comfortable with this. ctime is a fallback for > "has anything changed" - mtime can be changed but ctime is always > reliable, controlled by VFS and FS. > > Until now. > I know. I have many of the same reservations, but the specification is pretty clear (now that I understand it better). I don't see a better way to do this. > I know you aren't exposing this to user-space, but then not doing so > blocks user-space file servers from using this functionality. > > I see that you also move vetting of the value out of vfs code and into > nfsd code. I don't really understand why you did that. Maybe nfsd has > more information about previous timestamps than the vfs has? > Yes. We need to track the timestamps of the inode at the time that the delegation was handed out. nfsd is (arguably) in a better position to do this than the VFS is. Patch #5 adds this functionality. > Anyway I would much prefer that ATTR_CTIME_SET could only change the > ctime value to something between the old ctime value and the current > time (inclusive). > That will be a problem. What you're suggesting is the current status quo with the delegated attrs code, and that behavior was the source of the problems that we were seeing in the git regression testsuite. When git checks out an object, it opens a file, writes to it and then stats it so that it can later see whether it changed. If it gets a WRITE_ATTRS_DELEG delegation, the client doesn't wait on writeback before returning from that stat(). Then later, we go to do writeback. The mtime and ctime on the server get set to the server's local time (which is later than the time that git has recorded). Finally, the client does the SETATTR+DELEGRETURN and tries to set the timestamps to the same times that git has recorded, but those times are too early vs. the current timestamps on the file and they get ignored (in accordance with the spec). This was the source of my confusion with the spec. When it says "original time", it means the timestamps at the time that the delegation was created, but I interpreted it the same way you did. Unfortunately, if we want to do this, then we have to allow nfsd to set the ctime to a time earlier than the current ctime on the inode. I too have some reservations with this. This means that applications on the server may see the ctime go backward, which I really do not like. In practice though, if there is an outstanding delegation then those applications can't do anything other than stat() the file without causing it to be recalled. They can't have the file open at the time, and can't do any directory operations that involve it. Given that, I think the ctime rollbacks are "mostly harmless". Moving these checks into the VFS would be pretty ugly, unless we want to tightly integrate the setattr and lease handling code. nfsd is just in a much better position to track and vet this info than the VFS. > Certainly nfsd might impose extra restrictions, but I think that basic > restriction should by in the VFS close to what ATTR_CTIME_SET is > honoured. What way if someone else finds another use for it some day > they will have to work within the same restriction (or change it > explicitly and try to justify that change). > > Lustre has the equivalent of ATTR_CTIME_SET (MFS_ATTR_CTIME_SET and > LA_CTIME) and would want to use it if the server-side code ever landed > upstream. It appears to just assume the client sent a valid timestamp. > I would rather it were vetted by the VFS. > Interesting. I don't think they have any immediate plans to upstream the server (the priority is the client), but having this functionality in the VFS would make it easier to integrate. > > > > Signed-off-by: Jeff Layton <jlayton@kernel.org> > > --- > > fs/attr.c | 15 +++++++++------ > > include/linux/fs.h | 1 + > > 2 files changed, 10 insertions(+), 6 deletions(-) > > > > diff --git a/fs/attr.c b/fs/attr.c > > index 9caf63d20d03e86c535e9c8c91d49c2a34d34b7a..f0dabd2985989d283a931536a5fc53eda366b373 100644 > > --- a/fs/attr.c > > +++ b/fs/attr.c > > @@ -463,15 +463,18 @@ int notify_change(struct mnt_idmap *idmap, struct dentry *dentry, > > > > now = current_time(inode); > > > > - attr->ia_ctime = now; > > - if (!(ia_valid & ATTR_ATIME_SET)) > > - attr->ia_atime = now; > > - else > > + if (ia_valid & ATTR_ATIME_SET) > > attr->ia_atime = timestamp_truncate(attr->ia_atime, inode); > > - if (!(ia_valid & ATTR_MTIME_SET)) > > - attr->ia_mtime = now; > > else > > + attr->ia_atime = now; > > + if (ia_valid & ATTR_CTIME_SET) > > + attr->ia_ctime = timestamp_truncate(attr->ia_ctime, inode); > > + else > > + attr->ia_ctime = now; > > + if (ia_valid & ATTR_MTIME_SET) > > attr->ia_mtime = timestamp_truncate(attr->ia_mtime, inode); > > + else > > + attr->ia_mtime = now; > > > > if (ia_valid & ATTR_KILL_PRIV) { > > error = security_inode_need_killpriv(dentry); > > diff --git a/include/linux/fs.h b/include/linux/fs.h > > index 040c0036320fdf87a2379d494ab408a7991875bd..f18f45e88545c39716b917b1378fb7248367b41d 100644 > > --- a/include/linux/fs.h > > +++ b/include/linux/fs.h > > @@ -237,6 +237,7 @@ typedef int (dio_iodone_t)(struct kiocb *iocb, loff_t offset, > > #define ATTR_ATIME_SET (1 << 7) > > #define ATTR_MTIME_SET (1 << 8) > > #define ATTR_FORCE (1 << 9) /* Not a change, but a change it */ > > +#define ATTR_CTIME_SET (1 << 10) > > #define ATTR_KILL_SUID (1 << 11) > > #define ATTR_KILL_SGID (1 << 12) > > #define ATTR_FILE (1 << 13) > > > > -- > > 2.50.1 > > > > > -- Jeff Layton <jlayton@kernel.org>
On Mon, 28 Jul 2025, Jeff Layton wrote: > On Mon, 2025-07-28 at 10:04 +1000, NeilBrown wrote: > > On Mon, 28 Jul 2025, Jeff Layton wrote: > > > When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the > > > notify_change() logic takes that to mean that the request should set > > > those values explicitly, and not override them with "now". > > > > > > With the advent of delegated timestamps, similar functionality is needed > > > for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that > > > the ctime should be accepted as-is. Also, clean up the if statements to > > > eliminate the extra negatives. > > > > I don't feel entirely comfortable with this. ctime is a fallback for > > "has anything changed" - mtime can be changed but ctime is always > > reliable, controlled by VFS and FS. > > > > Until now. > > > > I know. I have many of the same reservations, but the specification is > pretty clear (now that I understand it better). I don't see a better > way to do this. > > > I know you aren't exposing this to user-space, but then not doing so > > blocks user-space file servers from using this functionality. > > > > I see that you also move vetting of the value out of vfs code and into > > nfsd code. I don't really understand why you did that. Maybe nfsd has > > more information about previous timestamps than the vfs has? > > > > Yes. We need to track the timestamps of the inode at the time that the > delegation was handed out. nfsd is (arguably) in a better position to > do this than the VFS is. Patch #5 adds this functionality. > > > Anyway I would much prefer that ATTR_CTIME_SET could only change the > > ctime value to something between the old ctime value and the current > > time (inclusive). > > > > That will be a problem. What you're suggesting is the current status > quo with the delegated attrs code, and that behavior was the source of > the problems that we were seeing in the git regression testsuite. > > > When git checks out an object, it opens a file, writes to it and then > stats it so that it can later see whether it changed. If it gets a > WRITE_ATTRS_DELEG delegation, the client doesn't wait on writeback > before returning from that stat(). > > Then later, we go to do writeback. The mtime and ctime on the server > get set to the server's local time (which is later than the time that > git has recorded). Finally, the client does the SETATTR+DELEGRETURN and > tries to set the timestamps to the same times that git has recorded, > but those times are too early vs. the current timestamps on the file > and they get ignored (in accordance with the spec). > > This was the source of my confusion with the spec. When it says > "original time", it means the timestamps at the time that the > delegation was created, but I interpreted it the same way you did. > > Unfortunately, if we want to do this, then we have to allow nfsd to set > the ctime to a time earlier than the current ctime on the inode. I too > have some reservations with this. This means that applications on the > server may see the ctime go backward, which I really do not like. An alternate approach would be to allow the writeback through a delegation to skip the mtime/ctime update, possibly making use of FMODE_NOCMTIME. It would be nice to have some mechanism in the VFS to ensure there was an ATTR_CTIME_SET request on any file which had made use of FMODE_NOCMTIME before that file was closed, else the times would be set to the time of the close. That wouldn't be entirely straight forward, but should be manageable. (I would allow some way to avoid the ctime update on close so XFS_IOC_OPENBY_HANDLE could still be supported, but it would need to be explicit somewhere). While FMODE_NOCMTIME also distorts the meaning of ctime, I think it is better than making it too easy for ctime to go backwards. How would you feel about that approach? > > Interesting. I don't think they have any immediate plans to upstream > the server (the priority is the client), but having this functionality > in the VFS would make it easier to integrate. I think that splitting the client from the server is a non-trivial task that brings no benefit to anyone. Any chance I get I advocate upstreaming both at once. Thanks, NeilBrown
On Mon, 2025-07-28 at 11:51 +1000, NeilBrown wrote: > On Mon, 28 Jul 2025, Jeff Layton wrote: > > On Mon, 2025-07-28 at 10:04 +1000, NeilBrown wrote: > > > On Mon, 28 Jul 2025, Jeff Layton wrote: > > > > When ATTR_ATIME_SET and ATTR_MTIME_SET are set in the ia_valid mask, the > > > > notify_change() logic takes that to mean that the request should set > > > > those values explicitly, and not override them with "now". > > > > > > > > With the advent of delegated timestamps, similar functionality is needed > > > > for the ctime. Add a ATTR_CTIME_SET flag, and use that to indicate that > > > > the ctime should be accepted as-is. Also, clean up the if statements to > > > > eliminate the extra negatives. > > > > > > I don't feel entirely comfortable with this. ctime is a fallback for > > > "has anything changed" - mtime can be changed but ctime is always > > > reliable, controlled by VFS and FS. > > > > > > Until now. > > > > > > > I know. I have many of the same reservations, but the specification is > > pretty clear (now that I understand it better). I don't see a better > > way to do this. > > > > > I know you aren't exposing this to user-space, but then not doing so > > > blocks user-space file servers from using this functionality. > > > > > > I see that you also move vetting of the value out of vfs code and into > > > nfsd code. I don't really understand why you did that. Maybe nfsd has > > > more information about previous timestamps than the vfs has? > > > > > > > Yes. We need to track the timestamps of the inode at the time that the > > delegation was handed out. nfsd is (arguably) in a better position to > > do this than the VFS is. Patch #5 adds this functionality. > > > > > Anyway I would much prefer that ATTR_CTIME_SET could only change the > > > ctime value to something between the old ctime value and the current > > > time (inclusive). > > > > > > > That will be a problem. What you're suggesting is the current status > > quo with the delegated attrs code, and that behavior was the source of > > the problems that we were seeing in the git regression testsuite. > > > > > > When git checks out an object, it opens a file, writes to it and then > > stats it so that it can later see whether it changed. If it gets a > > WRITE_ATTRS_DELEG delegation, the client doesn't wait on writeback > > before returning from that stat(). > > > > Then later, we go to do writeback. The mtime and ctime on the server > > get set to the server's local time (which is later than the time that > > git has recorded). Finally, the client does the SETATTR+DELEGRETURN and > > tries to set the timestamps to the same times that git has recorded, > > but those times are too early vs. the current timestamps on the file > > and they get ignored (in accordance with the spec). > > > > This was the source of my confusion with the spec. When it says > > "original time", it means the timestamps at the time that the > > delegation was created, but I interpreted it the same way you did. > > > > Unfortunately, if we want to do this, then we have to allow nfsd to set > > the ctime to a time earlier than the current ctime on the inode. I too > > have some reservations with this. This means that applications on the > > server may see the ctime go backward, which I really do not like. > > An alternate approach would be to allow the writeback through a > delegation to skip the mtime/ctime update, possibly making use of > FMODE_NOCMTIME. > > It would be nice to have some mechanism in the VFS to ensure there was > an ATTR_CTIME_SET request on any file which had made use of > FMODE_NOCMTIME before that file was closed, else the times would be set > to the time of the close. That wouldn't be entirely straight forward, > but should be manageable. (I would allow some way to avoid the ctime > update on close so XFS_IOC_OPENBY_HANDLE could still be supported, but > it would need to be explicit somewhere). > > While FMODE_NOCMTIME also distorts the meaning of ctime, I think it is > better than making it too easy for ctime to go backwards. > > How would you feel about that approach? > I do like the idea of "freezing" timestamp updates until the delegation is returned. Doing a bit of git-archaeology shows that the NOCMTIME flag was added here: commit 4d4be482a4d78ca906f45e99fd9fdb91e907f5ad Author: Christoph Hellwig <hch@infradead.org> Date: Tue Dec 9 04:47:33 2008 -0500 [XFS] add a FMODE flag to make XFS invisible I/O less hacky XFS has a mode called invisble I/O that doesn't update any of the timestamps. It's used for HSM-style applications and exposed through the nasty open by handle ioctl. Instead of doing directly assignment of file operations that set an internal flag for it add a new FMODE_NOCMTIME flag that we can check in the normal file operations. (addition of the generic VFS flag has been ACKed by Al as an interims solution) Signed-off-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com> Delegated timestamps seems like a similar enough use-case that we can probably make this work. The main catch here is that there is no guarantee that the client will ever follow up with a SETATTR, so (like you mentioned), we'd need a mechanism to ensure that the cmtime can be updated on close, if the file is ever written while the delegation is in force and the final SETATTR never comes in. I'll do a bit of research an get back to you on whether this is feasible. Thanks for the suggestion! -- Jeff Layton <jlayton@kernel.org>
© 2016 - 2025 Red Hat, Inc.