[PATCH] xfs: snapshot i_disk_size before truncate writeback

Cen Zhang posted 1 patch 1 month, 1 week ago
xfs_setattr_size() updates the in-core i_size and truncates the page
cache before it writes back the range between the on-disk EOF and the new
EOF.  The range calculation currently reads ip->i_disk_size without
holding the inode ILOCK.

Buffered unwritten-extent completion can concurrently advance
ip->i_disk_size from xfs_iomap_write_unwritten() under XFS_ILOCK_EXCL.  A
stale but coherent value is harmless here because concurrent EOF
advancement only makes the pre-transaction writeback range conservative,
but a lockless 64-bit load can tear on 32-bit builds and produce a bogus
range start or a wrong result for the size comparison.  That can cause the
null-files avoidance writeback to be skipped or shortened before the size
change is logged.
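
To make the tearing hazard concrete, here is a minimal userspace sketch.
It is not kernel code: struct split_size, split(), torn_read() and
needs_writeback() are hypothetical names, and the two-word layout only
models how a 32-bit CPU might load a 64-bit value in two steps.  The
did_zeroing term of the real condition is left out.

```c
#include <stdint.h>

/* Hypothetical model: a 64-bit size seen as two 32-bit words. */
struct split_size {
	uint32_t lo;
	uint32_t hi;
};

static struct split_size split(uint64_t v)
{
	struct split_size s = { (uint32_t)v, (uint32_t)(v >> 32) };
	return s;
}

/*
 * Simulate a reader that loaded the low word before a concurrent update
 * and the high word after it.  The result mixes halves of two values.
 */
static uint64_t torn_read(uint64_t before, uint64_t after)
{
	struct split_size old = split(before);
	struct split_size cur = split(after);

	return ((uint64_t)cur.hi << 32) | old.lo;
}

/* The size comparison from the patch, without the did_zeroing term. */
static int needs_writeback(int64_t newsize, int64_t oldsize,
			   int64_t disk_size)
{
	return newsize > disk_size && oldsize != disk_size;
}
```

With an on-disk size moving from 0xFFFFF000 to 0x100001000, this read
order yields 0x1FFFFF000, which is larger than either real value, so a
truncate up to 0x100002000 would wrongly conclude that no writeback is
needed.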

Snapshot ip->i_disk_size under XFS_ILOCK_SHARED after truncate_setsize()
and use that snapshot for the writeback decision and range start.

Signed-off-by: Cen Zhang <zzzccc427@gmail.com>
---
 fs/xfs/xfs_iops.c | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
index 208543e..a531715 100644
--- a/fs/xfs/xfs_iops.c
+++ b/fs/xfs/xfs_iops.c
@@ -915,6 +915,7 @@ xfs_setattr_size(
 	struct xfs_mount	*mp = ip->i_mount;
 	struct inode		*inode = VFS_I(ip);
 	xfs_off_t		oldsize, newsize;
+	xfs_fsize_t		ondisk_size;
 	struct xfs_trans	*tp;
 	int			error;
 	uint			lock_flags = 0;
@@ -1024,6 +1025,9 @@ xfs_setattr_size(
 	 * guaranteed not to write stale data past the new EOF on truncate down.
 	 */
 	truncate_setsize(inode, newsize);
+	xfs_ilock(ip, XFS_ILOCK_SHARED);
+	ondisk_size = ip->i_disk_size;
+	xfs_iunlock(ip, XFS_ILOCK_SHARED);
 
 	/*
 	 * We are going to log the inode size change in this transaction so
@@ -1034,9 +1038,9 @@ xfs_setattr_size(
 	 * otherwise those blocks may not be zeroed after a crash.
 	 */
 	if (did_zeroing ||
-	    (newsize > ip->i_disk_size && oldsize != ip->i_disk_size)) {
+	    (newsize > ondisk_size && oldsize != ondisk_size)) {
 		error = filemap_write_and_wait_range(VFS_I(ip)->i_mapping,
-						ip->i_disk_size, newsize - 1);
+						ondisk_size, newsize - 1);
 		if (error)
 			return error;
 	}
-- 
2.43.0