[PATCH RFC 06/10] xfs: iomap CoW-based atomic write support

John Garry posted 10 patches 10 months, 1 week ago
There is a newer version of this series
[PATCH RFC 06/10] xfs: iomap CoW-based atomic write support
Posted by John Garry 10 months, 1 week ago
In cases of an atomic write occurs for misaligned or discontiguous disk
blocks, we will use a CoW-based method to issue the atomic write.

So, for that case, return -EAGAIN to request that the write be issued in
CoW atomic write mode. The dio write path should detect this, similar to
how misaligned regalar DIO writes are handled.

Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 fs/xfs/xfs_iomap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 66 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
index ae3755ed00e6..2c2867d728e4 100644
--- a/fs/xfs/xfs_iomap.c
+++ b/fs/xfs/xfs_iomap.c
@@ -809,9 +809,12 @@ xfs_direct_write_iomap_begin(
 	struct xfs_bmbt_irec	imap, cmap;
 	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
+	bool			atomic = flags & IOMAP_ATOMIC;
 	int			nimaps = 1, error = 0;
 	bool			shared = false;
+	bool			found = false;
 	u16			iomap_flags = 0;
+	bool			need_alloc;
 	unsigned int		lockmode;
 	u64			seq;
 
@@ -832,7 +835,7 @@ xfs_direct_write_iomap_begin(
 	 * COW writes may allocate delalloc space or convert unwritten COW
 	 * extents, so we need to make sure to take the lock exclusively here.
 	 */
-	if (xfs_is_cow_inode(ip))
+	if (xfs_is_cow_inode(ip) || atomic)
 		lockmode = XFS_ILOCK_EXCL;
 	else
 		lockmode = XFS_ILOCK_SHARED;
@@ -857,12 +860,73 @@ xfs_direct_write_iomap_begin(
 	if (error)
 		goto out_unlock;
 
+
+	if (flags & IOMAP_ATOMIC_COW) {
+		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
+				&lockmode,
+				(flags & IOMAP_DIRECT) || IS_DAX(inode), true);
+		if (error)
+			goto out_unlock;
+
+		end_fsb = imap.br_startoff + imap.br_blockcount;
+		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
+
+		if (imap.br_startblock != HOLESTARTBLOCK) {
+			seq = xfs_iomap_inode_sequence(ip, 0);
+
+			error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags,
+				iomap_flags | IOMAP_F_ATOMIC_COW, seq);
+			if (error)
+				goto out_unlock;
+		}
+		seq = xfs_iomap_inode_sequence(ip, 0);
+		xfs_iunlock(ip, lockmode);
+		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
+					iomap_flags | IOMAP_F_ATOMIC_COW, seq);
+	}
+
+	need_alloc = imap_needs_alloc(inode, flags, &imap, nimaps);
+
+	if (atomic) {
+		/* Use CoW-based method if any of the following fail */
+		error = -EAGAIN;
+
+		/*
+		 * Lazily use CoW-based method for initial alloc of data.
+		 * Check br_blockcount for FSes which do not support atomic
+		 * writes > 1x block.
+		 */
+		if (need_alloc && imap.br_blockcount > 1)
+			goto out_unlock;
+
+		/* Misaligned start block wrt size */
+		if (!IS_ALIGNED(imap.br_startblock, imap.br_blockcount))
+			goto out_unlock;
+
+		/* Discontiguous or mixed extents */
+		if (!imap_spans_range(&imap, offset_fsb, end_fsb))
+			goto out_unlock;
+	}
+
 	if (imap_needs_cow(ip, flags, &imap, nimaps)) {
 		error = -EAGAIN;
 		if (flags & IOMAP_NOWAIT)
 			goto out_unlock;
 
+		if (atomic) {
+			/* Detect whether we're already covered in a cow fork */
+			error  = xfs_find_trim_cow_extent(ip, &imap, &cmap, &shared, &found);
+			if (error)
+				goto out_unlock;
+
+			if (shared) {
+				error = -EAGAIN;
+				goto out_unlock;
+			}
+		}
+
 		/* may drop and re-acquire the ilock */
+		shared = false;
 		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
 				&lockmode,
 				(flags & IOMAP_DIRECT) || IS_DAX(inode), false);
@@ -874,7 +938,7 @@ xfs_direct_write_iomap_begin(
 		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
 	}
 
-	if (imap_needs_alloc(inode, flags, &imap, nimaps))
+	if (need_alloc)
 		goto allocate_blocks;
 
 	/*
-- 
2.31.1
Re: [PATCH RFC 06/10] xfs: iomap CoW-based atomic write support
Posted by Darrick J. Wong 10 months, 1 week ago
On Tue, Feb 04, 2025 at 12:01:23PM +0000, John Garry wrote:
> In cases of an atomic write occurs for misaligned or discontiguous disk
> blocks, we will use a CoW-based method to issue the atomic write.
> 
> So, for that case, return -EAGAIN to request that the write be issued in
> CoW atomic write mode. The dio write path should detect this, similar to
> how misaligned regalar DIO writes are handled.
> 
> Signed-off-by: John Garry <john.g.garry@oracle.com>
> ---
>  fs/xfs/xfs_iomap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++--
>  1 file changed, 66 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> index ae3755ed00e6..2c2867d728e4 100644
> --- a/fs/xfs/xfs_iomap.c
> +++ b/fs/xfs/xfs_iomap.c
> @@ -809,9 +809,12 @@ xfs_direct_write_iomap_begin(
>  	struct xfs_bmbt_irec	imap, cmap;
>  	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
>  	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
> +	bool			atomic = flags & IOMAP_ATOMIC;
>  	int			nimaps = 1, error = 0;
>  	bool			shared = false;
> +	bool			found = false;
>  	u16			iomap_flags = 0;
> +	bool			need_alloc;
>  	unsigned int		lockmode;
>  	u64			seq;
>  
> @@ -832,7 +835,7 @@ xfs_direct_write_iomap_begin(
>  	 * COW writes may allocate delalloc space or convert unwritten COW
>  	 * extents, so we need to make sure to take the lock exclusively here.
>  	 */
> -	if (xfs_is_cow_inode(ip))
> +	if (xfs_is_cow_inode(ip) || atomic)
>  		lockmode = XFS_ILOCK_EXCL;
>  	else
>  		lockmode = XFS_ILOCK_SHARED;
> @@ -857,12 +860,73 @@ xfs_direct_write_iomap_begin(
>  	if (error)
>  		goto out_unlock;
>  
> +
> +	if (flags & IOMAP_ATOMIC_COW) {
> +		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
> +				&lockmode,
> +				(flags & IOMAP_DIRECT) || IS_DAX(inode), true);

Weird nit not relate to this patch: Is there ever a case where
IS_DAX(inode) and (flags & IOMAP_DAX) disagree?  I wonder if this odd
construction could be simplified to:

	(flags & (IOMAP_DIRECT | IOMAP_DAX))

?

> +		if (error)
> +			goto out_unlock;
> +
> +		end_fsb = imap.br_startoff + imap.br_blockcount;
> +		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
> +
> +		if (imap.br_startblock != HOLESTARTBLOCK) {
> +			seq = xfs_iomap_inode_sequence(ip, 0);
> +
> +			error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags,
> +				iomap_flags | IOMAP_F_ATOMIC_COW, seq);
> +			if (error)
> +				goto out_unlock;
> +		}
> +		seq = xfs_iomap_inode_sequence(ip, 0);
> +		xfs_iunlock(ip, lockmode);
> +		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
> +					iomap_flags | IOMAP_F_ATOMIC_COW, seq);
> +	}

/me wonders if this should be a separate helper so that the main
xfs_direct_write_iomap_begin doesn't get even longer... but otherwise
the logic in here looks sane.

> +
> +	need_alloc = imap_needs_alloc(inode, flags, &imap, nimaps);
> +
> +	if (atomic) {
> +		/* Use CoW-based method if any of the following fail */
> +		error = -EAGAIN;
> +
> +		/*
> +		 * Lazily use CoW-based method for initial alloc of data.
> +		 * Check br_blockcount for FSes which do not support atomic
> +		 * writes > 1x block.
> +		 */
> +		if (need_alloc && imap.br_blockcount > 1)
> +			goto out_unlock;
> +
> +		/* Misaligned start block wrt size */
> +		if (!IS_ALIGNED(imap.br_startblock, imap.br_blockcount))
> +			goto out_unlock;
> +
> +		/* Discontiguous or mixed extents */
> +		if (!imap_spans_range(&imap, offset_fsb, end_fsb))
> +			goto out_unlock;
> +	}

(Same two comments here.)

> +
>  	if (imap_needs_cow(ip, flags, &imap, nimaps)) {
>  		error = -EAGAIN;
>  		if (flags & IOMAP_NOWAIT)
>  			goto out_unlock;
>  
> +		if (atomic) {
> +			/* Detect whether we're already covered in a cow fork */
> +			error  = xfs_find_trim_cow_extent(ip, &imap, &cmap, &shared, &found);
> +			if (error)
> +				goto out_unlock;
> +
> +			if (shared) {
> +				error = -EAGAIN;
> +				goto out_unlock;

What is this checking?  That something else already created a mapping in
the COW fork, so we want to bail out to get rid of it?

--D

> +			}
> +		}
> +
>  		/* may drop and re-acquire the ilock */
> +		shared = false;
>  		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
>  				&lockmode,
>  				(flags & IOMAP_DIRECT) || IS_DAX(inode), false);
> @@ -874,7 +938,7 @@ xfs_direct_write_iomap_begin(
>  		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
>  	}
>  
> -	if (imap_needs_alloc(inode, flags, &imap, nimaps))
> +	if (need_alloc)
>  		goto allocate_blocks;
>  
>  	/*
> -- 
> 2.31.1
> 
>
Re: [PATCH RFC 06/10] xfs: iomap CoW-based atomic write support
Posted by John Garry 10 months, 1 week ago
On 05/02/2025 20:05, Darrick J. Wong wrote:
> On Tue, Feb 04, 2025 at 12:01:23PM +0000, John Garry wrote:
>> In cases of an atomic write occurs for misaligned or discontiguous disk
>> blocks, we will use a CoW-based method to issue the atomic write.
>>
>> So, for that case, return -EAGAIN to request that the write be issued in
>> CoW atomic write mode. The dio write path should detect this, similar to
>> how misaligned regalar DIO writes are handled.
>>
>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>> ---
>>   fs/xfs/xfs_iomap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++--
>>   1 file changed, 66 insertions(+), 2 deletions(-)
>>
>> diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
>> index ae3755ed00e6..2c2867d728e4 100644
>> --- a/fs/xfs/xfs_iomap.c
>> +++ b/fs/xfs/xfs_iomap.c
>> @@ -809,9 +809,12 @@ xfs_direct_write_iomap_begin(
>>   	struct xfs_bmbt_irec	imap, cmap;
>>   	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
>>   	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
>> +	bool			atomic = flags & IOMAP_ATOMIC;
>>   	int			nimaps = 1, error = 0;
>>   	bool			shared = false;
>> +	bool			found = false;
>>   	u16			iomap_flags = 0;
>> +	bool			need_alloc;
>>   	unsigned int		lockmode;
>>   	u64			seq;
>>   
>> @@ -832,7 +835,7 @@ xfs_direct_write_iomap_begin(
>>   	 * COW writes may allocate delalloc space or convert unwritten COW
>>   	 * extents, so we need to make sure to take the lock exclusively here.
>>   	 */
>> -	if (xfs_is_cow_inode(ip))
>> +	if (xfs_is_cow_inode(ip) || atomic)
>>   		lockmode = XFS_ILOCK_EXCL;
>>   	else
>>   		lockmode = XFS_ILOCK_SHARED;
>> @@ -857,12 +860,73 @@ xfs_direct_write_iomap_begin(
>>   	if (error)
>>   		goto out_unlock;
>>   
>> +
>> +	if (flags & IOMAP_ATOMIC_COW) {
>> +		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
>> +				&lockmode,
>> +				(flags & IOMAP_DIRECT) || IS_DAX(inode), true);
> 
> Weird nit not relate to this patch: Is there ever a case where
> IS_DAX(inode) and (flags & IOMAP_DAX) disagree?  I wonder if this odd
> construction could be simplified to:
> 
> 	(flags & (IOMAP_DIRECT | IOMAP_DAX))

I'm not sure. I assume that we always want to convert for DAX, and 
IOMAP_DAX may not be set always for DIO path - but I only see 
xfs_file_write_iter() -> xfs_file_dax_write() 
->dax_iomap_rw(xfs_dax_write_iomap_ops), which sets IOMAP_DAX in 
iomap_iter.flags

> 
> ?
> 
>> +		if (error)
>> +			goto out_unlock;
>> +
>> +		end_fsb = imap.br_startoff + imap.br_blockcount;
>> +		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
>> +
>> +		if (imap.br_startblock != HOLESTARTBLOCK) {
>> +			seq = xfs_iomap_inode_sequence(ip, 0);
>> +
>> +			error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags,
>> +				iomap_flags | IOMAP_F_ATOMIC_COW, seq);
>> +			if (error)
>> +				goto out_unlock;
>> +		}
>> +		seq = xfs_iomap_inode_sequence(ip, 0);
>> +		xfs_iunlock(ip, lockmode);
>> +		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
>> +					iomap_flags | IOMAP_F_ATOMIC_COW, seq);
>> +	}
> 
> /me wonders if this should be a separate helper so that the main
> xfs_direct_write_iomap_begin doesn't get even longer... but otherwise
> the logic in here looks sane.

I can do that. Maybe some code can be factored out for regular "found 
cow path".

> 
>> +
>> +	need_alloc = imap_needs_alloc(inode, flags, &imap, nimaps);
>> +
>> +	if (atomic) {
>> +		/* Use CoW-based method if any of the following fail */
>> +		error = -EAGAIN;
>> +
>> +		/*
>> +		 * Lazily use CoW-based method for initial alloc of data.
>> +		 * Check br_blockcount for FSes which do not support atomic
>> +		 * writes > 1x block.
>> +		 */
>> +		if (need_alloc && imap.br_blockcount > 1)
>> +			goto out_unlock;
>> +
>> +		/* Misaligned start block wrt size */
>> +		if (!IS_ALIGNED(imap.br_startblock, imap.br_blockcount))
>> +			goto out_unlock;
>> +
>> +		/* Discontiguous or mixed extents */
>> +		if (!imap_spans_range(&imap, offset_fsb, end_fsb))
>> +			goto out_unlock;
>> +	}
> 
> (Same two comments here.)

ok

> 
>> +
>>   	if (imap_needs_cow(ip, flags, &imap, nimaps)) {
>>   		error = -EAGAIN;
>>   		if (flags & IOMAP_NOWAIT)
>>   			goto out_unlock;
>>   
>> +		if (atomic) {
>> +			/* Detect whether we're already covered in a cow fork */
>> +			error  = xfs_find_trim_cow_extent(ip, &imap, &cmap, &shared, &found);
>> +			if (error)
>> +				goto out_unlock;
>> +
>> +			if (shared) {
>> +				error = -EAGAIN;
>> +				goto out_unlock;
> 
> What is this checking?  That something else already created a mapping in
> the COW fork, so we want to bail out to get rid of it?

I want to check if some data is shared. In that case, we should unshare.

And I am not sure if that check is sufficient.

On the buffered write path, we may have something in a CoW fork - in 
that case it should be flushed, right?

Thanks,
John
Re: [PATCH RFC 06/10] xfs: iomap CoW-based atomic write support
Posted by Darrick J. Wong 10 months, 1 week ago
On Thu, Feb 06, 2025 at 11:10:40AM +0000, John Garry wrote:
> On 05/02/2025 20:05, Darrick J. Wong wrote:
> > On Tue, Feb 04, 2025 at 12:01:23PM +0000, John Garry wrote:
> > > In cases of an atomic write occurs for misaligned or discontiguous disk
> > > blocks, we will use a CoW-based method to issue the atomic write.
> > > 
> > > So, for that case, return -EAGAIN to request that the write be issued in
> > > CoW atomic write mode. The dio write path should detect this, similar to
> > > how misaligned regalar DIO writes are handled.
> > > 
> > > Signed-off-by: John Garry <john.g.garry@oracle.com>
> > > ---
> > >   fs/xfs/xfs_iomap.c | 68 ++++++++++++++++++++++++++++++++++++++++++++--
> > >   1 file changed, 66 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/xfs/xfs_iomap.c b/fs/xfs/xfs_iomap.c
> > > index ae3755ed00e6..2c2867d728e4 100644
> > > --- a/fs/xfs/xfs_iomap.c
> > > +++ b/fs/xfs/xfs_iomap.c
> > > @@ -809,9 +809,12 @@ xfs_direct_write_iomap_begin(
> > >   	struct xfs_bmbt_irec	imap, cmap;
> > >   	xfs_fileoff_t		offset_fsb = XFS_B_TO_FSBT(mp, offset);
> > >   	xfs_fileoff_t		end_fsb = xfs_iomap_end_fsb(mp, offset, length);
> > > +	bool			atomic = flags & IOMAP_ATOMIC;
> > >   	int			nimaps = 1, error = 0;
> > >   	bool			shared = false;
> > > +	bool			found = false;
> > >   	u16			iomap_flags = 0;
> > > +	bool			need_alloc;
> > >   	unsigned int		lockmode;
> > >   	u64			seq;
> > > @@ -832,7 +835,7 @@ xfs_direct_write_iomap_begin(
> > >   	 * COW writes may allocate delalloc space or convert unwritten COW
> > >   	 * extents, so we need to make sure to take the lock exclusively here.
> > >   	 */
> > > -	if (xfs_is_cow_inode(ip))
> > > +	if (xfs_is_cow_inode(ip) || atomic)
> > >   		lockmode = XFS_ILOCK_EXCL;
> > >   	else
> > >   		lockmode = XFS_ILOCK_SHARED;
> > > @@ -857,12 +860,73 @@ xfs_direct_write_iomap_begin(
> > >   	if (error)
> > >   		goto out_unlock;
> > > +
> > > +	if (flags & IOMAP_ATOMIC_COW) {
> > > +		error = xfs_reflink_allocate_cow(ip, &imap, &cmap, &shared,
> > > +				&lockmode,
> > > +				(flags & IOMAP_DIRECT) || IS_DAX(inode), true);
> > 
> > Weird nit not relate to this patch: Is there ever a case where
> > IS_DAX(inode) and (flags & IOMAP_DAX) disagree?  I wonder if this odd
> > construction could be simplified to:
> > 
> > 	(flags & (IOMAP_DIRECT | IOMAP_DAX))
> 
> I'm not sure. I assume that we always want to convert for DAX, and IOMAP_DAX
> may not be set always for DIO path - but I only see xfs_file_write_iter() ->
> xfs_file_dax_write() ->dax_iomap_rw(xfs_dax_write_iomap_ops), which sets
> IOMAP_DAX in iomap_iter.flags
> 
> > 
> > ?
> > 
> > > +		if (error)
> > > +			goto out_unlock;
> > > +
> > > +		end_fsb = imap.br_startoff + imap.br_blockcount;
> > > +		length = XFS_FSB_TO_B(mp, end_fsb) - offset;
> > > +
> > > +		if (imap.br_startblock != HOLESTARTBLOCK) {
> > > +			seq = xfs_iomap_inode_sequence(ip, 0);
> > > +
> > > +			error = xfs_bmbt_to_iomap(ip, srcmap, &imap, flags,
> > > +				iomap_flags | IOMAP_F_ATOMIC_COW, seq);
> > > +			if (error)
> > > +				goto out_unlock;
> > > +		}
> > > +		seq = xfs_iomap_inode_sequence(ip, 0);
> > > +		xfs_iunlock(ip, lockmode);
> > > +		return xfs_bmbt_to_iomap(ip, iomap, &cmap, flags,
> > > +					iomap_flags | IOMAP_F_ATOMIC_COW, seq);
> > > +	}
> > 
> > /me wonders if this should be a separate helper so that the main
> > xfs_direct_write_iomap_begin doesn't get even longer... but otherwise
> > the logic in here looks sane.
> 
> I can do that. Maybe some code can be factored out for regular "found cow
> path".
> 
> > 
> > > +
> > > +	need_alloc = imap_needs_alloc(inode, flags, &imap, nimaps);
> > > +
> > > +	if (atomic) {
> > > +		/* Use CoW-based method if any of the following fail */
> > > +		error = -EAGAIN;
> > > +
> > > +		/*
> > > +		 * Lazily use CoW-based method for initial alloc of data.
> > > +		 * Check br_blockcount for FSes which do not support atomic
> > > +		 * writes > 1x block.
> > > +		 */
> > > +		if (need_alloc && imap.br_blockcount > 1)
> > > +			goto out_unlock;
> > > +
> > > +		/* Misaligned start block wrt size */
> > > +		if (!IS_ALIGNED(imap.br_startblock, imap.br_blockcount))
> > > +			goto out_unlock;
> > > +
> > > +		/* Discontiguous or mixed extents */
> > > +		if (!imap_spans_range(&imap, offset_fsb, end_fsb))
> > > +			goto out_unlock;
> > > +	}
> > 
> > (Same two comments here.)
> 
> ok
> 
> > 
> > > +
> > >   	if (imap_needs_cow(ip, flags, &imap, nimaps)) {
> > >   		error = -EAGAIN;
> > >   		if (flags & IOMAP_NOWAIT)
> > >   			goto out_unlock;
> > > +		if (atomic) {
> > > +			/* Detect whether we're already covered in a cow fork */
> > > +			error  = xfs_find_trim_cow_extent(ip, &imap, &cmap, &shared, &found);
> > > +			if (error)
> > > +				goto out_unlock;
> > > +
> > > +			if (shared) {
> > > +				error = -EAGAIN;
> > > +				goto out_unlock;
> > 
> > What is this checking?  That something else already created a mapping in
> > the COW fork, so we want to bail out to get rid of it?
> 
> I want to check if some data is shared. In that case, we should unshare.

Why is it necessary to unshare?  Userspace gave us a buffer of new
contents, and we're already prepared to write that out of place and
remap it.

> And I am not sure if that check is sufficient.
> 
> On the buffered write path, we may have something in a CoW fork - in that
> case it should be flushed, right?

Flushed against what?  Concurrent writeback or something?  The directio
setup should have flushed dirty pagecache, so the only things left in
the COW fork are speculative preallocations.  (IOWs, I don't understand
what needs to be flushed or why.)

--D

> 
> Thanks,
> John
>
Re: [PATCH RFC 06/10] xfs: iomap CoW-based atomic write support
Posted by John Garry 10 months, 1 week ago
On 06/02/2025 21:44, Darrick J. Wong wrote:
>>> What is this checking?  That something else already created a mapping in
>>> the COW fork, so we want to bail out to get rid of it?
>> I want to check if some data is shared. In that case, we should unshare.
> Why is it necessary to unshare?  Userspace gave us a buffer of new
> contents, and we're already prepared to write that out of place and
> remap it.

fine, as long as the remap does what we need, then I won't bother with 
this explicit unshare.

> 
>> And I am not sure if that check is sufficient.
>>
>> On the buffered write path, we may have something in a CoW fork - in that
>> case it should be flushed, right?
> Flushed against what?  Concurrent writeback or something?  The directio
> setup should have flushed dirty pagecache, so the only things left in
> the COW fork are speculative preallocations.  (IOWs, I don't understand
> what needs to be flushed or why.)

ah, ok, as long as DIO would have flushed dirty relevant pagecache, then 
we should be good.

Cheers,
John