xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

[PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by alexjlzheng@gmail.com 2 months, 2 weeks ago

From: Jinliang Zheng <alexjlzheng@tencent.com>

The commit b1e09178b73a ("xfs: commit CoW-based atomic writes atomically")
introduced xfs_reflink_end_atomic_cow() for atomic CoW-based writes, but
it used the same tracepoint as xfs_reflink_end_cow(), making trace logs
ambiguous.

This patch adds two new tracepoints trace_xfs_reflink_end_atomic_cow() and
trace_xfs_reflink_end_atomic_cow_error() to distinguish them.

Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
 fs/xfs/xfs_reflink.c | 4 ++--
 fs/xfs/xfs_trace.h   | 2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
index 3f177b4ec131..47f532fd46e0 100644
--- a/fs/xfs/xfs_reflink.c
+++ b/fs/xfs/xfs_reflink.c
@@ -1003,7 +1003,7 @@ xfs_reflink_end_atomic_cow(
 	struct xfs_trans		*tp;
 	unsigned int			resblks;
 
-	trace_xfs_reflink_end_cow(ip, offset, count);
+	trace_xfs_reflink_end_atomic_cow(ip, offset, count);
 
 	offset_fsb = XFS_B_TO_FSBT(mp, offset);
 	end_fsb = XFS_B_TO_FSB(mp, offset + count);
@@ -1028,7 +1028,7 @@ xfs_reflink_end_atomic_cow(
 				end_fsb);
 	}
 	if (error) {
-		trace_xfs_reflink_end_cow_error(ip, error, _RET_IP_);
+		trace_xfs_reflink_end_atomic_cow_error(ip, error, _RET_IP_);
 		goto out_cancel;
 	}
 	error = xfs_trans_commit(tp);
diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
index 79b8641880ab..29eefacb8226 100644
--- a/fs/xfs/xfs_trace.h
+++ b/fs/xfs/xfs_trace.h
@@ -4186,12 +4186,14 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow);
 
 DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range);
 DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow);
+DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_atomic_cow);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_from);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_to);
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_skip);
 
 DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error);
 DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error);
+DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_atomic_cow_error);
 
 
 DEFINE_INODE_IREC_EVENT(xfs_reflink_cancel_cow);
-- 
2.49.0
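
For readers unfamiliar with how one event class can back several named events, here is a minimal userspace sketch. It is not the kernel's tracing implementation: DEFINE_TOY_IO_EVENT, struct io_event, last_event and last_event_is() are hypothetical names invented for illustration. It only shows why giving the atomic path its own event name makes the two callers distinguishable in a trace while they keep sharing one record layout.

```c
#include <assert.h>
#include <string.h>

/* Toy stand-in for an event class: every event shares this record layout. */
struct io_event {
	const char		*name;
	unsigned long long	offset;
	unsigned long long	count;
};

/* Most recently emitted event; a real tracer would write to a ring buffer. */
static struct io_event last_event;

/*
 * Hypothetical macro, loosely modelled on the "one class, many names"
 * pattern: each use generates a distinct trace_<name>() function that
 * records its own name alongside the shared fields.
 */
#define DEFINE_TOY_IO_EVENT(evname)					\
static void trace_##evname(unsigned long long offset,			\
			   unsigned long long count)			\
{									\
	last_event.name = #evname;					\
	last_event.offset = offset;					\
	last_event.count = count;					\
}

DEFINE_TOY_IO_EVENT(toy_end_cow)
DEFINE_TOY_IO_EVENT(toy_end_atomic_cow)

/* Returns nonzero if the last recorded event carries the given name. */
static int last_event_is(const char *name)
{
	assert(name != NULL);
	return last_event.name != NULL && strcmp(last_event.name, name) == 0;
}
```

If both code paths called trace_toy_end_cow(), a trace reader could not tell them apart from the event name alone; with separate names, last_event_is("toy_end_atomic_cow") holds only after the atomic path has fired.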

Re: [PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by Christoph Hellwig 2 months, 2 weeks ago

On Fri, Nov 21, 2025 at 07:56:56PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
> 
> The commit b1e09178b73a ("xfs: commit CoW-based atomic writes atomically")
> introduced xfs_reflink_end_atomic_cow() for atomic CoW-based writes, but
> it used the same tracepoint as xfs_reflink_end_cow(), making trace logs
> ambiguous.
> 
> This patch adds two new tracepoints trace_xfs_reflink_end_atomic_cow() and
> trace_xfs_reflink_end_atomic_cow_error() to distinguish them.

Confused sounds a bit strong, but otherwise this looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>

Semi-related:  back when this code was added I asked why we're not
using the transaction / defer ops chaining even for normal reflink
completions, as it should be just as efficient and that way we have
less code to maintain and less diverging code paths.  Or am I missing
something?

> 
> Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
> ---
>  fs/xfs/xfs_reflink.c | 4 ++--
>  fs/xfs/xfs_trace.h   | 2 ++
>  2 files changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/xfs/xfs_reflink.c b/fs/xfs/xfs_reflink.c
> index 3f177b4ec131..47f532fd46e0 100644
> --- a/fs/xfs/xfs_reflink.c
> +++ b/fs/xfs/xfs_reflink.c
> @@ -1003,7 +1003,7 @@ xfs_reflink_end_atomic_cow(
>  	struct xfs_trans		*tp;
>  	unsigned int			resblks;
>  
> -	trace_xfs_reflink_end_cow(ip, offset, count);
> +	trace_xfs_reflink_end_atomic_cow(ip, offset, count);
>  
>  	offset_fsb = XFS_B_TO_FSBT(mp, offset);
>  	end_fsb = XFS_B_TO_FSB(mp, offset + count);
> @@ -1028,7 +1028,7 @@ xfs_reflink_end_atomic_cow(
>  				end_fsb);
>  	}
>  	if (error) {
> -		trace_xfs_reflink_end_cow_error(ip, error, _RET_IP_);
> +		trace_xfs_reflink_end_atomic_cow_error(ip, error, _RET_IP_);
>  		goto out_cancel;
>  	}
>  	error = xfs_trans_commit(tp);
> diff --git a/fs/xfs/xfs_trace.h b/fs/xfs/xfs_trace.h
> index 79b8641880ab..29eefacb8226 100644
> --- a/fs/xfs/xfs_trace.h
> +++ b/fs/xfs/xfs_trace.h
> @@ -4186,12 +4186,14 @@ DEFINE_INODE_IREC_EVENT(xfs_reflink_convert_cow);
>  
>  DEFINE_SIMPLE_IO_EVENT(xfs_reflink_cancel_cow_range);
>  DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_cow);
> +DEFINE_SIMPLE_IO_EVENT(xfs_reflink_end_atomic_cow);
>  DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_from);
>  DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_to);
>  DEFINE_INODE_IREC_EVENT(xfs_reflink_cow_remap_skip);
>  
>  DEFINE_INODE_ERROR_EVENT(xfs_reflink_cancel_cow_range_error);
>  DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_cow_error);
> +DEFINE_INODE_ERROR_EVENT(xfs_reflink_end_atomic_cow_error);
>  
>  
>  DEFINE_INODE_IREC_EVENT(xfs_reflink_cancel_cow);
> -- 
> 2.49.0
> 
> 
---end quoted text---

Re: [PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by John Garry 2 months, 2 weeks ago

On 24/11/2025 09:34, Christoph Hellwig wrote:
> On Fri, Nov 21, 2025 at 07:56:56PM +0800, alexjlzheng@gmail.com wrote:
>> From: Jinliang Zheng <alexjlzheng@tencent.com>
>>
>> The commit b1e09178b73a ("xfs: commit CoW-based atomic writes atomically")
>> introduced xfs_reflink_end_atomic_cow() for atomic CoW-based writes, but
>> it used the same tracepoint as xfs_reflink_end_cow(), making trace logs
>> ambiguous.
>>
>> This patch adds two new tracepoints trace_xfs_reflink_end_atomic_cow() and
>> trace_xfs_reflink_end_atomic_cow_error() to distinguish them.
> 
> Confused sounds a bit strong, 

Yeah, maybe "ambiguous" could be a better word.

FWIW,

Reviewed-by: John Garry <john.g.garry@oracle.com>

> but otherwise this looks good:
> 
> Reviewed-by: Christoph Hellwig <hch@lst.de>
> 
> Semi-related:  back when this code was added I asked why we're not
> using the transaction / defer ops chaining even for normal reflink
> completions, as it should be just as efficient and that way we have
> less code to maintain and less diverging code paths.  Or am I missing
> something?

Commit d6f215f35963 might be able to explain that.

Re: [PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by Christoph Hellwig 2 months, 2 weeks ago

On Mon, Nov 24, 2025 at 10:57:24AM +0000, John Garry wrote:
> Commit d6f215f35963 might be able to explain that.

I don't think so.  That commit splits up the operation so as to avoid
doing the entire operation in a single transaction, and the rationale
for this is sound.  But the atomic work showed that it went too far,
because we can still batch up a fair amount of conversions.  I think
the argument of allowing to batch up as many transactions as we allow
in an atomic write still makes perfect sense.

Re: [PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by John Garry 2 months, 2 weeks ago

On 24/11/2025 14:04, Christoph Hellwig wrote:
> On Mon, Nov 24, 2025 at 10:57:24AM +0000, John Garry wrote:
>> Commit d6f215f35963 might be able to explain that.
> 
> I don't think so. 

I am just pointing out why it was changed to use a separate transaction 
per extent [and why the cow end handler for atomic writes is different].

> That commit splits up the operation so as to avoid
> doing the entire operation in a single transaction, and the rationale
> for this is sound.  But the atomic work showed that it went too far,
> because we can still batch up a fair amount of conversions.  I think
> the argument of allowing to batch up as many transactions as we allow
> in an atomic write still makes perfect sense.
> 

Sure, Darrick knows more about this than me (so I'll let him comment).

Re: [PATCH] xfs: fix confused tracepoints in xfs_reflink_end_atomic_cow()

Posted by Darrick J. Wong 2 months, 2 weeks ago

On Mon, Nov 24, 2025 at 02:25:29PM +0000, John Garry wrote:
> On 24/11/2025 14:04, Christoph Hellwig wrote:
> > On Mon, Nov 24, 2025 at 10:57:24AM +0000, John Garry wrote:
> > > Commit d6f215f35963 might be able to explain that.
> > 
> > I don't think so.
> 
> I am just pointing out why it was changed to use a separate transaction per
> extent [and why the cow end handler for atomic writes is different].
> 
> > That commit splits up the operation so as to avoid
> > doing the entire operation in a single transaction, and the rationale
> > for this is sound.  But the atomic work showed that it went too far,

It did go too far, because when d6f215f3596 was written we didn't have
these nice helpers that guesstimate how many deferred remapping ops we
can attach to a single fresh transaction.

Nowadays we probably could speed up the non-atomic remapping path by
computing a safe batching factor and doing that much instead of only one
extent per transaction chain.  Stupidly hardcoding 16 would be enough to
achieve an order of magnitude improvement.
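
A rough sketch of the arithmetic behind that estimate, in plain userspace C. chains_needed() is a hypothetical helper invented for illustration, not a kernel function, and it assumes every transaction chain can take a full batch, which the real reservation limits may not always allow.

```c
#include <assert.h>

/*
 * Hypothetical helper: number of transaction chains needed to remap
 * nr_extents extents when each chain handles up to `batch` of them.
 */
static unsigned long chains_needed(unsigned long nr_extents,
				   unsigned long batch)
{
	assert(batch > 0);
	/* Round up so a partial final batch still gets its own chain. */
	return (nr_extents + batch - 1) / batch;
}
```

Under those assumptions, remapping 1000 extents takes chains_needed(1000, 1) = 1000 chains at one extent per chain, but chains_needed(1000, 16) = 63 chains with a batch of 16, which is roughly the order-of-magnitude reduction described above.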

Nobody's complained about slow ioend performance enough to do that work
and then QA it though.  Remember, that d6f commit comes with the
following barb:

"Note that this can be reproduced after ~320 million fsx ops while
running generic/938 (long soak directio fsx exerciser):"

(generic/938 is now generic/521.)

--D

> > because we can still batch up a fair amount of conversions.  I think
> > the argument of allowing to batch up as many transactions as we allow
> > in an atomic write still makes perfect sense.
> > 
> 
> Sure, Darrick knows more about this than me (so I'll let him comment).
> 
>