[PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target

John Garry posted 16 patches 7 months, 2 weeks ago
There is a newer version of this series
[PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
From: "Darrick J. Wong" <djwong@kernel.org>

It's silly to call xfs_setsize_buftarg from xfs_alloc_buftarg with the
block device LBA size because we don't need to ask the block layer to
validate a geometry number that it provided us.  Instead, set the
preliminary bt_meta_sector* fields to the LBA size in preparation for
reading the primary super.

However, we still want to flush and invalidate the pagecache for all
three block devices before we start reading metadata from those devices.
Move the sync_blockdev calls into a separate helper function, and call
it immediately after xfs_open_devices creates the buftargs.

This will enable a subsequent patch to validate hw atomic write geometry
against the filesystem geometry.

Signed-off-by: Darrick J. Wong <djwong@kernel.org>
Reviewed-by: John Garry <john.g.garry@oracle.com>
Signed-off-by: John Garry <john.g.garry@oracle.com>
---
 fs/xfs/xfs_buf.c   | 14 +++++---------
 fs/xfs/xfs_buf.h   |  5 +++++
 fs/xfs/xfs_super.c | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 43 insertions(+), 9 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 5ae77ffdc947..292891d6ff69 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -1733,11 +1733,7 @@ xfs_setsize_buftarg(
 		return -EINVAL;
 	}
 
-	/*
-	 * Flush the block device pagecache so our bios see anything dirtied
-	 * before mount.
-	 */
-	return sync_blockdev(btp->bt_bdev);
+	return 0;
 }
 
 int
@@ -1810,10 +1806,10 @@ xfs_alloc_buftarg(
 	 * When allocating the buftargs we have not yet read the super block and
 	 * thus don't know the file system sector size yet.
 	 */
-	if (xfs_setsize_buftarg(btp, bdev_logical_block_size(btp->bt_bdev)))
-		goto error_free;
-	if (xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
-			mp->m_super->s_id))
+	btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
+	btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
+
+	if (xfs_init_buftarg(btp, btp->bt_meta_sectorsize, mp->m_super->s_id))
 		goto error_free;
 
 	return btp;
diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h
index d0b065a9a9f0..132210705602 100644
--- a/fs/xfs/xfs_buf.h
+++ b/fs/xfs/xfs_buf.h
@@ -383,6 +383,11 @@ int xfs_buf_reverify(struct xfs_buf *bp, const struct xfs_buf_ops *ops);
 bool xfs_verify_magic(struct xfs_buf *bp, __be32 dmagic);
 bool xfs_verify_magic16(struct xfs_buf *bp, __be16 dmagic);
 
+static inline int xfs_buftarg_sync(struct xfs_buftarg *btp)
+{
+	return sync_blockdev(btp->bt_bdev);
+}
+
 /* for xfs_buf_mem.c only: */
 int xfs_init_buftarg(struct xfs_buftarg *btp, size_t logical_sectorsize,
 		const char *descr);
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5e456a6073ca..45e188466e51 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -520,6 +520,35 @@ xfs_open_devices(
 	return error;
 }
 
+/*
+ * Flush and invalidate all devices' pagecaches before reading any metadata
+ * because XFS doesn't use the bdev pagecache.
+ */
+STATIC int
+xfs_preflush_devices(
+	struct xfs_mount	*mp)
+{
+	int			error;
+
+	error = xfs_buftarg_sync(mp->m_ddev_targp);
+	if (error)
+		return error;
+
+	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) {
+		error = xfs_buftarg_sync(mp->m_ddev_targp);
+		if (error)
+			return error;
+	}
+
+	if (mp->m_rtdev_targp) {
+		error = xfs_buftarg_sync(mp->m_rtdev_targp);
+		if (error)
+			return error;
+	}
+
+	return 0;
+}
+
 /*
  * Setup xfs_mount buffer target pointers based on superblock
  */
@@ -1672,6 +1701,10 @@ xfs_fs_fill_super(
 	if (error)
 		return error;
 
+	error = xfs_preflush_devices(mp);
+	if (error)
+		goto out_shutdown_devices;
+
 	if (xfs_debugfs) {
 		mp->m_debugfs = xfs_debugfs_mkdir(mp->m_super->s_id,
 						  xfs_debugfs);
-- 
2.31.1
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by Christoph Hellwig 7 months, 2 weeks ago
On Sun, May 04, 2025 at 08:59:09AM +0000, John Garry wrote:
> +static inline int xfs_buftarg_sync(struct xfs_buftarg *btp)
> +{
> +	return sync_blockdev(btp->bt_bdev);
> +}

What is the point in having this wrapper?

> +/*
> + * Flush and invalidate all devices' pagecaches before reading any metadata
> + * because XFS doesn't use the bdev pagecache.
> + */
> +STATIC int
> +xfs_preflush_devices(
> +	struct xfs_mount	*mp)
> +{
> +	int			error;
> +
> +	error = xfs_buftarg_sync(mp->m_ddev_targp);
> +	if (error)
> +		return error;
> +
> +	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) {
> +		error = xfs_buftarg_sync(mp->m_ddev_targp);
> +		if (error)
> +			return error;
> +	}

Why does this duplicate all the logic instead of being folded into
xfs_open_devices?
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
On 05/05/2025 06:40, Christoph Hellwig wrote:
>> +/*
>> + * Flush and invalidate all devices' pagecaches before reading any metadata
>> + * because XFS doesn't use the bdev pagecache.
>> + */
>> +STATIC int
>> +xfs_preflush_devices(
>> +	struct xfs_mount	*mp)
>> +{
>> +	int			error;
>> +
>> +	error = xfs_buftarg_sync(mp->m_ddev_targp);
>> +	if (error)
>> +		return error;
>> +
>> +	if (mp->m_logdev_targp && mp->m_logdev_targp != mp->m_ddev_targp) {
>> +		error = xfs_buftarg_sync(mp->m_ddev_targp);
>> +		if (error)
>> +			return error;
>> +	}
> Why does this duplicate all the logic instead of being folded into
> xfs_open_devices?

So you mean an additive change like:

diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 64fbd089ef55..9fa538938e07 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -488,6 +488,9 @@ xfs_open_devices(
  	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_file);
  	if (!mp->m_ddev_targp)
  		goto out_close_rtdev;
+	error = sync_blockdev(mp->m_ddev_targp->bt_bdev);
+	if (error)
+		goto out_close_rtdev;

  	if (rtdev_file) {
@@ -495,6 +498,9 @@ xfs_open_devices(
  		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_file);
  		if (!mp->m_rtdev_targp)
  			goto out_free_ddev_targ;
+		error = sync_blockdev(mp->m_rtdev_targp->bt_bdev);
+		if (error)
+			goto out_free_ddev_targ;
  	}

  	if (logdev_file && file_bdev(logdev_file) != ddev) {
@@ -503,6 +509,9 @@ xfs_open_devices(
  		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
  		if (!mp->m_logdev_targp)
  			goto out_free_rtdev_targ;
+		error = sync_blockdev(mp->m_logdev_targp->bt_bdev);
+		if (error)
+			goto out_free_rtdev_targ;
  	} else {
  		mp->m_logdev_targp = mp->m_ddev_targp;
  		/* Handle won't be used, drop it */


Right?
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by Christoph Hellwig 7 months, 2 weeks ago
On Mon, May 05, 2025 at 11:04:55AM +0100, John Garry wrote:
> @@ -503,6 +509,9 @@ xfs_open_devices(
>  		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
>  		if (!mp->m_logdev_targp)
>  			goto out_free_rtdev_targ;
> +		error = sync_blockdev(mp->m_logdev_targp->bt_bdev);
> +		if (error)
> +			goto out_free_rtdev_targ;
>  	} else {
>  		mp->m_logdev_targp = mp->m_ddev_targp;
>  		/* Handle won't be used, drop it */
>
>
> Right?

Yes.  Or in fact just folding it into xfs_alloc_buftarg, which might
be even simpler.  While you're at it adding a command why we are doing
the sync would also be really useful, and having it in just one place
helps with that.
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
On 05/05/2025 11:49, Christoph Hellwig wrote:
> On Mon, May 05, 2025 at 11:04:55AM +0100, John Garry wrote:
>> @@ -503,6 +509,9 @@ xfs_open_devices(
>>   		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
>>   		if (!mp->m_logdev_targp)
>>   			goto out_free_rtdev_targ;
>> +		error = sync_blockdev(mp->m_logdev_targp->bt_bdev);
>> +		if (error)
>> +			goto out_free_rtdev_targ;
>>   	} else {
>>   		mp->m_logdev_targp = mp->m_ddev_targp;
>>   		/* Handle won't be used, drop it */
>>
>>
>> Right?
> Yes.  Or in fact just folding it into xfs_alloc_buftarg, which might
> be even simpler.

Yes, that was my next question..

>  While you're at it adding a command why we are doing
> the sync would also be really useful, and having it in just one place
> helps with that.

ok, there was such comment in xfs_preflush_devices().

@Darrick, please comment on whether happy with changes discussed.

Thanks,
John
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by Darrick J. Wong 7 months, 2 weeks ago
On Mon, May 05, 2025 at 11:55:13AM +0100, John Garry wrote:
> On 05/05/2025 11:49, Christoph Hellwig wrote:
> > On Mon, May 05, 2025 at 11:04:55AM +0100, John Garry wrote:
> > > @@ -503,6 +509,9 @@ xfs_open_devices(
> > >   		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
> > >   		if (!mp->m_logdev_targp)
> > >   			goto out_free_rtdev_targ;
> > > +		error = sync_blockdev(mp->m_logdev_targp->bt_bdev);
> > > +		if (error)
> > > +			goto out_free_rtdev_targ;
> > >   	} else {
> > >   		mp->m_logdev_targp = mp->m_ddev_targp;
> > >   		/* Handle won't be used, drop it */
> > > 
> > > 
> > > Right?
> > Yes.  Or in fact just folding it into xfs_alloc_buftarg, which might
> > be even simpler.
> 
> Yes, that was my next question..
> 
> >  While you're at it adding a command why we are doing
> > the sync would also be really useful, and having it in just one place
> > helps with that.
> 
> ok, there was such comment in xfs_preflush_devices().
> 
> @Darrick, please comment on whether happy with changes discussed.

I put the sync_blockdev calls in a separate function so that the
EIO/ENOSPC/whatever errors that come from the block device sync don't
get morphed into ENOMEM by xfs_alloc_buftarg before being passed up.  I
suppose we could make that function return an ERR_PTR, but I was trying
to avoid making even more changes at the last minute, again.

--D

> Thanks,
> John
> 
>
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
On 05/05/2025 15:22, Darrick J. Wong wrote:
>>> Yes.  Or in fact just folding it into xfs_alloc_buftarg, which might
>>> be even simpler.
>> Yes, that was my next question..
>>
>>>   While you're at it adding a command why we are doing
>>> the sync would also be really useful, and having it in just one place
>>> helps with that.
>> ok, there was such comment in xfs_preflush_devices().
>>
>> @Darrick, please comment on whether happy with changes discussed.
> I put the sync_blockdev calls in a separate function so that the
> EIO/ENOSPC/whatever errors that come from the block device sync don't
> get morphed into ENOMEM by xfs_alloc_buftarg before being passed up.  I
> suppose we could make that function return an ERR_PTR, but I was trying
> to avoid making even more changes at the last minute, again.

It seems simpler to just have the individual sync_blockdev() calls from 
xfs_alloc_buftarg(), rather than adding ERR_PTR() et al handling in both 
xfs_alloc_buftarg() and xfs_open_devices().
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
On 05/05/2025 15:48, John Garry wrote:
>>> @Darrick, please comment on whether happy with changes discussed.
>> I put the sync_blockdev calls in a separate function so that the
>> EIO/ENOSPC/whatever errors that come from the block device sync don't
>> get morphed into ENOMEM by xfs_alloc_buftarg before being passed up.  I
>> suppose we could make that function return an ERR_PTR, but I was trying
>> to avoid making even more changes at the last minute, again.
> 
> It seems simpler to just have the individual sync_blockdev() calls from 
> xfs_alloc_buftarg(), rather than adding ERR_PTR() et al handling in both 
> xfs_alloc_buftarg() and xfs_open_devices().

Which of the following is better:

------8<-----
int
@@ -1810,10 +1806,10 @@ xfs_alloc_buftarg(
  	 * When allocating the buftargs we have not yet read the super block and
  	 * thus don't know the file system sector size yet.
  	 */
-	if (xfs_setsize_buftarg(btp, bdev_logical_block_size(btp->bt_bdev)))
-		goto error_free;
-	if (xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
-			mp->m_super->s_id))
+	btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
+	btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
+
+	if (xfs_init_buftarg(btp, btp->bt_meta_sectorsize, mp->m_super->s_id))
  		goto error_free;

  	return btp;
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5e456a6073ca..48d5b630fe46 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -481,21 +481,38 @@ xfs_open_devices(

  	/*
  	 * Setup xfs_mount buffer target pointers
+	 *
+	 * Flush and invalidate all devices' pagecaches before reading any
+	 * metadata because XFS doesn't use the bdev pagecache.
  	 */
-	error = -ENOMEM;
  	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_file);
-	if (!mp->m_ddev_targp)
+	if (!mp->m_ddev_targp) {
+		error = -ENOMEM;
+		goto out_close_rtdev;
+	}
+	error = sync_blockdev(mp->m_ddev_targp->bt_bdev);
+	if (error)
  		goto out_close_rtdev;

  	if (rtdev_file) {
  		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_file);
-		if (!mp->m_rtdev_targp)
+		if (!mp->m_rtdev_targp) {
+			error = -ENOMEM;
+			goto out_free_ddev_targ;
+		}
+		error = sync_blockdev(mp->m_rtdev_targp->bt_bdev);
+		if (error)
  			goto out_free_ddev_targ;
  	}

  	if (logdev_file && file_bdev(logdev_file) != ddev) {
  		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
-		if (!mp->m_logdev_targp)
+		if (!mp->m_logdev_targp) {
+			error = -ENOMEM;
+			goto out_free_rtdev_targ;
+		}
+		error = sync_blockdev(mp->m_logdev_targp->bt_bdev);
+		if (error)
  			goto out_free_rtdev_targ;


----->8------

int
@@ -1786,6 +1782,8 @@ xfs_alloc_buftarg(
  {
  	struct xfs_buftarg	*btp;
  	const struct dax_holder_operations *ops = NULL;
+	int			error;
+

  #if defined(CONFIG_FS_DAX) && defined(CONFIG_MEMORY_FAILURE)
  	ops = &xfs_dax_holder_operations;
@@ -1806,21 +1804,31 @@ xfs_alloc_buftarg(
  						btp->bt_bdev);
  	}

+	/*
+	 * Flush and invalidate all devices' pagecaches before reading any
+	 * metadata because XFS doesn't use the bdev pagecache.
+	 */
+	error = sync_blockdev(mp->m_ddev_targp->bt_bdev);
+	if (error)
+		goto error_free;
+
  	/*
  	 * When allocating the buftargs we have not yet read the super block and
  	 * thus don't know the file system sector size yet.
  	 */
-	if (xfs_setsize_buftarg(btp, bdev_logical_block_size(btp->bt_bdev)))
-		goto error_free;
-	if (xfs_init_buftarg(btp, bdev_logical_block_size(btp->bt_bdev),
-			mp->m_super->s_id))
+	btp->bt_meta_sectorsize = bdev_logical_block_size(btp->bt_bdev);
+	btp->bt_meta_sectormask = btp->bt_meta_sectorsize - 1;
+
+	if (xfs_init_buftarg(btp, btp->bt_meta_sectorsize, mp->m_super->s_id)) {
+		error = -ENOMEM;
  		goto error_free;
+	}

  	return btp;

  error_free:
  	kfree(btp);
-	return NULL;
+	return ERR_PTR(error);
  }

  static inline void
diff --git a/fs/xfs/xfs_super.c b/fs/xfs/xfs_super.c
index 5e456a6073ca..4daf0cc480af 100644
--- a/fs/xfs/xfs_super.c
+++ b/fs/xfs/xfs_super.c
@@ -482,21 +482,26 @@ xfs_open_devices(
  	/*
  	 * Setup xfs_mount buffer target pointers
  	 */
-	error = -ENOMEM;
  	mp->m_ddev_targp = xfs_alloc_buftarg(mp, sb->s_bdev_file);
-	if (!mp->m_ddev_targp)
+	if (IS_ERR(mp->m_ddev_targp)) {
+		error = PTR_ERR(mp->m_ddev_targp);
  		goto out_close_rtdev;
+	}

  	if (rtdev_file) {
  		mp->m_rtdev_targp = xfs_alloc_buftarg(mp, rtdev_file);
-		if (!mp->m_rtdev_targp)
+		if (IS_ERR(mp->m_rtdev_targp)) {
+			error = PTR_ERR(mp->m_rtdev_targp);
  			goto out_free_ddev_targ;
+		}
  	}

  	if (logdev_file && file_bdev(logdev_file) != ddev) {
  		mp->m_logdev_targp = xfs_alloc_buftarg(mp, logdev_file);
-		if (!mp->m_logdev_targp)
+		if (IS_ERR(mp->m_logdev_targp)) {
+			error = PTR_ERR(mp->m_logdev_targp);
  			goto out_free_rtdev_targ;
+		}


----8<-----


Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by Christoph Hellwig 7 months, 2 weeks ago
On Mon, May 05, 2025 at 04:27:56PM +0100, John Garry wrote:
> On 05/05/2025 15:48, John Garry wrote:
>>>> @Darrick, please comment on whether happy with changes discussed.
>>> I put the sync_blockdev calls in a separate function so that the
>>> EIO/ENOSPC/whatever errors that come from the block device sync don't
>>> get morphed into ENOMEM by xfs_alloc_buftarg before being passed up.  I
>>> suppose we could make that function return an ERR_PTR, but I was trying
>>> to avoid making even more changes at the last minute, again.
>>
>> It seems simpler to just have the individual sync_blockdev() calls from 
>> xfs_alloc_buftarg(), rather than adding ERR_PTR() et al handling in both 
>> xfs_alloc_buftarg() and xfs_open_devices().
>
> Which of the following is better:

To me version 2 looks much better.  I had initial reservations as
ERR_PTR doesn't play well with userspace, but none of this code is
in libxfs, so that should be fine.
Re: [PATCH v11 02/16] xfs: only call xfs_setsize_buftarg once per buffer target
Posted by John Garry 7 months, 2 weeks ago
On 06/05/2025 05:22, Christoph Hellwig wrote:
>>> It seems simpler to just have the individual sync_blockdev() calls from
>>> xfs_alloc_buftarg(), rather than adding ERR_PTR() et al handling in both
>>> xfs_alloc_buftarg() and xfs_open_devices().
>> Which of the following is better:
> To me version 2 looks much better.  I had initial reservations as
> ERR_PTR doesn't play well with userspace, but none of this code is
> in libxfs, so that should be fine.

cool, that is what we/I had gone with in prep'ing v12