[RFC PATCH v4 29/34] ext4: fall back to buffer_head path for defrag

Zhang Yi posted 34 patches 1 year, 8 months ago
[RFC PATCH v4 29/34] ext4: fall back to buffer_head path for defrag
Posted by Zhang Yi 1 year, 8 months ago
From: Zhang Yi <yi.zhang@huawei.com>

Online defrag doesn't support iomap path yet, we have to fall back to
buffer_head path for the inode which has been using iomap. Changing
active inode is dangerous, before we start, we must hold the inode lock
and the mapping->invalidate_lock, and writeback all dirty folios and
drop the inode's pagecache.

Signed-off-by: Zhang Yi <yi.zhang@huawei.com>
---
 fs/ext4/move_extent.c | 34 ++++++++++++++++++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 7cd4afa4de1d..3db255385367 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -538,6 +538,34 @@ mext_check_arguments(struct inode *orig_inode,
 	return 0;
 }
 
+/*
+ * Disable buffered iomap path for the inode that requiring move extents,
+ * fallback to buffer_head path.
+ */
+static int ext4_disable_buffered_iomap_aops(struct inode *inode)
+{
+	int err;
+
+	/*
+	 * The buffered_head aops don't know how to handle folios
+	 * dirtied by iomap, so before falling back, flush all dirty
+	 * folios the inode has.
+	 */
+	filemap_invalidate_lock(inode->i_mapping);
+	err = filemap_write_and_wait(inode->i_mapping);
+	if (err < 0) {
+		filemap_invalidate_unlock(inode->i_mapping);
+		return err;
+	}
+	truncate_inode_pages(inode->i_mapping, 0);
+
+	ext4_clear_inode_state(inode, EXT4_STATE_BUFFERED_IOMAP);
+	ext4_set_aops(inode);
+	filemap_invalidate_unlock(inode->i_mapping);
+
+	return 0;
+}
+
 /**
  * ext4_move_extents - Exchange the specified range of a file
  *
@@ -609,6 +637,12 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
 	inode_dio_wait(orig_inode);
 	inode_dio_wait(donor_inode);
 
+	/* Fallback to buffer_head aops for inodes with buffered iomap aops */
+	if (ext4_test_inode_state(orig_inode, EXT4_STATE_BUFFERED_IOMAP))
+		ext4_disable_buffered_iomap_aops(orig_inode);
+	if (ext4_test_inode_state(donor_inode, EXT4_STATE_BUFFERED_IOMAP))
+		ext4_disable_buffered_iomap_aops(donor_inode);
+
 	/* Protect extent tree against block allocations via delalloc */
 	ext4_double_down_write_data_sem(orig_inode, donor_inode);
 	/* Check the filesystem environment whether move_extent can be done */
-- 
2.39.2
Re: [RFC PATCH v4 29/34] ext4: fall back to buffer_head path for defrag
Posted by Dave Chinner 1 year, 7 months ago
On Wed, Apr 10, 2024 at 11:03:08PM +0800, Zhang Yi wrote:
> From: Zhang Yi <yi.zhang@huawei.com>
> 
> Online defrag doesn't support iomap path yet, we have to fall back to
> buffer_head path for the inode which has been using iomap. Changing
> active inode is dangerous, before we start, we must hold the inode lock
> and the mapping->invalidate_lock, and writeback all dirty folios and
> drop the inode's pagecache.

Even then, I don't think this is obviously safe. We went through
this with DAX and we couldn't make it work safely.

Just return EOPNOTSUPP to the online defrag ioctl if iomap is in use
- that avoids all the excitement involved in doing dangerous things
like swapping aops structures on actively referenced inodes...

-Dave.
-- 
Dave Chinner
david@fromorbit.com
Re: [RFC PATCH v4 29/34] ext4: fall back to buffer_head path for defrag
Posted by Zhang Yi 1 year, 7 months ago
On 2024/5/1 17:32, Dave Chinner wrote:
> On Wed, Apr 10, 2024 at 11:03:08PM +0800, Zhang Yi wrote:
>> From: Zhang Yi <yi.zhang@huawei.com>
>>
>> Online defrag doesn't support iomap path yet, we have to fall back to
>> buffer_head path for the inode which has been using iomap. Changing
>> active inode is dangerous, before we start, we must hold the inode lock
>> and the mapping->invalidate_lock, and writeback all dirty folios and
>> drop the inode's pagecache.
> 
> Even then, I don't think this is obviously safe. We went through
> this with DAX and we couldn't make it work safely.
> 
> Just return EOPNOTSUPP to the online defrag ioctl if iomap is in use
> - that avoids all the excitement involved in doing dangerous things
> like swapping aops structures on actively referenced inodes...
> 

Okay, this is just a temporary solution to support defrag. I've been
looking at how to support defrag for iomap recently, I hope it could
be supported in the near future, so let's drop this dangerous
operation.

Thanks,
Yi.