From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9C99A2F1FD3; Mon, 13 Oct 2025 01:52:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320369; cv=none; b=ZOM4lsgYYH0ehxOuQIfdoUa9NrjQOe2nklyD5EAVjzN7s5iYHk9KE/+khKjz9NOWTEjosbQXn8khaK3E3ycQtYOEZ9HlUl23fBJFBnv7TQbznWht+FDRGr1+r74WuEM7xrwGwM3eDSNWu4UbblM93BGzcBnvvmYRaoNyfn23wuI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320369; c=relaxed/simple; bh=+xxR2hag0/PnI5LaSNRvX1EX5gGrNaoG80rtC97RI+Q=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Eiy08l+kZzp2VgntVJPD5XbRRoDERvzH6PZR3p9EFsq7LIHD3ErcVyzPrtpb6jw5qkUsmEk6+ErWHXpNoJL2Ceuzh2NAGiXrAGAX4f7foCf1ph7ft589pIc+gsAt9+HXzMlbI0IXcRDdcEUvHgJMg9SUx1Ol/VDTL8ZDB1htDg0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4clL1r6M4DzYQtlS; Mon, 13 Oct 2025 09:52:00 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id 7B2E81A06D7; Mon, 13 Oct 2025 09:52:39 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S5; Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: 
linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 01/12] ext4: correct the checking of quota files before moving extents Date: Mon, 13 Oct 2025 09:51:17 +0800 Message-ID: <20251013015128.499308-2-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S5 X-Coremail-Antispam: 1UD129KBjvdXoW7GF43Xr4UXr1DAFyfXrWrXwb_yoWDCrcEya yIyrWkZrsYva929rs5JFyrJr4IkF4rKFn8WFZ5Cr15ur1xXr4kGrnYqrnIyr98Ww4UKrZ8 ZFs7tryayryIgjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbkxFF20E14v26ryj6rWUM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUGwA2048vs2IY02 0Ec7CjxVAFwI0_Gr0_Xr1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r1Y6r17McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxkF7I0En4kS14v26r1q6r43MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j 6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7 AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE 2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcV C2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2Kfnx nUUI43ZEXa7VUjc18JUUUUU== X-CM-SenderInfo: 
d1lo6xhdqjqx5xdzvxpfor3voofrz/
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

The move extent operation should return -EOPNOTSUPP if either of the
inodes is a quota inode, rather than only when both are quota inodes.

Fixes: 02749a4c2082 ("ext4: add ext4_is_quota_file()")
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/move_extent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 4b091c21908f..0f4b7c89edd3 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -485,7 +485,7 @@ mext_check_arguments(struct inode *orig_inode,
 		return -ETXTBSY;
 	}
 
-	if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
+	if (ext4_is_quota_file(orig_inode) || ext4_is_quota_file(donor_inode)) {
 		ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
 			orig_inode->i_ino, donor_inode->i_ino);
 		return -EOPNOTSUPP;
-- 
2.46.1
From nobody Wed Dec 17 18:14:13 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 02/12] ext4: introduce seq counter for the extent status entry
Date: Mon, 13 Oct 2025 09:51:18 +0800
Message-ID: <20251013015128.499308-3-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
References: <20251013015128.499308-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

In iomap_write_iter(), the iomap buffered write framework does not hold
any locks between querying the inode extent mapping info and performing
page cache writes. As a result, the extent mapping can be changed by
concurrent I/O in flight. Similarly, in iomap_writepage_map(), the
write-back process faces the same problem: concurrent changes can
invalidate the extent mapping before the I/O is submitted.

Therefore, both of these processes must recheck the mapping info after
acquiring the folio lock. To address this, similar to XFS, we propose
introducing an extent sequence number to serve as a validity cookie for
the extent.
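For readers unfamiliar with the validity-cookie idea, the following is a
minimal userspace sketch of it, not code from this series. All names
(toy_extent_map, toy_mapping, toy_lookup, toy_remap, toy_mapping_valid)
are hypothetical and exist only for illustration; real code samples the
counter under the appropriate lock, which this sketch leaves out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of a per-inode mapping; not an ext4 structure. */
struct toy_extent_map {
	uint64_t seq;	/* bumped on every change to the mapping */
	uint64_t pblk;	/* physical block currently backing the range */
};

/* A sampled mapping plus the cookie taken at sampling time. */
struct toy_mapping {
	uint64_t pblk;
	uint64_t seq;
};

/* Sample the mapping together with the current sequence number. */
static struct toy_mapping toy_lookup(const struct toy_extent_map *m)
{
	struct toy_mapping map = { .pblk = m->pblk, .seq = m->seq };

	return map;
}

/* Any change to the mapping invalidates previously sampled cookies. */
static void toy_remap(struct toy_extent_map *m, uint64_t new_pblk)
{
	m->pblk = new_pblk;
	m->seq++;
}

/* Recheck done later (e.g. after locking a folio): stale means re-query. */
static bool toy_mapping_valid(const struct toy_extent_map *m,
			      const struct toy_mapping *map)
{
	return map->seq == m->seq;
}
```

A caller would sample with toy_lookup(), take whatever lock protects the
page, and fall back to a fresh lookup whenever toy_mapping_valid()
returns false.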
After commit 24b7a2331fcd ("ext4: clarify the rules for modifying
extents"), we can ensure that extent information is always processed
through the extent status tree, and that the extent status tree is always
up to date under i_rwsem, invalidate_lock or the folio lock, so it is
safe to introduce this sequence number. The sequence number will be
increased whenever the extent status tree changes, preparing for the
buffered write iomap conversion.

Besides, this mechanism is also applicable to the move extents case. In
move_extent_per_page(), it also needs to reacquire data_sem and check the
mapping info again under the folio lock.

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/ext4.h              |  2 ++
 fs/ext4/extents_status.c    | 25 +++++++++++++++++++++----
 fs/ext4/super.c             |  1 +
 include/trace/events/ext4.h | 23 +++++++++++++++--------
 4 files changed, 39 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 57087da6c7be..eff97b3a1093 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1138,6 +1138,8 @@ struct ext4_inode_info {
 	ext4_lblk_t i_es_shrink_lblk;	/* Offset where we start searching for
 					   extents to shrink. Protected by
 					   i_es_lock */
+	u64 i_es_seq;			/* Change counter for extents.
+					   Protected by i_es_lock */
 
 	/* ialloc */
 	ext4_group_t i_last_alloc_group;
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 31dc0496f8d0..c3daa57ecd35 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -235,6 +235,13 @@ static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
 	return es->es_lblk + es->es_len - 1;
 }
 
+static inline void ext4_es_inc_seq(struct inode *inode)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	WRITE_ONCE(ei->i_es_seq, ei->i_es_seq + 1);
+}
+
 /*
  * search through the tree for an delayed extent with a given offset. If
  * it can't be found, try to find next extent.
@@ -906,7 +913,6 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	newes.es_lblk = lblk;
 	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, pblk, status);
-	trace_ext4_es_insert_extent(inode, &newes);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -955,6 +961,11 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		}
 		pending = err3;
 	}
+	/*
+	 * TODO: For cache on-disk extents, there is no need to increment
+	 * the sequence counter, this requires future optimization.
+	 */
+	ext4_es_inc_seq(inode);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	/*
@@ -981,6 +992,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
+	trace_ext4_es_insert_extent(inode, &newes);
 	ext4_es_print_tree(inode);
 	return;
 }
@@ -1550,7 +1562,6 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
 
-	trace_ext4_es_remove_extent(inode, lblk, len);
 	es_debug("remove [%u/%u) from extent status tree of inode %lu\n",
 		 lblk, len, inode->i_ino);
 
@@ -1570,16 +1581,21 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	 */
 	write_lock(&EXT4_I(inode)->i_es_lock);
 	err = __es_remove_extent(inode, lblk, end, &reserved, es);
+	if (err)
+		goto error;
 	/* Free preallocated extent if it didn't get used. */
 	if (es) {
 		if (!es->es_len)
 			__es_free_extent(es);
 		es = NULL;
 	}
+	ext4_es_inc_seq(inode);
+error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err)
 		goto retry;
 
+	trace_ext4_es_remove_extent(inode, lblk, len);
 	ext4_es_print_tree(inode);
 	ext4_da_release_space(inode, reserved);
 }
@@ -2140,8 +2156,6 @@ void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
 	newes.es_lblk = lblk;
 	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, ~0, EXTENT_STATUS_DELAYED);
-	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
-					    end_allocated);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -2196,11 +2210,14 @@ void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
 			pr2 = NULL;
 		}
 	}
+	ext4_es_inc_seq(inode);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
+	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
+					    end_allocated);
 	ext4_es_print_tree(inode);
 	ext4_print_pending_tree(inode);
 	return;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 33e7c08c9529..760c9d7588be 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1406,6 +1406,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 	ei->i_es_all_nr = 0;
 	ei->i_es_shk_nr = 0;
 	ei->i_es_shrink_lblk = 0;
+	ei->i_es_seq = 0;
 	ei->i_reserved_data_blocks = 0;
 	spin_lock_init(&(ei->i_block_reservation_lock));
 	ext4_init_pending_tree(&ei->i_pending_tree);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index a374e7ea7e57..6a0754d38acf 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2210,7 +2210,8 @@ DECLARE_EVENT_CLASS(ext4__es_extent,
 		__field(	ext4_lblk_t,	lblk		)
 		__field(	ext4_lblk_t,	len		)
 		__field(	ext4_fsblk_t,	pblk		)
-		__field(	char, status	)
+		__field(	char,		status		)
+		__field(	u64,		seq		)
 	),
 
 	TP_fast_assign(
@@ -2220,13 +2221,15 @@ DECLARE_EVENT_CLASS(ext4__es_extent,
 		__entry->len	= es->es_len;
 		__entry->pblk	= ext4_es_show_pblock(es);
 		__entry->status	= ext4_es_status(es);
+		__entry->seq	= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s",
+	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->lblk, __entry->len,
-		  __entry->pblk, show_extent_status(__entry->status))
+		  __entry->pblk, show_extent_status(__entry->status),
+		  __entry->seq)
 );
 
 DEFINE_EVENT(ext4__es_extent, ext4_es_insert_extent,
@@ -2251,6 +2254,7 @@ TRACE_EVENT(ext4_es_remove_extent,
 		__field(	ino_t,	ino	)
 		__field(	loff_t,	lblk	)
 		__field(	loff_t,	len	)
+		__field(	u64,	seq	)
 	),
 
 	TP_fast_assign(
@@ -2258,12 +2262,13 @@ TRACE_EVENT(ext4_es_remove_extent,
 		__entry->ino	= inode->i_ino;
 		__entry->lblk	= lblk;
 		__entry->len	= len;
+		__entry->seq	= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%lld/%lld)",
+	TP_printk("dev %d,%d ino %lu es [%lld/%lld) seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
-		  __entry->lblk, __entry->len)
+		  __entry->lblk, __entry->len, __entry->seq)
 );
 
 TRACE_EVENT(ext4_es_find_extent_range_enter,
@@ -2523,6 +2528,7 @@ TRACE_EVENT(ext4_es_insert_delayed_extent,
 		__field(	char,	status		)
 		__field(	bool,	lclu_allocated	)
 		__field(	bool,	end_allocated	)
+		__field(	u64,	seq		)
 	),
 
 	TP_fast_assign(
@@ -2534,15 +2540,16 @@ TRACE_EVENT(ext4_es_insert_delayed_extent,
 		__entry->status		= ext4_es_status(es);
 		__entry->lclu_allocated	= lclu_allocated;
 		__entry->end_allocated	= end_allocated;
+		__entry->seq		= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s "
-		  "allocated %d %d",
+	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s allocated %d %d seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->lblk, __entry->len,
 		  __entry->pblk, show_extent_status(__entry->status),
-		  __entry->lclu_allocated, __entry->end_allocated)
+		  __entry->lclu_allocated, __entry->end_allocated,
+		  __entry->seq)
 );
 
 /* fsmap traces */
-- 
2.46.1
From nobody Wed Dec 17 18:14:13 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 03/12] ext4: make ext4_es_lookup_extent() pass out the extent seq counter
Date: Mon, 13 Oct 2025 09:51:19 +0800
Message-ID: <20251013015128.499308-4-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
References: <20251013015128.499308-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

When querying extents in the extent status tree, we should hold the
data_sem if we want to obtain the sequence number as a valid cookie
simultaneously. However, currently, ext4_map_blocks() calls
ext4_es_lookup_extent() without holding data_sem. Therefore, we should
acquire i_es_lock instead, which also ensures that the sequence cookie
and the extent remain consistent. Consequently, make
ext4_es_lookup_extent() pass out the sequence number when necessary.

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/extents.c        | 2 +-
 fs/ext4/extents_status.c | 6 ++++--
 fs/ext4/extents_status.h | 2 +-
 fs/ext4/inode.c          | 8 ++++----
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ca5499e9412b..c7d219e6c6d8 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2213,7 +2213,7 @@ static int ext4_fill_es_cache_info(struct inode *inode,
 	while (block <= end) {
 		next = 0;
 		flags = 0;
-		if (!ext4_es_lookup_extent(inode, block, &next, &es))
+		if (!ext4_es_lookup_extent(inode, block, &next, &es, NULL))
 			break;
 		if (ext4_es_is_unwritten(&es))
 			flags |= FIEMAP_EXTENT_UNWRITTEN;
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index c3daa57ecd35..e04fbf10fe4f 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1039,8 +1039,8 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
  * Return: 1 on found, 0 on not
  */
 int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
-			  ext4_lblk_t *next_lblk,
-			  struct extent_status *es)
+			  ext4_lblk_t *next_lblk, struct extent_status *es,
+			  u64 *pseq)
 {
 	struct ext4_es_tree *tree;
 	struct ext4_es_stats *stats;
@@ -1099,6 +1099,8 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 			} else
 				*next_lblk = 0;
 		}
+		if (pseq)
+			*pseq = EXT4_I(inode)->i_es_seq;
 	} else {
 		percpu_counter_inc(&stats->es_stats_cache_misses);
 	}
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 8f9c008d11e8..f3396cf32b44 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -148,7 +148,7 @@ extern void ext4_es_find_extent_range(struct inode *inode,
 				      struct extent_status *es);
 extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 				 ext4_lblk_t *next_lblk,
-				 struct extent_status *es);
+				 struct extent_status *es, u64 *pseq);
 extern bool ext4_es_scan_range(struct inode *inode,
 			       int (*matching_fn)(struct extent_status *es),
 			       ext4_lblk_t lblk, ext4_lblk_t end);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index f9e4ac87211e..10792772b450 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -649,7 +649,7 @@ static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
 	 * extent status tree.
 	 */
 	if (flags & EXT4_GET_BLOCKS_PRE_IO &&
-	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es))
 			return retval;
 	}
@@ -723,7 +723,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
 			map->m_pblk = ext4_es_pblock(&es) +
 					map->m_lblk - es.es_lblk;
@@ -1908,7 +1908,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
@@ -1961,7 +1961,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	 * is held in write mode, before inserting a new da entry in
 	 * the extent status tree.
 	 */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
-- 
2.46.1
From nobody Wed Dec 17 18:14:13 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 04/12] ext4: pass out extent seq counter when mapping blocks
Date: Mon, 13 Oct 2025 09:51:20 +0800
Message-ID: <20251013015128.499308-5-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
References: <20251013015128.499308-1-yi.zhang@huaweicloud.com>

From: Zhang Yi

When creating or querying mapping blocks using the ext4_map_blocks() and
ext4_map_{query|create}_blocks() helpers, also pass out the extent
sequence number of the block mapping info through the ext4_map_blocks
structure. This sequence number can later serve as a valid cookie within
the iomap infrastructure and the move extents procedure.

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/ext4.h  |  1 +
 fs/ext4/inode.c | 24 ++++++++++++++++--------
 2 files changed, 17 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index eff97b3a1093..9f127aedbaee 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -260,6 +260,7 @@ struct ext4_map_blocks {
 	ext4_lblk_t m_lblk;
 	unsigned int m_len;
 	unsigned int m_flags;
+	u64 m_seq;
 };
 
 /*
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 10792772b450..ad8deae1c7c3 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -550,10 +550,13 @@ static int ext4_map_query_blocks(handle_t *handle, struct inode *inode,
 		retval = ext4_ext_map_blocks(handle, inode, map, flags);
 	else
 		retval = ext4_ind_map_blocks(handle, inode, map, flags);
-
-	if (retval <= 0)
+	if (retval < 0)
 		return retval;
 
+	/* A hole? */
+	if (retval == 0)
+		goto out;
+
 	if (unlikely(retval != map->m_len)) {
 		ext4_warning(inode->i_sb,
 			     "ES len assertion failed for inode "
@@ -573,11 +576,13 @@ static int ext4_map_query_blocks(handle_t *handle, struct inode *inode,
 				EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
 		ext4_es_insert_extent(inode, map->m_lblk, map->m_len,
 				      map->m_pblk, status, false);
-		return retval;
+	} else {
+		retval = ext4_map_query_blocks_next_in_leaf(handle, inode, map,
+							    orig_mlen);
 	}
-
-	return ext4_map_query_blocks_next_in_leaf(handle, inode, map,
-						  orig_mlen);
+out:
+	map->m_seq = READ_ONCE(EXT4_I(inode)->i_es_seq);
+	return retval;
 }
 
 static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
@@ -649,7 +654,7 @@ static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
 	 * extent status tree.
 	 */
 	if (flags & EXT4_GET_BLOCKS_PRE_IO &&
-	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
+	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) {
 		if (ext4_es_is_written(&es))
 			return retval;
 	}
@@ -658,6 +663,7 @@ static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
 			EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN;
 	ext4_es_insert_extent(inode, map->m_lblk, map->m_len, map->m_pblk,
 			      status, flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE);
+	map->m_seq = READ_ONCE(EXT4_I(inode)->i_es_seq);
 
 	return retval;
 }
@@ -723,7 +729,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) {
 		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
 			map->m_pblk = ext4_es_pblock(&es) +
 					map->m_lblk - es.es_lblk;
@@ -1979,6 +1985,8 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 
 	map->m_flags |= EXT4_MAP_DELAYED;
 	retval = ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len);
+	if (!retval)
+		map->m_seq = READ_ONCE(EXT4_I(inode)->i_es_seq);
 	up_write(&EXT4_I(inode)->i_data_sem);
 
 	return retval;
-- 
2.46.1
From nobody Wed Dec 17 18:14:13 2025
+0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S9; Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 05/12] ext4: use EXT4_B_TO_LBLK() in mext_check_arguments() Date: Mon, 13 Oct 2025 09:51:21 +0800 Message-ID: <20251013015128.499308-6-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S9 X-Coremail-Antispam: 1UD129KBjvJXoW7Ar1fCFy3CrW5ur1rZry8Grg_yoW8GF1Dp3 WIyan5C3yqqa4Y9w109F1Iv348Ka1xGr47XrWfJr4UWay0kFyFgF15Kan8Aa4jqrWkJ34r ZFn2kr17Xw15G3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 
42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUo73vUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Switch to using EXT4_B_TO_LBLK() to calculate the EOF position of the origin and donor inodes, instead of using open-coded calculations. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/move_extent.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 0f4b7c89edd3..6175906c7119 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -461,12 +461,6 @@ mext_check_arguments(struct inode *orig_inode, __u64 donor_start, __u64 *len) { __u64 orig_eof, donor_eof; - unsigned int blkbits =3D orig_inode->i_blkbits; - unsigned int blocksize =3D 1 << blkbits; - - orig_eof =3D (i_size_read(orig_inode) + blocksize - 1) >> blkbits; - donor_eof =3D (i_size_read(donor_inode) + blocksize - 1) >> blkbits; - =20 if (donor_inode->i_mode & (S_ISUID|S_ISGID)) { ext4_debug("ext4 move extent: suid or sgid is set" @@ -526,6 +520,9 @@ mext_check_arguments(struct inode *orig_inode, orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } + + orig_eof =3D EXT4_B_TO_LBLK(orig_inode, i_size_read(orig_inode)); + donor_eof =3D EXT4_B_TO_LBLK(donor_inode, i_size_read(donor_inode)); if (orig_eof <=3D orig_start) *len =3D 0; else if (orig_eof < orig_start + *len - 1) --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CBDB2F1FD7; Mon, 13 Oct 2025 01:52:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1760320370; cv=none; b=TjIgQGCyFnDzu1sOfMstTT9IU/8CPb5kXGEPJuBE0l12QYQvvESN/Rqr2Tl3rDL+gBQqT5NHrugz3RZp5jq30k3rk7IFO5vwB1ri6BHcpDFLwox+dK6MPz0A83RGooVPVn1L8EneywXIslkFGVfDJOzK85z+Xtw6vGNYfxf3f24= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320370; c=relaxed/simple; bh=WEJ8oXQIdcgfihMlXbbPwQ4Nqs/lYadab2UUZf07uIY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lXDFzYzSvZekBtFQyUm4+k779D52iVIuECCJLu0oR4UBt4kM6wcLdlQhfw52+xbfd5SGhGps28QslBxQPqtxdxdOGiHvylWRUb5VYqmiheZLu4ZHXDDNHoWy8+Jl1MHMBKNwcWwIsQ/Ek+CUmfXWgCwV+NqO8sUDs7ikuaSNz8A= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4clL1s1tsTzYQtl9; Mon, 13 Oct 2025 09:52:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id D2A961A14DD; Mon, 13 Oct 2025 09:52:39 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S10; Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 06/12] ext4: add mext_check_validity() to do basic check Date: Mon, 13 Oct 2025 09:51:22 +0800 Message-ID: <20251013015128.499308-7-yi.zhang@huaweicloud.com> X-Mailer: 
git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S10 X-Coremail-Antispam: 1UD129KBjvJXoW3Xr43Ar45tFWUurWftrWDJwb_yoW7XrWxpF yxCr15X34UXas0k3yrtFsxXr1Y93WxKr42grZ3Xw48ZFWDCF9Igw1kGF4vv3WUtrWDJ3y0 qF42kry7ua17JaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUo73vUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Currently, the basic validation checks during the move extent operation are scattered across __ext4_ioctl() and ext4_move_extents(), which makes the code somewhat disorganized. Introduce a new helper, mext_check_validity(), to handle these checks. This change involves only code relocation without any logical modifications. 
Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/ioctl.c | 10 ----- fs/ext4/move_extent.c | 102 +++++++++++++++++++++++++++--------------- 2 files changed, 65 insertions(+), 47 deletions(-) diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index a93a7baae990..366a9b615bf0 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -1641,16 +1641,6 @@ static long __ext4_ioctl(struct file *filp, unsigned= int cmd, unsigned long arg) if (!(fd_file(donor)->f_mode & FMODE_WRITE)) return -EBADF; =20 - if (ext4_has_feature_bigalloc(sb)) { - ext4_msg(sb, KERN_ERR, - "Online defrag not supported with bigalloc"); - return -EOPNOTSUPP; - } else if (IS_DAX(inode)) { - ext4_msg(sb, KERN_ERR, - "Online defrag not supported with DAX"); - return -EOPNOTSUPP; - } - err =3D mnt_want_write_file(filp); if (err) return err; diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 6175906c7119..cdd175d5c1f3 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -442,6 +442,68 @@ move_extent_per_page(struct file *o_filp, struct inode= *donor_inode, goto unlock_folios; } =20 +/* + * Check the validity of the basic filesystem environment and the + * inodes' support status. 
+ */ +static int mext_check_validity(struct inode *orig_inode, + struct inode *donor_inode) +{ + struct super_block *sb =3D orig_inode->i_sb; + + /* origin and donor should be different inodes */ + if (orig_inode =3D=3D donor_inode) { + ext4_debug("ext4 move extent: The argument files should not be same inod= e [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + /* origin and donor should belong to the same filesystem */ + if (orig_inode->i_sb !=3D donor_inode->i_sb) { + ext4_debug("ext4 move extent: The argument files should be in same FS [i= no:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + /* Regular file check */ + if (!S_ISREG(orig_inode->i_mode) || !S_ISREG(donor_inode->i_mode)) { + ext4_debug("ext4 move extent: The argument files should be regular file = [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + if (ext4_has_feature_bigalloc(sb)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with bigalloc"); + return -EOPNOTSUPP; + } + + if (IS_DAX(orig_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with DAX"); + return -EOPNOTSUPP; + } + + /* + * TODO: it's not obvious how to swap blocks for inodes with full + * journaling enabled. 
+ */ + if (ext4_should_journal_data(orig_inode) || + ext4_should_journal_data(donor_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with data journaling"); + return -EOPNOTSUPP; + } + + if (IS_ENCRYPTED(orig_inode) || IS_ENCRYPTED(donor_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported for encrypted files"); + return -EOPNOTSUPP; + } + + return 0; +} + /** * mext_check_arguments - Check whether move extent can be done * @@ -567,43 +629,9 @@ ext4_move_extents(struct file *o_filp, struct file *d_= filp, __u64 orig_blk, ext4_lblk_t d_start =3D donor_blk; int ret; =20 - if (orig_inode->i_sb !=3D donor_inode->i_sb) { - ext4_debug("ext4 move extent: The argument files " - "should be in same FS [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* orig and donor should be different inodes */ - if (orig_inode =3D=3D donor_inode) { - ext4_debug("ext4 move extent: The argument files should not " - "be same inode [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* Regular file check */ - if (!S_ISREG(orig_inode->i_mode) || !S_ISREG(donor_inode->i_mode)) { - ext4_debug("ext4 move extent: The argument files should be " - "regular file [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* TODO: it's not obvious how to swap blocks for inodes with full - journaling enabled */ - if (ext4_should_journal_data(orig_inode) || - ext4_should_journal_data(donor_inode)) { - ext4_msg(orig_inode->i_sb, KERN_ERR, - "Online defrag not supported with data journaling"); - return -EOPNOTSUPP; - } - - if (IS_ENCRYPTED(orig_inode) || IS_ENCRYPTED(donor_inode)) { - ext4_msg(orig_inode->i_sb, KERN_ERR, - "Online defrag not supported for encrypted files"); - return -EOPNOTSUPP; - } + ret =3D mext_check_validity(orig_inode, donor_inode); + if (ret) + return ret; =20 /* Protect orig and donor inodes against a truncate */ 
lock_two_nondirectories(orig_inode, donor_inode); --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9CAED2F1FD6; Mon, 13 Oct 2025 01:52:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320370; cv=none; b=L7gDchPHE8JXmuMbNZtEaYRUME9x/Z7CIAuU1r6uNadevnRGStWXHPrgvdir9VfYKLXFdsUuFrTNMbZrIcBVnMIqkK2Q4tFx8L8pDOOtqd+qmSiKn4BpG5NZ12BoZmEN9WSunLQYCPlkoOn7jsRq4j/YmQdWUIXSz7t0D47DJU4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320370; c=relaxed/simple; bh=0TCdYGIC4kbIMGTSuIFa4/EzZ7AtsSeMSqB2nFU6Lyc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Rgj4DajN2o4+QY/VpmWvtJvZPU8T604cS0WqSXZMu72gd3TzY1z7RCbTU1PnQt/SINo/TVv1zr4GdhbhwxAdvRxeKWADD8y+O8I4jtSNvRvCRrzroIM+8ZAqGOWNuqR4s/1PPe2zHbkzI1egBeALXSDctT5TQ9TXpQVIKA+kmSo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4clL1s2cD2zYQtn6; Mon, 13 Oct 2025 09:52:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id EBB8A1A0F69; Mon, 13 Oct 2025 09:52:39 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S11; 
Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 07/12] ext4: refactor mext_check_arguments() Date: Mon, 13 Oct 2025 09:51:23 +0800 Message-ID: <20251013015128.499308-8-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S11 X-Coremail-Antispam: 1UD129KBjvJXoW3Wr4xGF1fCF17GF4xKr43trb_yoWxKFyUpF yxCry5Xw4vqayFg3yvyrsrX34Fk3W7Gr47XrZ7Xw18uFy8Ary2ga4UJa1vqF9xJrWUJ34a vF40yrnruw1rJaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI 
WIevJa73UjIFyTuYvjfUo73vUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi When moving extents, mext_check_validity() performs some basic file system and file checks. However, some essential checks that need to be performed after acquiring the i_rwsem are still scattered in mext_check_arguments(). Move those checks into mext_check_validity() and make it execute entirely under the i_rwsem to make the checks clearer. Furthermore, rename mext_check_arguments() to mext_check_adjust_range(), as it only performs checks and length adjustments on the move extent range. Finally, also change the print message for the non-extent file check to be consistent with other unsupported checks. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/move_extent.c | 97 +++++++++++++++++++------------------------ 1 file changed, 43 insertions(+), 54 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index cdd175d5c1f3..0191a3c746db 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -501,60 +501,36 @@ static int mext_check_validity(struct inode *orig_ino= de, return -EOPNOTSUPP; } =20 - return 0; -} - -/** - * mext_check_arguments - Check whether move extent can be done - * - * @orig_inode: original inode - * @donor_inode: donor inode - * @orig_start: logical start offset in block for orig - * @donor_start: logical start offset in block for donor - * @len: the number of blocks to be moved - * - * Check the arguments of ext4_move_extents() whether the files can be - * exchanged with each other. - * Return 0 on success, or a negative error value on failure. 
- */ -static int -mext_check_arguments(struct inode *orig_inode, - struct inode *donor_inode, __u64 orig_start, - __u64 donor_start, __u64 *len) -{ - __u64 orig_eof, donor_eof; + /* Ext4 move extent supports only extent based file */ + if (!(ext4_test_inode_flag(orig_inode, EXT4_INODE_EXTENTS)) || + !(ext4_test_inode_flag(donor_inode, EXT4_INODE_EXTENTS))) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported for non-extent files"); + return -EOPNOTSUPP; + } =20 if (donor_inode->i_mode & (S_ISUID|S_ISGID)) { - ext4_debug("ext4 move extent: suid or sgid is set" - " to donor file [ino:orig %lu, donor %lu]\n", + ext4_debug("ext4 move extent: suid or sgid is set to donor file [ino:ori= g %lu, donor %lu]\n", orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 - if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode)) + if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode)) { + ext4_debug("ext4 move extent: donor should not be immutable or append fi= le [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EPERM; + } =20 /* Ext4 move extent does not support swap files */ if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) { ext4_debug("ext4 move extent: The argument files should not be swap file= s [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); + orig_inode->i_ino, donor_inode->i_ino); return -ETXTBSY; } =20 if (ext4_is_quota_file(orig_inode) || ext4_is_quota_file(donor_inode)) { ext4_debug("ext4 move extent: The argument files should not be quota fil= es [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EOPNOTSUPP; - } - - /* Ext4 move extent supports only extent based file */ - if (!(ext4_test_inode_flag(orig_inode, EXT4_INODE_EXTENTS))) { - ext4_debug("ext4 move extent: orig file is not extents " - "based file [ino:orig %lu]\n", orig_inode->i_ino); - return -EOPNOTSUPP; - } else if (!(ext4_test_inode_flag(donor_inode, EXT4_INODE_EXTENTS))) { - ext4_debug("ext4 move 
extent: donor file is not extents " - "based file [ino:donor %lu]\n", donor_inode->i_ino); + orig_inode->i_ino, donor_inode->i_ino); return -EOPNOTSUPP; } =20 @@ -563,12 +539,25 @@ mext_check_arguments(struct inode *orig_inode, return -EINVAL; } =20 + return 0; +} + +/* + * Check the moving range of ext4_move_extents() whether the files can be + * exchanged with each other, and adjust the length to fit within the file + * size. Return 0 on success, or a negative error value on failure. + */ +static int mext_check_adjust_range(struct inode *orig_inode, + struct inode *donor_inode, __u64 orig_start, + __u64 donor_start, __u64 *len) +{ + __u64 orig_eof, donor_eof; + /* Start offset should be same */ if ((orig_start & ~(PAGE_MASK >> orig_inode->i_blkbits)) !=3D (donor_start & ~(PAGE_MASK >> orig_inode->i_blkbits))) { - ext4_debug("ext4 move extent: orig and donor's start " - "offsets are not aligned [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); + ext4_debug("ext4 move extent: orig and donor's start offsets are not ali= gned [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -577,9 +566,9 @@ mext_check_arguments(struct inode *orig_inode, (*len > EXT_MAX_BLOCKS) || (donor_start + *len >=3D EXT_MAX_BLOCKS) || (orig_start + *len >=3D EXT_MAX_BLOCKS)) { - ext4_debug("ext4 move extent: Can't handle over [%u] blocks " - "[ino:orig %lu, donor %lu]\n", EXT_MAX_BLOCKS, - orig_inode->i_ino, donor_inode->i_ino); + ext4_debug("ext4 move extent: Can't handle over [%u] blocks [ino:orig %l= u, donor %lu]\n", + EXT_MAX_BLOCKS, + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -594,9 +583,8 @@ mext_check_arguments(struct inode *orig_inode, else if (donor_eof < donor_start + *len - 1) *len =3D donor_eof - donor_start; if (!*len) { - ext4_debug("ext4 move extent: len should not be 0 " - "[ino:orig %lu, donor %lu]\n", orig_inode->i_ino, - donor_inode->i_ino); + ext4_debug("ext4 move extent: len should 
not be 0 [ino:orig %lu, donor %= lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -629,22 +617,22 @@ ext4_move_extents(struct file *o_filp, struct file *d= _filp, __u64 orig_blk, ext4_lblk_t d_start =3D donor_blk; int ret; =20 - ret =3D mext_check_validity(orig_inode, donor_inode); - if (ret) - return ret; - /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); =20 + ret =3D mext_check_validity(orig_inode, donor_inode); + if (ret) + goto unlock; + /* Wait for all existing dio workers */ inode_dio_wait(orig_inode); inode_dio_wait(donor_inode); =20 /* Protect extent tree against block allocations via delalloc */ ext4_double_down_write_data_sem(orig_inode, donor_inode); - /* Check the filesystem environment whether move_extent can be done */ - ret =3D mext_check_arguments(orig_inode, donor_inode, orig_blk, - donor_blk, &len); + /* Check and adjust the specified move_extent range. */ + ret =3D mext_check_adjust_range(orig_inode, donor_inode, orig_blk, + donor_blk, &len); if (ret) goto out; o_end =3D o_start + len; @@ -725,6 +713,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_f= ilp, __u64 orig_blk, =20 ext4_free_ext_path(path); ext4_double_up_write_data_sem(orig_inode, donor_inode); +unlock: unlock_two_nondirectories(orig_inode, donor_inode); =20 return ret; --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BECBD2F5487; Mon, 13 Oct 2025 01:52:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320374; cv=none; 
b=iiFSqb/jfSzAIXYEf1LhsZWrbSCbBvHyeD1tZeq/z5FK3L2Jy9OcQllbMnntHM6fqLNo95vEJMRooda8UqlF1Me5rCf1G8axQUXapNpkqziHxhValbezFfot+2RuwsD++8+/STpAJuqJEJkE5kD0DWv60+KlXGQh49iXY077c2o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320374; c=relaxed/simple; bh=q8mg2kuq2sHXCI6ihLB3DjnwFpvHDG62+/A3Yw/8XG8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=d7qXE38xLuHBarAm2nVCISNTYo3q8Zzu2QFntjF8vpW7FiKwCHQpAdYFV9CcK7Y3N1xEi/i9ZpFaXAoIQjcvBaAwfgbJVYP7vVwVM8DpgDXeSU2v1mX8PEfVfyyu3H32yBSm8eKodMCeEIWgBMdxthDLF6cjQIA3XxMhV06UCH8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4clL1s2plvzYQtnf; Mon, 13 Oct 2025 09:52:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id 03B8A1A133D; Mon, 13 Oct 2025 09:52:40 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S12; Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 08/12] ext4: rename mext_page_mkuptodate() to mext_folio_mkuptodate() Date: Mon, 13 Oct 2025 09:51:24 +0800 Message-ID: <20251013015128.499308-9-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: 
<20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S12 X-Coremail-Antispam: 1UD129KBjvJXoW7Jr1DtF17ur4kWw1Dtw45GFg_yoW8JF13pF y7Ca98trW8Zw1xuwn7JFnrZr45tay7Kr4UWFWfGw1SkFy7tFy0gF1UKa15ZFWFgFWkJrs5 uF4fKr1jqayUt3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUo73vUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi mext_page_mkuptodate() no longer works on a single page, so rename it to mext_folio_mkuptodate(). 
Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/move_extent.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 0191a3c746db..2df6072b26c0 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -165,7 +165,7 @@ mext_folio_double_lock(struct inode *inode1, struct ino= de *inode2, } =20 /* Force folio buffers uptodate w/o dropping folio's lock */ -static int mext_page_mkuptodate(struct folio *folio, size_t from, size_t t= o) +static int mext_folio_mkuptodate(struct folio *folio, size_t from, size_t = to) { struct inode *inode =3D folio->mapping->host; sector_t block; @@ -358,7 +358,7 @@ move_extent_per_page(struct file *o_filp, struct inode = *donor_inode, data_copy: from =3D offset_in_folio(folio[0], orig_blk_offset << orig_inode->i_blkbits); - *err =3D mext_page_mkuptodate(folio[0], from, from + replaced_size); + *err =3D mext_folio_mkuptodate(folio[0], from, from + replaced_size); if (*err) goto unlock_folios; =20 --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id BB3992F547C; Mon, 13 Oct 2025 01:52:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320374; cv=none; b=qy0o6MaTucjztsCXCzMlKmEJJdfDaIxmtH8SzOG0rfDgpDkhO+92M2NB0+5OQAlUqLSv8I37hcL5rWH5hLWfpaQtKVG8DWgK+sTMFCcso7mkJFLtsoNdUbuwFVl1LE3SHUb9JFRwChn/qSmqYPAJzPPf7CLdg5C3wAbpRp9TZGw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320374; c=relaxed/simple; bh=6gqCwprDnzKylPfT+PySJfgXKnZks5bNSc2PuXv0mEE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=k48t7RGZ3/7mIcmPj3lTYxCZ87a4cJG8oOaMiohRF8I1rlMjnpUPnR0aC8V3KdbfLv3b+SC6UIE5p8jBBaOsGR1GJvrQbpLSzjLIV6doMd6D6lHS4q4tI7qXBs9kM2DmcyIYE9ACOG6nKh+xpJ3uICPzYLbGKLO6pu5Odp1WBCs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4clL1s3JNKzYQtnf; Mon, 13 Oct 2025 09:52:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id 12C961A0F70; Mon, 13 Oct 2025 09:52:40 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S13; Mon, 13 Oct 2025 09:52:39 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 09/12] ext4: introduce mext_move_extent() Date: Mon, 13 Oct 2025 09:51:25 +0800 Message-ID: <20251013015128.499308-10-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: Syh0CgCn_UVfW+xoNhu7AA--.53067S13 X-Coremail-Antispam: 1UD129KBjvJXoW3tFyDCF1fJrW3Wr18uw1rJFb_yoWDCryfpF 
W2krn8JrWDG3yI9r4Iyw48Zr1fKayxGr47AayfW343ZFyUtry0gas5t3WjvFyrKrWxJFyF qF4Fyry7Way7AaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUXVWUAwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUljgxUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi When moving extents, the current move_extent_per_page() process can only move PAGE_SIZE worth of an extent at a time, which is highly inefficient. Especially when the file is not severely fragmented, this results in a large number of unnecessary extent split and merge operations. Moreover, since the ext4 file system now supports large folios, using PAGE_SIZE as the processing unit is no longer practical. Therefore, introduce a new move extents method, mext_move_extent(). It moves one extent of the origin inode at a time, but never more than the size of a folio.
The parameters for the move are passed through the new mext_data structure, which includes the origin inode, the donor inode, the mapping of the origin inode extent to be moved, and the starting logical block of the donor inode.

The move process is similar to move_extent_per_page() and falls into one of three types: MEXT_SKIP_EXTENT, MEXT_MOVE_EXTENT, and MEXT_COPY_DATA. MEXT_SKIP_EXTENT indicates that the corresponding area of the donor file is a hole (or a delalloc extent), meaning no actual space is allocated, so the move is skipped. MEXT_MOVE_EXTENT indicates that the corresponding areas of both the origin and donor files are unwritten, so no data needs to be copied; only the extents are swapped. MEXT_COPY_DATA covers the remaining cases, in which at least one of the two areas is written, so data must be copied.

The data copying is performed in three steps: first, the data from the original location is read into the page cache; then, the extents are swapped and the page cache buffers are rebuilt to point at the new physical blocks; finally, the page cache is marked dirty and written back, so that the data reaches disk before the metadata is persisted.

One important point to note is that the folio lock and i_data_sem are held only during the moving process. Therefore, before moving an extent, it is necessary to check, while holding the folio lock, whether the sequence cookie of the area to be moved has changed. If a change is detected, concurrent write-back may have occurred in the meantime, and the type of the extent to be moved can no longer be considered reliable. For example, it may have changed from unwritten to written. In such cases, mext_move_extent() returns -ESTALE, and the calling function should look up the mapping of the origin file again and retry the move.
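
As an illustration only (not part of this patch), the re-check and the three-way classification described above can be modelled in plain userspace C. The names toy_map, toy_move_begin() and the MAP_* flags are hypothetical stand-ins for the ext4 mapping state, EXT4_MAP_MAPPED and EXT4_MAP_UNWRITTEN; the order of the checks follows mext_move_begin() in the diff below.

```c
#include <errno.h>

/* Hypothetical stand-ins for EXT4_MAP_MAPPED / EXT4_MAP_UNWRITTEN. */
#define MAP_MAPPED	(1u << 0)
#define MAP_UNWRITTEN	(1u << 1)

enum move_type { SKIP_EXTENT, MOVE_EXTENT, COPY_DATA };

/* Toy mapping: flags plus the sequence cookie sampled at lookup time. */
struct toy_map {
	unsigned int flags;
	unsigned int seq;
};

/*
 * Model of the re-check done under the folio lock. If the cookie has
 * moved since the lookup, the cached flags are stale and the caller
 * must look the mapping up again. Otherwise pick the move type.
 */
static int toy_move_begin(const struct toy_map *orig, unsigned int cur_seq,
			  unsigned int donor_flags, enum move_type *type)
{
	if (orig->seq != cur_seq)
		return -ESTALE;

	/* Donor range has no allocated blocks: nothing to swap in. */
	if (!(donor_flags & (MAP_MAPPED | MAP_UNWRITTEN)))
		*type = SKIP_EXTENT;
	/* Both ranges unwritten: swap extents, no data to copy. */
	else if ((orig->flags & MAP_UNWRITTEN) &&
		 (donor_flags & MAP_UNWRITTEN))
		*type = MOVE_EXTENT;
	/* Everything else: data has to be carried over. */
	else
		*type = COPY_DATA;

	return 0;
}
```

A caller that gets -ESTALE from this model would sample the mapping and its cookie again and retry, which mirrors what the commit message asks of the calling function.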
Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/move_extent.c | 224 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 224 insertions(+) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 2df6072b26c0..92a716c56740 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -13,6 +13,13 @@ #include "ext4.h" #include "ext4_extents.h" =20 +struct mext_data { + struct inode *orig_inode; /* Origin file inode */ + struct inode *donor_inode; /* Donor file inode */ + struct ext4_map_blocks orig_map;/* Origin file's move mapping */ + ext4_lblk_t donor_lblk; /* Start block of the donor file */ +}; + /** * get_ext_path() - Find an extent path for designated logical block numbe= r. * @inode: inode to be searched @@ -164,6 +171,14 @@ mext_folio_double_lock(struct inode *inode1, struct in= ode *inode2, return 0; } =20 +static void mext_folio_double_unlock(struct folio *folio[2]) +{ + folio_unlock(folio[0]); + folio_put(folio[0]); + folio_unlock(folio[1]); + folio_put(folio[1]); +} + /* Force folio buffers uptodate w/o dropping folio's lock */ static int mext_folio_mkuptodate(struct folio *folio, size_t from, size_t = to) { @@ -238,6 +253,215 @@ static int mext_folio_mkuptodate(struct folio *folio,= size_t from, size_t to) return 0; } =20 +enum mext_move_type {MEXT_SKIP_EXTENT, MEXT_MOVE_EXTENT, MEXT_COPY_DATA}; + +/* + * Start to move extent between the origin inode and the donor inode, + * hold one folio for each inode and check the candidate moving extent + * mapping status again. 
+ */ +static int mext_move_begin(struct mext_data *mext, struct folio *folio[2], + enum mext_move_type *move_type) +{ + struct inode *orig_inode =3D mext->orig_inode; + struct inode *donor_inode =3D mext->donor_inode; + unsigned int blkbits =3D orig_inode->i_blkbits; + struct ext4_map_blocks donor_map =3D {0}; + loff_t orig_pos, donor_pos; + size_t move_len; + int ret; + + orig_pos =3D ((loff_t)mext->orig_map.m_lblk) << blkbits; + donor_pos =3D ((loff_t)mext->donor_lblk) << blkbits; + ret =3D mext_folio_double_lock(orig_inode, donor_inode, + orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT, folio); + if (ret) + return ret; + + /* + * Check the origin inode's mapping information again under the + * folio lock, as we do not hold the i_data_sem at all times, and + * it may change during the concurrent write-back operation. + */ + if (mext->orig_map.m_seq !=3D READ_ONCE(EXT4_I(orig_inode)->i_es_seq)) { + ret =3D -ESTALE; + goto error; + } + + /* Adjust the moving length according to the length of shorter folio. */ + move_len =3D umin(folio_pos(folio[0]) + folio_size(folio[0]) - orig_pos, + folio_pos(folio[1]) + folio_size(folio[1]) - donor_pos); + move_len >>=3D blkbits; + if (move_len < mext->orig_map.m_len) + mext->orig_map.m_len =3D move_len; + + donor_map.m_lblk =3D mext->donor_lblk; + donor_map.m_len =3D mext->orig_map.m_len; + donor_map.m_flags =3D 0; + ret =3D ext4_map_blocks(NULL, donor_inode, &donor_map, 0); + if (ret < 0) + goto error; + + /* Adjust the moving length according to the donor mapping length. */ + mext->orig_map.m_len =3D donor_map.m_len; + + /* Skip moving if the donor range is a hole or a delalloc extent. */ + if (!(donor_map.m_flags & (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN))) + *move_type =3D MEXT_SKIP_EXTENT; + /* If both mapping ranges are unwritten, no need to copy data. 
*/ + else if ((mext->orig_map.m_flags & EXT4_MAP_UNWRITTEN) && + (donor_map.m_flags & EXT4_MAP_UNWRITTEN)) + *move_type =3D MEXT_MOVE_EXTENT; + else + *move_type =3D MEXT_COPY_DATA; + + return 0; +error: + mext_folio_double_unlock(folio); + return ret; +} + +/* + * Re-create the new moved mapping buffers of the original inode and commit + * the entire written range. + */ +static int mext_folio_mkwrite(struct inode *inode, struct folio *folio, + size_t from, size_t to) +{ + unsigned int blocksize =3D i_blocksize(inode); + struct buffer_head *bh, *head; + size_t block_start, block_end; + sector_t block; + int ret; + + head =3D folio_buffers(folio); + if (!head) + head =3D create_empty_buffers(folio, blocksize, 0); + + block =3D folio_pos(folio) >> inode->i_blkbits; + block_end =3D 0; + bh =3D head; + do { + block_start =3D block_end; + block_end =3D block_start + blocksize; + if (block_end <=3D from || block_start >=3D to) + continue; + + ret =3D ext4_get_block(inode, block, bh, 0); + if (ret) + return ret; + } while (block++, (bh =3D bh->b_this_page) !=3D head); + + block_commit_write(folio, from, to); + return 0; +} + +/* + * Save the data in original inode extent blocks and replace one folio size + * aligned original inode extent with one or one partial donor inode exten= t, + * and then write out the saved data in new original inode blocks. Pass out + * the replaced block count through m_len. Return 0 on success, and an err= or + * code otherwise. 
+ */ +static __used int mext_move_extent(struct mext_data *mext, u64 *m_len) +{ + struct inode *orig_inode =3D mext->orig_inode; + struct inode *donor_inode =3D mext->donor_inode; + struct ext4_map_blocks *orig_map =3D &mext->orig_map; + unsigned int blkbits =3D orig_inode->i_blkbits; + struct folio *folio[2] =3D {NULL, NULL}; + loff_t from, length; + enum mext_move_type move_type =3D 0; + handle_t *handle; + u64 r_len =3D 0; + unsigned int credits; + int ret, ret2; + + *m_len =3D 0; + credits =3D ext4_chunk_trans_extent(orig_inode, 0) * 2; + handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, credits); + if (IS_ERR(handle)) + return PTR_ERR(handle); + + ret =3D mext_move_begin(mext, folio, &move_type); + if (ret) + goto stop_handle; + + if (move_type =3D=3D MEXT_SKIP_EXTENT) + goto unlock; + + /* + * Copy the data. First, read the original inode data into the page + * cache. Then, release the existing mapping relationships and swap + * the extent. Finally, re-establish the new mapping relationships + * and dirty the page cache. + */ + if (move_type =3D=3D MEXT_COPY_DATA) { + from =3D offset_in_folio(folio[0], + ((loff_t)orig_map->m_lblk) << blkbits); + length =3D ((loff_t)orig_map->m_len) << blkbits; + + ret =3D mext_folio_mkuptodate(folio[0], from, from + length); + if (ret) + goto unlock; + } + + if (!filemap_release_folio(folio[0], 0) || + !filemap_release_folio(folio[1], 0)) { + ret =3D -EBUSY; + goto unlock; + } + + /* Move extent */ + ext4_double_down_write_data_sem(orig_inode, donor_inode); + *m_len =3D ext4_swap_extents(handle, orig_inode, donor_inode, + orig_map->m_lblk, mext->donor_lblk, + orig_map->m_len, 1, &ret); + ext4_double_up_write_data_sem(orig_inode, donor_inode); + + /* A short-length swap cannot occur after a successful swap extent. 
*/ + if (WARN_ON_ONCE(!ret && (*m_len !=3D orig_map->m_len))) + ret =3D -EIO; + + if (!(*m_len) || (move_type =3D=3D MEXT_MOVE_EXTENT)) + goto unlock; + + /* Copy data */ + length =3D (*m_len) << blkbits; + ret2 =3D mext_folio_mkwrite(orig_inode, folio[0], from, from + length); + if (ret2) { + if (!ret) + ret =3D ret2; + goto repair_branches; + } + /* + * Even in case of data=3Dwriteback it is reasonable to pin + * inode to transaction, to prevent unexpected data loss. + */ + ret2 =3D ext4_jbd2_inode_add_write(handle, orig_inode, + ((loff_t)orig_map->m_lblk) << blkbits, length); + if (!ret) + ret =3D ret2; +unlock: + mext_folio_double_unlock(folio); +stop_handle: + ext4_journal_stop(handle); + return ret; + +repair_branches: + ret2 =3D 0; + r_len =3D ext4_swap_extents(handle, donor_inode, orig_inode, + mext->donor_lblk, orig_map->m_lblk, + *m_len, 0, &ret2); + if (ret2 || r_len !=3D *m_len) { + ext4_error_inode_block(orig_inode, (sector_t)(orig_map->m_lblk), + EIO, "Unable to copy data block, data will be lost!"); + ret =3D -EIO; + } + *m_len =3D 0; + goto unlock; +} + /** * move_extent_per_page - Move extent data per page * --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3B0F3271446; Mon, 13 Oct 2025 01:52:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320364; cv=none; b=gHAhZ7opaWcDnCf7S/ue9TrXvnvJQNqjqm1XMBLyxac1AjS1SaI0tqto3gxSvFlsfrTlZ2xenG0ewObqNM1xabckzLykTkjGHCaJF/uj4y07meEta9oGieTNl9wIb7JthEqHjanIP+VccWFq+gOtbVVMOIp+6Rp28VtrsIOY8Nc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320364; c=relaxed/simple; 
bh=CJkdOIkeM/1vzG3VyqnKLvaTIBy9aJpVScDorIkDe3g=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Zd/nZOjLQj6IdZ35VX7GWyBVhxS6EeM62MeL34+xo/wkB/sbElqtRklPMg9spShk2/rv5rxUGBp0FuRabzRD+AeH1v3lvMNfFvPYeOlQucVY0TGP4CA+P5G0lINhF/DtPxv5oDPw790mIIXYzHTE4u2P/d2eI6KeR8NSEspI18g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4clL1y0N5SzKHMM3; Mon, 13 Oct 2025 09:52:06 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.75]) by mail.maildlp.com (Postfix) with ESMTP id 25E151A1344; Mon, 13 Oct 2025 09:52:40 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP2 (Coremail) with SMTP id Syh0CgCn_UVfW+xoNhu7AA--.53067S14; Mon, 13 Oct 2025 09:52:40 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 10/12] ext4: switch to using the new extent movement method Date: Mon, 13 Oct 2025 09:51:26 +0800 Message-ID: <20251013015128.499308-11-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com> References: <20251013015128.499308-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: 
Syh0CgCn_UVfW+xoNhu7AA--.53067S14 X-Coremail-Antispam: 1UD129KBjvAXoWfGw1UJry7Wr4UtFW5XF4DCFg_yoW8WrW5Wo WfCF4jqwn5Wr9Ig3ykKw10yFyUXan7Jw4rJrWfursrWFy3X3W5C39xG3Z7Ja43Xa1rKr15 Xa4xJ3WYyrZ7trn5n29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUOV7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l 84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV WxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE 3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2I x0cI8IcVAFwI0_Jrv_JF1lYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8 JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2Y2 ka0xkIwI1lc7CjxVAaw2AFwI0_Jw0_GFyl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Y z7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zV AF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Gr0_Xr1l IxAIcVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r 1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIY CTnIWIevJa73UjIFyTuYvjfUo73vUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Now that we have mext_move_extent(), we can switch to this new interface and remove move_extent_per_page(). First, after acquiring the i_rwsem, we can directly use ext4_map_blocks() to obtain a contiguous extent from the original inode as the extent to be moved. It is safe to take the mapping information from the extent status tree without accessing the on-disk extent tree, because mext_move_extent() rechecks the sequence cookie under the folio lock. Then, after populating the mext_data structure, we call mext_move_extent() to move the extent.
Finally, the length of the extent will be adjusted in mext.orig_map.m_len and the actual length moved is returned through m_len. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/move_extent.c | 395 ++++++------------------------------------ 1 file changed, 51 insertions(+), 344 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 92a716c56740..933c2afed550 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -20,29 +20,6 @@ struct mext_data { ext4_lblk_t donor_lblk; /* Start block of the donor file */ }; =20 -/** - * get_ext_path() - Find an extent path for designated logical block numbe= r. - * @inode: inode to be searched - * @lblock: logical block number to find an extent path - * @path: pointer to an extent path - * - * ext4_find_extent wrapper. Return an extent path pointer on success, - * or an error pointer on failure. - */ -static inline struct ext4_ext_path * -get_ext_path(struct inode *inode, ext4_lblk_t lblock, - struct ext4_ext_path *path) -{ - path =3D ext4_find_extent(inode, lblock, path, EXT4_EX_NOCACHE); - if (IS_ERR(path)) - return path; - if (path[ext_depth(inode)].p_ext =3D=3D NULL) { - ext4_free_ext_path(path); - return ERR_PTR(-ENODATA); - } - return path; -} - /** * ext4_double_down_write_data_sem() - write lock two inodes's i_data_sem * @first: inode to be locked @@ -59,7 +36,6 @@ ext4_double_down_write_data_sem(struct inode *first, stru= ct inode *second) } else { down_write(&EXT4_I(second)->i_data_sem); down_write_nested(&EXT4_I(first)->i_data_sem, I_DATA_SEM_OTHER); - } } =20 @@ -78,42 +54,6 @@ ext4_double_up_write_data_sem(struct inode *orig_inode, up_write(&EXT4_I(donor_inode)->i_data_sem); } =20 -/** - * mext_check_coverage - Check that all extents in range has the same type - * - * @inode: inode in question - * @from: block offset of inode - * @count: block count to be checked - * @unwritten: extents expected to be unwritten - * @err: pointer to save error value - * - * Return 1 if all 
extents in range has expected type, and zero otherwise. - */ -static int -mext_check_coverage(struct inode *inode, ext4_lblk_t from, ext4_lblk_t cou= nt, - int unwritten, int *err) -{ - struct ext4_ext_path *path =3D NULL; - struct ext4_extent *ext; - int ret =3D 0; - ext4_lblk_t last =3D from + count; - while (from < last) { - path =3D get_ext_path(inode, from, path); - if (IS_ERR(path)) { - *err =3D PTR_ERR(path); - return ret; - } - ext =3D path[ext_depth(inode)].p_ext; - if (unwritten !=3D ext4_ext_is_unwritten(ext)) - goto out; - from +=3D ext4_ext_get_actual_len(ext); - } - ret =3D 1; -out: - ext4_free_ext_path(path); - return ret; -} - /** * mext_folio_double_lock - Grab and lock folio on both @inode1 and @inode2 * @@ -363,7 +303,7 @@ static int mext_folio_mkwrite(struct inode *inode, stru= ct folio *folio, * the replaced block count through m_len. Return 0 on success, and an err= or * code otherwise. */ -static __used int mext_move_extent(struct mext_data *mext, u64 *m_len) +static int mext_move_extent(struct mext_data *mext, u64 *m_len) { struct inode *orig_inode =3D mext->orig_inode; struct inode *donor_inode =3D mext->donor_inode; @@ -462,210 +402,6 @@ static __used int mext_move_extent(struct mext_data *= mext, u64 *m_len) goto unlock; } =20 -/** - * move_extent_per_page - Move extent data per page - * - * @o_filp: file structure of original file - * @donor_inode: donor inode - * @orig_page_offset: page index on original file - * @donor_page_offset: page index on donor file - * @data_offset_in_page: block index where data swapping starts - * @block_len_in_page: the number of blocks to be swapped - * @unwritten: orig extent is unwritten or not - * @err: pointer to save return value - * - * Save the data in original inode blocks and replace original inode exten= ts - * with donor inode extents by calling ext4_swap_extents(). - * Finally, write out the saved data in new original inode blocks. Return - * replaced block count. 
- */ -static int -move_extent_per_page(struct file *o_filp, struct inode *donor_inode, - pgoff_t orig_page_offset, pgoff_t donor_page_offset, - int data_offset_in_page, - int block_len_in_page, int unwritten, int *err) -{ - struct inode *orig_inode =3D file_inode(o_filp); - struct folio *folio[2] =3D {NULL, NULL}; - handle_t *handle; - ext4_lblk_t orig_blk_offset, donor_blk_offset; - unsigned long blocksize =3D orig_inode->i_sb->s_blocksize; - unsigned int tmp_data_size, data_size, replaced_size; - int i, err2, jblocks, retries =3D 0; - int replaced_count =3D 0; - int from; - int blocks_per_page =3D PAGE_SIZE >> orig_inode->i_blkbits; - struct super_block *sb =3D orig_inode->i_sb; - struct buffer_head *bh =3D NULL; - - /* - * It needs twice the amount of ordinary journal buffers because - * inode and donor_inode may change each different metadata blocks. - */ -again: - *err =3D 0; - jblocks =3D ext4_meta_trans_blocks(orig_inode, block_len_in_page, - block_len_in_page) * 2; - handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, jblocks); - if (IS_ERR(handle)) { - *err =3D PTR_ERR(handle); - return 0; - } - - orig_blk_offset =3D orig_page_offset * blocks_per_page + - data_offset_in_page; - - donor_blk_offset =3D donor_page_offset * blocks_per_page + - data_offset_in_page; - - /* Calculate data_size */ - if ((orig_blk_offset + block_len_in_page - 1) =3D=3D - ((orig_inode->i_size - 1) >> orig_inode->i_blkbits)) { - /* Replace the last block */ - tmp_data_size =3D orig_inode->i_size & (blocksize - 1); - /* - * If data_size equal zero, it shows data_size is multiples of - * blocksize. So we set appropriate value. 
- */ - if (tmp_data_size =3D=3D 0) - tmp_data_size =3D blocksize; - - data_size =3D tmp_data_size + - ((block_len_in_page - 1) << orig_inode->i_blkbits); - } else - data_size =3D block_len_in_page << orig_inode->i_blkbits; - - replaced_size =3D data_size; - - *err =3D mext_folio_double_lock(orig_inode, donor_inode, orig_page_offset, - donor_page_offset, folio); - if (unlikely(*err < 0)) - goto stop_journal; - /* - * If orig extent was unwritten it can become initialized - * at any time after i_data_sem was dropped, in order to - * serialize with delalloc we have recheck extent while we - * hold page's lock, if it is still the case data copy is not - * necessary, just swap data blocks between orig and donor. - */ - if (unwritten) { - ext4_double_down_write_data_sem(orig_inode, donor_inode); - /* If any of extents in range became initialized we have to - * fallback to data copying */ - unwritten =3D mext_check_coverage(orig_inode, orig_blk_offset, - block_len_in_page, 1, err); - if (*err) - goto drop_data_sem; - - unwritten &=3D mext_check_coverage(donor_inode, donor_blk_offset, - block_len_in_page, 1, err); - if (*err) - goto drop_data_sem; - - if (!unwritten) { - ext4_double_up_write_data_sem(orig_inode, donor_inode); - goto data_copy; - } - if (!filemap_release_folio(folio[0], 0) || - !filemap_release_folio(folio[1], 0)) { - *err =3D -EBUSY; - goto drop_data_sem; - } - replaced_count =3D ext4_swap_extents(handle, orig_inode, - donor_inode, orig_blk_offset, - donor_blk_offset, - block_len_in_page, 1, err); - drop_data_sem: - ext4_double_up_write_data_sem(orig_inode, donor_inode); - goto unlock_folios; - } -data_copy: - from =3D offset_in_folio(folio[0], - orig_blk_offset << orig_inode->i_blkbits); - *err =3D mext_folio_mkuptodate(folio[0], from, from + replaced_size); - if (*err) - goto unlock_folios; - - /* At this point all buffers in range are uptodate, old mapping layout - * is no longer required, try to drop it now. 
*/ - if (!filemap_release_folio(folio[0], 0) || - !filemap_release_folio(folio[1], 0)) { - *err =3D -EBUSY; - goto unlock_folios; - } - ext4_double_down_write_data_sem(orig_inode, donor_inode); - replaced_count =3D ext4_swap_extents(handle, orig_inode, donor_inode, - orig_blk_offset, donor_blk_offset, - block_len_in_page, 1, err); - ext4_double_up_write_data_sem(orig_inode, donor_inode); - if (*err) { - if (replaced_count) { - block_len_in_page =3D replaced_count; - replaced_size =3D - block_len_in_page << orig_inode->i_blkbits; - } else - goto unlock_folios; - } - /* Perform all necessary steps similar write_begin()/write_end() - * but keeping in mind that i_size will not change */ - bh =3D folio_buffers(folio[0]); - if (!bh) - bh =3D create_empty_buffers(folio[0], - 1 << orig_inode->i_blkbits, 0); - for (i =3D 0; i < from >> orig_inode->i_blkbits; i++) - bh =3D bh->b_this_page; - for (i =3D 0; i < block_len_in_page; i++) { - *err =3D ext4_get_block(orig_inode, orig_blk_offset + i, bh, 0); - if (*err < 0) - goto repair_branches; - bh =3D bh->b_this_page; - } - - block_commit_write(folio[0], from, from + replaced_size); - - /* Even in case of data=3Dwriteback it is reasonable to pin - * inode to transaction, to prevent unexpected data loss */ - *err =3D ext4_jbd2_inode_add_write(handle, orig_inode, - (loff_t)orig_page_offset << PAGE_SHIFT, replaced_size); - -unlock_folios: - folio_unlock(folio[0]); - folio_put(folio[0]); - folio_unlock(folio[1]); - folio_put(folio[1]); -stop_journal: - ext4_journal_stop(handle); - if (*err =3D=3D -ENOSPC && - ext4_should_retry_alloc(sb, &retries)) - goto again; - /* Buffer was busy because probably is pinned to journal transaction, - * force transaction commit may help to free it. */ - if (*err =3D=3D -EBUSY && retries++ < 4 && EXT4_SB(sb)->s_journal && - jbd2_journal_force_commit_nested(EXT4_SB(sb)->s_journal)) - goto again; - return replaced_count; - -repair_branches: - /* - * This should never ever happen! 
- * Extents are swapped already, but we are not able to copy data. - * Try to swap extents to it's original places - */ - ext4_double_down_write_data_sem(orig_inode, donor_inode); - replaced_count =3D ext4_swap_extents(handle, donor_inode, orig_inode, - orig_blk_offset, donor_blk_offset, - block_len_in_page, 0, &err2); - ext4_double_up_write_data_sem(orig_inode, donor_inode); - if (replaced_count !=3D block_len_in_page) { - ext4_error_inode_block(orig_inode, (sector_t)(orig_blk_offset), - EIO, "Unable to copy data block," - " data will be lost."); - *err =3D -EIO; - } - replaced_count =3D 0; - goto unlock_folios; -} - /* * Check the validity of the basic filesystem environment and the * inodes' support status. @@ -827,106 +563,81 @@ static int mext_check_adjust_range(struct inode *ori= g_inode, * * This function returns 0 and moved block length is set in moved_len * if succeed, otherwise returns error value. - * */ -int -ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, - __u64 donor_blk, __u64 len, __u64 *moved_len) +int ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig= _blk, + __u64 donor_blk, __u64 len, __u64 *moved_len) { struct inode *orig_inode =3D file_inode(o_filp); struct inode *donor_inode =3D file_inode(d_filp); - struct ext4_ext_path *path =3D NULL; - int blocks_per_page =3D PAGE_SIZE >> orig_inode->i_blkbits; - ext4_lblk_t o_end, o_start =3D orig_blk; - ext4_lblk_t d_start =3D donor_blk; + struct mext_data mext; + struct super_block *sb =3D orig_inode->i_sb; + struct ext4_sb_info *sbi =3D EXT4_SB(sb); + int retries =3D 0; + u64 m_len; int ret; =20 + *moved_len =3D 0; + /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); =20 ret =3D mext_check_validity(orig_inode, donor_inode); if (ret) - goto unlock; + goto out; =20 /* Wait for all existing dio workers */ inode_dio_wait(orig_inode); inode_dio_wait(donor_inode); =20 - /* Protect extent tree against block 
allocations via delalloc */ - ext4_double_down_write_data_sem(orig_inode, donor_inode); /* Check and adjust the specified move_extent range. */ ret =3D mext_check_adjust_range(orig_inode, donor_inode, orig_blk, donor_blk, &len); if (ret) goto out; - o_end =3D o_start + len; =20 - *moved_len =3D 0; - while (o_start < o_end) { - struct ext4_extent *ex; - ext4_lblk_t cur_blk, next_blk; - pgoff_t orig_page_index, donor_page_index; - int offset_in_page; - int unwritten, cur_len; - - path =3D get_ext_path(orig_inode, o_start, path); - if (IS_ERR(path)) { - ret =3D PTR_ERR(path); + mext.orig_inode =3D orig_inode; + mext.donor_inode =3D donor_inode; + while (len) { + mext.orig_map.m_lblk =3D orig_blk; + mext.orig_map.m_len =3D len; + mext.orig_map.m_flags =3D 0; + mext.donor_lblk =3D donor_blk; + + ret =3D ext4_map_blocks(NULL, orig_inode, &mext.orig_map, 0); + if (ret < 0) goto out; - } - ex =3D path[path->p_depth].p_ext; - cur_blk =3D le32_to_cpu(ex->ee_block); - cur_len =3D ext4_ext_get_actual_len(ex); - /* Check hole before the start pos */ - if (cur_blk + cur_len - 1 < o_start) { - next_blk =3D ext4_ext_next_allocated_block(path); - if (next_blk =3D=3D EXT_MAX_BLOCKS) { - ret =3D -ENODATA; - goto out; + + /* Skip moving if it is a hole or a delalloc extent. */ + if (mext.orig_map.m_flags & + (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN)) { + ret =3D mext_move_extent(&mext, &m_len); + *moved_len +=3D m_len; + if (!ret) + goto next; + + /* Move failed or partially failed. 
*/ + if (m_len) { + orig_blk +=3D m_len; + donor_blk +=3D m_len; + len -=3D m_len; } - d_start +=3D next_blk - o_start; - o_start =3D next_blk; - continue; - /* Check hole after the start pos */ - } else if (cur_blk > o_start) { - /* Skip hole */ - d_start +=3D cur_blk - o_start; - o_start =3D cur_blk; - /* Extent inside requested range ?*/ - if (cur_blk >=3D o_end) - goto out; - } else { /* in_range(o_start, o_blk, o_len) */ - cur_len +=3D cur_blk - o_start; + if (ret =3D=3D -ESTALE) + continue; + if (ret =3D=3D -ENOSPC && + ext4_should_retry_alloc(sb, &retries)) + continue; + if (ret =3D=3D -EBUSY && + sbi->s_journal && retries++ < 4 && + jbd2_journal_force_commit_nested(sbi->s_journal)) + continue; + + goto out; } - unwritten =3D ext4_ext_is_unwritten(ex); - if (o_end - o_start < cur_len) - cur_len =3D o_end - o_start; - - orig_page_index =3D o_start >> (PAGE_SHIFT - - orig_inode->i_blkbits); - donor_page_index =3D d_start >> (PAGE_SHIFT - - donor_inode->i_blkbits); - offset_in_page =3D o_start % blocks_per_page; - if (cur_len > blocks_per_page - offset_in_page) - cur_len =3D blocks_per_page - offset_in_page; - /* - * Up semaphore to avoid following problems: - * a. transaction deadlock among ext4_journal_start, - * ->write_begin via pagefault, and jbd2_journal_commit - * b. 
racing with ->read_folio, ->write_begin, and - * ext4_get_block in move_extent_per_page - */ - ext4_double_up_write_data_sem(orig_inode, donor_inode); - /* Swap original branches with new branches */ - *moved_len +=3D move_extent_per_page(o_filp, donor_inode, - orig_page_index, donor_page_index, - offset_in_page, cur_len, - unwritten, &ret); - ext4_double_down_write_data_sem(orig_inode, donor_inode); - if (ret < 0) - break; - o_start +=3D cur_len; - d_start +=3D cur_len; +next: + orig_blk +=3D mext.orig_map.m_len; + donor_blk +=3D mext.orig_map.m_len; + len -=3D mext.orig_map.m_len; + retries =3D 0; } =20 out: @@ -935,10 +646,6 @@ ext4_move_extents(struct file *o_filp, struct file *d_= filp, __u64 orig_blk, ext4_discard_preallocations(donor_inode); } =20 - ext4_free_ext_path(path); - ext4_double_up_write_data_sem(orig_inode, donor_inode); -unlock: unlock_two_nondirectories(orig_inode, donor_inode); - return ret; } --=20 2.46.1 From nobody Wed Dec 17 18:14:13 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 934792F60D5; Mon, 13 Oct 2025 01:52:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320376; cv=none; b=PlMA0UqucnrvnaIlfdo+99b6C/QutdAKBRbbjOpEx6t7sJRlOWVHVhwtegD3Go3auH3X+VvWo3klIxIfoTsXUhbxucvB9UzTP20Z6Ao1IJLUD1DwP0jZoQCJVpKS4m2uPbaCBW7ZeXfGAJJienQ8Ae2MuUjpAQsOtUShI3oGNHw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1760320376; c=relaxed/simple; bh=uEV0f0q6Dssyvh6MuMOd07vJYkH9J9MKsoWDlGOziSc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 11/12] ext4: add large folios support for moving extents
Date: Mon, 13 Oct 2025 09:51:27 +0800
Message-ID: <20251013015128.499308-12-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
References: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

Pass the moving extent length into mext_folio_double_lock() so that it
can acquire a higher-order folio if the length exceeds PAGE_SIZE. This
can speed up extent moving when the extent is larger than one page.
Additionally, remove the unnecessary comments from
mext_folio_double_lock().
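
For readers following along, here is a small userspace sketch (not part of
this patch) of how the byte length handed to mext_folio_double_lock() is
expected to turn into a folio order hint. It assumes 4 KiB pages and models
fgf_set_order() as floor(log2(len)) - PAGE_SHIFT, clamped at zero, which is
my reading of that helper; how the order is packed into the flags word is
left out. All model_* names and MODEL_PAGE_SHIFT are made up for
illustration.

```c
#include <stddef.h>

#define MODEL_PAGE_SHIFT 12	/* assumption: 4 KiB pages */

/* floor(log2(size)) for size > 0; userspace stand-in for the kernel's ilog2(). */
static unsigned int model_ilog2(size_t size)
{
	unsigned int shift = 0;

	while ((size >>= 1) != 0)
		shift++;
	return shift;
}

/*
 * Folio order hinted for a move of @len bytes: 0 for anything that fits
 * in one page, otherwise floor(log2(len)) - MODEL_PAGE_SHIFT.
 */
static unsigned int model_folio_order(size_t len)
{
	unsigned int shift = model_ilog2(len);

	if (shift <= MODEL_PAGE_SHIFT)
		return 0;
	return shift - MODEL_PAGE_SHIFT;
}

/* Byte length of @m_len blocks, mirroring the cast-then-shift used in the patch. */
static size_t model_move_bytes(unsigned int m_len, unsigned int blkbits)
{
	return ((size_t)m_len) << blkbits;
}
```

With 4 KiB blocks, a 16-block extent gives model_move_bytes(16, 12) == 65536
and an order hint of 4, while a single block stays at order 0. The order is
only a hint in this model; the page cache may still return a smaller folio.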

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/move_extent.c | 27 ++++++++++-----------------
 1 file changed, 10 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 933c2afed550..f04755c2165a 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -54,23 +54,14 @@ ext4_double_up_write_data_sem(struct inode *orig_inode,
 	up_write(&EXT4_I(donor_inode)->i_data_sem);
 }
 
-/**
- * mext_folio_double_lock - Grab and lock folio on both @inode1 and @inode2
- *
- * @inode1: the inode structure
- * @inode2: the inode structure
- * @index1: folio index
- * @index2: folio index
- * @folio: result folio vector
- *
- * Grab two locked folio for inode's by inode order
- */
-static int
-mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
-		      pgoff_t index1, pgoff_t index2, struct folio *folio[2])
+/* Grab and lock folio on both @inode1 and @inode2 by inode order. */
+static int mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
+				  pgoff_t index1, pgoff_t index2, size_t len,
+				  struct folio *folio[2])
 {
 	struct address_space *mapping[2];
 	unsigned int flags;
+	fgf_t fgp_flags = FGP_WRITEBEGIN;
 
 	BUG_ON(!inode1 || !inode2);
 	if (inode1 < inode2) {
@@ -83,14 +74,15 @@ mext_folio_double_lock(struct inode *inode1, struct inode *inode2,
 	}
 
 	flags = memalloc_nofs_save();
-	folio[0] = __filemap_get_folio(mapping[0], index1, FGP_WRITEBEGIN,
+	fgp_flags |= fgf_set_order(len);
+	folio[0] = __filemap_get_folio(mapping[0], index1, fgp_flags,
 			mapping_gfp_mask(mapping[0]));
 	if (IS_ERR(folio[0])) {
 		memalloc_nofs_restore(flags);
 		return PTR_ERR(folio[0]);
 	}
 
-	folio[1] = __filemap_get_folio(mapping[1], index2, FGP_WRITEBEGIN,
+	folio[1] = __filemap_get_folio(mapping[1], index2, fgp_flags,
 			mapping_gfp_mask(mapping[1]));
 	memalloc_nofs_restore(flags);
 	if (IS_ERR(folio[1])) {
@@ -214,7 +206,8 @@ static int mext_move_begin(struct mext_data *mext, struct folio *folio[2],
 	orig_pos =
((loff_t)mext->orig_map.m_lblk) << blkbits;
 	donor_pos = ((loff_t)mext->donor_lblk) << blkbits;
 	ret = mext_folio_double_lock(orig_inode, donor_inode,
-			orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT, folio);
+			orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT,
+			((size_t)mext->orig_map.m_len) << blkbits, folio);
 	if (ret)
 		return ret;
 
-- 
2.46.1

From nobody Wed Dec 17 18:14:13 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 12/12] ext4: add two trace points for moving extents
Date: Mon, 13 Oct 2025 09:51:28 +0800
Message-ID: <20251013015128.499308-13-yi.zhang@huaweicloud.com>
X-Mailer: git-send-email 2.46.1
In-Reply-To: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
References: <20251013015128.499308-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

To facilitate tracking the length, type, and outcome of the move extent,
add a trace point at both the entry and exit of mext_move_extent().

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/move_extent.c       | 14 ++++++-
 include/trace/events/ext4.h | 74 +++++++++++++++++++++++++++++++++++++
 2 files changed, 86 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index f04755c2165a..0550fd30fd10 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -13,6 +13,8 @@
 #include "ext4.h"
 #include "ext4_extents.h"
 
+#include
+
 struct mext_data {
 	struct inode *orig_inode;	/* Origin file inode */
 	struct inode *donor_inode;	/* Donor file inode */
@@ -311,10 +313,14 @@ static int mext_move_extent(struct mext_data *mext, u64 *m_len)
 	int ret, ret2;
 
 	*m_len = 0;
+	trace_ext4_move_extent_enter(orig_inode, orig_map, donor_inode,
+				     mext->donor_lblk);
 	credits = ext4_chunk_trans_extent(orig_inode, 0) * 2;
 	handle = ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, credits);
-	if (IS_ERR(handle))
-		return PTR_ERR(handle);
+	if (IS_ERR(handle)) {
+		ret = PTR_ERR(handle);
+		goto out;
+	}
 
 	ret = mext_move_begin(mext, folio, &move_type);
 	if (ret)
@@ -379,6 +385,10 @@ static int mext_move_extent(struct mext_data *mext, u64 *m_len)
 	mext_folio_double_unlock(folio);
 stop_handle:
 	ext4_journal_stop(handle);
+out:
+	trace_ext4_move_extent_exit(orig_inode, orig_map->m_lblk, donor_inode,
+				    mext->donor_lblk, orig_map->m_len, *m_len,
+
				    move_type, ret);
 	return ret;
 
 repair_branches:
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index 6a0754d38acf..a05bdd48e16e 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -3016,6 +3016,80 @@ TRACE_EVENT(ext4_update_sb,
 		  __entry->fsblk, __entry->flags)
 );
 
+TRACE_EVENT(ext4_move_extent_enter,
+	TP_PROTO(struct inode *orig_inode, struct ext4_map_blocks *orig_map,
+		 struct inode *donor_inode, ext4_lblk_t donor_lblk),
+
+	TP_ARGS(orig_inode, orig_map, donor_inode, donor_lblk),
+
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(ino_t, orig_ino)
+		__field(ext4_lblk_t, orig_lblk)
+		__field(unsigned int, orig_flags)
+		__field(ino_t, donor_ino)
+		__field(ext4_lblk_t, donor_lblk)
+		__field(unsigned int, len)
+	),
+
+	TP_fast_assign(
+		__entry->dev = orig_inode->i_sb->s_dev;
+		__entry->orig_ino = orig_inode->i_ino;
+		__entry->orig_lblk = orig_map->m_lblk;
+		__entry->orig_flags = orig_map->m_flags;
+		__entry->donor_ino = donor_inode->i_ino;
+		__entry->donor_lblk = donor_lblk;
+		__entry->len = orig_map->m_len;
+	),
+
+	TP_printk("dev %d,%d origin ino %lu lblk %u flags %s donor ino %lu lblk %u len %u",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long) __entry->orig_ino, __entry->orig_lblk,
+		  show_mflags(__entry->orig_flags),
+		  (unsigned long) __entry->donor_ino, __entry->donor_lblk,
+		  __entry->len)
+);
+
+TRACE_EVENT(ext4_move_extent_exit,
+	TP_PROTO(struct inode *orig_inode, ext4_lblk_t orig_lblk,
+		 struct inode *donor_inode, ext4_lblk_t donor_lblk,
+		 unsigned int m_len, u64 move_len, int move_type, int ret),
+
+	TP_ARGS(orig_inode, orig_lblk, donor_inode, donor_lblk, m_len,
+		move_len, move_type, ret),
+
+	TP_STRUCT__entry(
+		__field(dev_t, dev)
+		__field(ino_t, orig_ino)
+		__field(ext4_lblk_t, orig_lblk)
+		__field(ino_t, donor_ino)
+		__field(ext4_lblk_t, donor_lblk)
+		__field(unsigned int, m_len)
+		__field(u64, move_len)
+		__field(int, move_type)
+		__field(int, ret)
+	),
+
+
	TP_fast_assign(
+		__entry->dev = orig_inode->i_sb->s_dev;
+		__entry->orig_ino = orig_inode->i_ino;
+		__entry->orig_lblk = orig_lblk;
+		__entry->donor_ino = donor_inode->i_ino;
+		__entry->donor_lblk = donor_lblk;
+		__entry->m_len = m_len;
+		__entry->move_len = move_len;
+		__entry->move_type = move_type;
+		__entry->ret = ret;
+	),
+
+	TP_printk("dev %d,%d origin ino %lu lblk %u donor ino %lu lblk %u m_len %u, move_len %llu type %d ret %d",
+		  MAJOR(__entry->dev), MINOR(__entry->dev),
+		  (unsigned long) __entry->orig_ino, __entry->orig_lblk,
+		  (unsigned long) __entry->donor_ino, __entry->donor_lblk,
+		  __entry->m_len, __entry->move_len, __entry->move_type,
+		  __entry->ret)
+);
+
 #endif /* _TRACE_EXT4_H */
 
 /* This part must be outside protection */
-- 
2.46.1