From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C35EF24169A; Tue, 23 Sep 2025 01:29:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; cv=none; b=a3G7UgdMLCZdpOd3kl0+USh7N1BdtXthILEUKJ3IfT9nMLacjszv7FWmxdbZWSZv/U80XdKZiROzvTWnrcZR3qpfMoa4be4g/Ane3Vxxe5NxiNnsxoU1yR7MLAAH0n3IJCemeN0KgxzacID95gaWfJnrzHIHldac+62LoyxtfH4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; c=relaxed/simple; bh=2c9J4BVQOGD7WKgfMY5Roo5DXPkX37AymVYJDjwHoC8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UEZlGRKnyKLBIwqM3B7Axm19EiWaqq2Xh6Inq5SQ1lq7B14i63TwGiCTtrt1ZzJZXqNsOzlFe+ws8NHgaIB4LABuM7fMqnJsSyCLZ5pOF7jpX118qsyM3W5s4214Obymb5+ZN4WQl5E+LSEdW6edwDzcrM2+79C1E8rFr56nwB8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2St73wCzYQtHC; Tue, 23 Sep 2025 09:29:18 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 272031A0FA1; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S5; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: 
linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 01/13] ext4: fix an off-by-one issue during moving extents Date: Tue, 23 Sep 2025 09:27:11 +0800 Message-ID: <20250923012724.2378858-2-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S5 X-Coremail-Antispam: 1UD129KBjvJXoW7uw4rGFWktF4fJF1UWF1xZrb_yoW8Gr1xp3 4aka45KrW0qwnxCw4kW3Z7X3yUG34DKr47Wa4Fkw17CFyay3409rWUK3Wq9a45tFWDJF4r XF4Fkr15Za4UXaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVW5JVWrJwAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0JUTT5JUUUUU= X-CM-SenderInfo: 
d1lo6xhdqjqx5xdzvxpfor3voofrz/
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

During the movement of a written extent, mext_page_mkuptodate() is
called to read data in the range [from, to) into the page cache and to
update the corresponding buffers. Therefore, we should not wait on any
buffer whose start offset is >= 'to'. Otherwise, it will return -EIO and
fail the extents movement.

 $ for i in `seq 3 -1 0`; \
   do xfs_io -fs -c "pwrite -b 1024 $((i * 1024)) 1024" /mnt/foo; \
   done
 $ umount /mnt && mount /dev/pmem1s /mnt # drop cache
 $ e4defrag /mnt/foo
   e4defrag 1.47.0 (5-Feb-2023)
   ext4 defragmentation for /mnt/foo
   [1/1]/mnt/foo: 0% [ NG ]
   Success: [0/1]

Fixes: a40759fb16ae ("ext4: remove array of buffer_heads from mext_page_mkuptodate()")
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
---
 fs/ext4/move_extent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index adae3caf175a..4b091c21908f 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -225,7 +225,7 @@ static int mext_page_mkuptodate(struct folio *folio, size_t from, size_t to)
 	do {
 		if (bh_offset(bh) + blocksize <= from)
 			continue;
-		if (bh_offset(bh) > to)
+		if (bh_offset(bh) >= to)
 			break;
 		wait_on_buffer(bh);
 		if (buffer_uptodate(bh))
-- 
2.46.1

From nobody Thu Oct 2 03:28:36 2025
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C34AB86329; Tue, 23 Sep 2025 01:29:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590973; cv=none; b=cXYwkyrxgpeFkZotvbb+EOjXT7Ng9UnCJ6j1R9cFsbrqkADfSMm5jb0dP0HwygJlBw6Qper1ImCoHIO8dKJujIkgqFrZ/JlJ8keNmqXRsvPSyNVS/gKCMx/D31PLsaWQbk8bxd2ld2XsALlhOMMwWNxVbK4TY79kn1T1CiQmYlE=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590973; c=relaxed/simple; bh=4JqpOUpiWcHHZxVPF0acWtZVQzg65/SoCmh1zSVcpfQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=CWph6i9YdnliGa5a2VCnjsJMKTt9/rhvLLjNaxStTz/UbX0ZQwNyva1iwnOHDCwAuI4/pRcOFRlTUG7Yws7OOFA8pocrGH1FtWLsd9l+a+I+hrqsDqb/gOcJlqZfWt3nZguHJCxkcA4lsryV6jysSRe4OK52fzceDUXCQOSYNhk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv0FrHzYQtpy; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 2F7B31A0BEB; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S6; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 02/13] ext4: correct the checking of quota files before moving extents Date: Tue, 23 Sep 2025 09:27:12 +0800 Message-ID: <20250923012724.2378858-3-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S6 X-Coremail-Antispam: 1UD129KBjvdXoW7GF43Xr4UXr1DAFyfXrWrXwb_yoWDGwbEya yxCrWkZrsYvFWvgrs5JFyrJrs2kF4rGFn8WFZ5Cr13ur1xXr4kGrnYqrnIyr98Wr4UKrZx ZFs7tryayryIgjkaLaAFLSUrUUUUjb8apTn2vfkv8UJUUUU8Yxn0WfASr-VFAUDa7-sFnT 9fnUUIcSsGvfJTRUUUbkxFF20E14v26rWj6s0DM7CY07I20VC2zVCF04k26cxKx2IYs7xG 6rWj6s0DM7CIcVAFz4kK6r1j6r18M28IrcIa0xkI8VA2jI8067AKxVWUXwA2048vs2IY02 0Ec7CjxVAFwI0_Xr0E3s1l8cAvFVAK0II2c7xJM28CjxkF64kEwVA0rcxSw2x7M28EF7xv wVC0I7IYx2IY67AKxVWDJVCq3wA2z4x0Y4vE2Ix0cI8IcVCY1x0267AKxVW8Jr0_Cr1UM2 8EF7xvwVC2z280aVAFwI0_GcCE3s1l84ACjcxK6I8E87Iv6xkF7I0E14v26rxl6s0DM2AI xVAIcxkEcVAq07x20xvEncxIr21l5I8CrVACY4xI64kE6c02F40Ex7xfMcIj6xIIjxv20x vE14v26r106r15McIj6I8E87Iv67AKxVWUJVW8JwAm72CE4IkC6x0Yz7v_Jr0_Gr1lF7xv r2IYc2Ij64vIr41lF7I21c0EjII2zVCS5cI20VAGYxC7M4IIrI8v6xkF7I0E8cxan2IY04 v7MxkF7I0En4kS14v26r1q6r43MxAIw28IcxkI7VAKI48JMxC20s026xCaFVCjc4AY6r1j 6r4UMI8I3I0E5I8CrVAFwI0_Jr0_Jr4lx2IqxVCjr7xvwVAFwI0_JrI_JrWlx4CE17CEb7 AF67AKxVWUtVW8ZwCIc40Y0x0EwIxGrwCI42IY6xIIjxv20xvE14v26r1j6r1xMIIF0xvE 2Ix0cI8IcVCY1x0267AKxVW8JVWxJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF4lIxAIcV C2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8JVW8JrUvcSsGvfC2Kfnx nUUI43ZEXa7VU1c18PUUUUU== X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi The move extent operation should return -EOPNOTSUPP if any of the inodes is a quota inode, rather than requiring both to be quota inodes. 
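For illustration only (not part of the patch): a minimal userspace sketch of how the two forms of the check differ. The struct toy_inode type, its is_quota flag, and the toy_is_quota_file(), check_old() and check_new() helpers are hypothetical stand-ins for struct inode, ext4_is_quota_file() and the test in mext_check_arguments(); they are assumptions made for this example, not kernel code.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct inode; only the quota flag matters here. */
struct toy_inode {
	bool is_quota;
};

/* Hypothetical stand-in for ext4_is_quota_file(). */
static bool toy_is_quota_file(const struct toy_inode *inode)
{
	return inode->is_quota;
}

/* Old form: rejects the pair only when BOTH inodes are quota files. */
static int check_old(const struct toy_inode *orig,
		     const struct toy_inode *donor)
{
	if (toy_is_quota_file(orig) && toy_is_quota_file(donor))
		return -EOPNOTSUPP;
	return 0;
}

/* New form: rejects the pair when EITHER inode is a quota file. */
static int check_new(const struct toy_inode *orig,
		     const struct toy_inode *donor)
{
	if (toy_is_quota_file(orig) || toy_is_quota_file(donor))
		return -EOPNOTSUPP;
	return 0;
}
```

When only one of the two arguments is a quota file, check_old() returns 0 and would let the move proceed, while check_new() returns -EOPNOTSUPP. When neither or both are quota files, the two forms agree.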

Fixes: 02749a4c2082 ("ext4: add ext4_is_quota_file()")
Signed-off-by: Zhang Yi
---
 fs/ext4/move_extent.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 4b091c21908f..0f4b7c89edd3 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -485,7 +485,7 @@ mext_check_arguments(struct inode *orig_inode,
 		return -ETXTBSY;
 	}
 
-	if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
+	if (ext4_is_quota_file(orig_inode) || ext4_is_quota_file(donor_inode)) {
 		ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
 			orig_inode->i_ino, donor_inode->i_ino);
 		return -EOPNOTSUPP;
-- 
2.46.1

From nobody Thu Oct 2 03:28:36 2025
Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C351023A984; Tue, 23 Sep 2025 01:29:31 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; cv=none; b=JMQKSFCrrgbsOV5hRby1wl2YQcATSZb4L6A5ptnz4/N6twM0juT7idam7bXzIfVqGCLALSSKa4S0uBJ/74AjWSmBZyrRklM/KiuVMHUlfaxUhBr7gMMrJd0JtF0enxAr/fC0Q1i5Lst26cAAufJVmEvESbPa0nwK/626c+3UtKU=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; c=relaxed/simple; bh=koa7EGh8p2mJad9cf3paaa2EDxk3TU8qJD4BEi6WdTk=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=jXCgwUvCIQNPqvKzzIsMOlR+ZAyqKmnweN7/dzXOepOzWYSqKite55B1tlcA3Qbm8f871P30R2LS7e6JFhEsOdAaMN1Mb8PkUPK5NjENnPg0DeuH0HdLRoe9n4tooX+V0UXm3K7/iRS3k7wLMhgYYEuqYmD+lTFYKlrvfVc5aMc=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none
smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv0l6XzYQtpy; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 426EA1A0ADD; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S7; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 03/13] ext4: introduce seq counter for the extent status entry Date: Tue, 23 Sep 2025 09:27:13 +0800 Message-ID: <20250923012724.2378858-4-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S7 X-Coremail-Antispam: 1UD129KBjvJXoWxtrW8WF4Uur1kZw1DWr45Wrg_yoW3tw4DpF ZxAryUWrWrXw4j9ayxXw1UXr15Xa48WrW7Jr9Fgw1fZFW8JFyqgF1DtFyjvF90qrWFvrnx XFWFyryDC3Wjga7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JrWl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 
Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0JUCg4hUUUUU=
X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

In iomap_write_iter(), the iomap buffered write framework does not hold
any locks between querying the inode extent mapping info and performing
page cache writes. As a result, the extent mapping can be changed by
concurrent I/O in flight. Similarly, in iomap_writepage_map(), the
write-back process faces the same problem: concurrent changes can
invalidate the extent mapping before the I/O is submitted.

Therefore, both of these processes must recheck the mapping info after
acquiring the folio lock. To address this, similar to XFS, we propose
introducing an extent sequence number to serve as a validity cookie for
the extent. After commit 24b7a2331fcd ("ext4: clairfy the rules for
modifying extents"), we can ensure that extent information is always
processed through the extent status tree, and that the extent status
tree is always up to date under i_rwsem, invalidate_lock or the folio
lock, so it is safe to introduce this sequence number. The sequence
number is increased whenever the extent status tree changes, in
preparation for the buffered write iomap conversion.

Besides, this mechanism is also applicable to the moving extents case.
move_extent_per_page() also needs to reacquire data_sem and check the
mapping info again under the folio lock.

Signed-off-by: Zhang Yi
---
 fs/ext4/ext4.h              |  2 ++
 fs/ext4/extents_status.c    | 21 +++++++++++++++++----
 fs/ext4/super.c             |  1 +
 include/trace/events/ext4.h | 23 +++++++++++++++--------
 4 files changed, 35 insertions(+), 12 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 01a6e2de7fc3..7b37a661dd37 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1138,6 +1138,8 @@ struct ext4_inode_info {
 	ext4_lblk_t i_es_shrink_lblk;	/* Offset where we start searching for
 					   extents to shrink. Protected by
 					   i_es_lock */
+	u64 i_es_seq;			/* Change counter for extents.
+					   Protected by i_es_lock */
 
 	/* ialloc */
 	ext4_group_t i_last_alloc_group;
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 31dc0496f8d0..62886e18e2a3 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -235,6 +235,13 @@ static inline ext4_lblk_t ext4_es_end(struct extent_status *es)
 	return es->es_lblk + es->es_len - 1;
 }
 
+static inline void ext4_es_inc_seq(struct inode *inode)
+{
+	struct ext4_inode_info *ei = EXT4_I(inode);
+
+	WRITE_ONCE(ei->i_es_seq, ei->i_es_seq + 1);
+}
+
 /*
  * search through the tree for an delayed extent with a given offset. If
  * it can't be found, try to find next extent.
@@ -906,7 +913,6 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	newes.es_lblk = lblk;
 	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, pblk, status);
-	trace_ext4_es_insert_extent(inode, &newes);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -955,6 +961,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 		}
 		pending = err3;
 	}
+	ext4_es_inc_seq(inode);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	/*
@@ -981,6 +988,7 @@ void ext4_es_insert_extent(struct inode *inode, ext4_lblk_t lblk,
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
+	trace_ext4_es_insert_extent(inode, &newes);
 	ext4_es_print_tree(inode);
 	return;
 }
@@ -1550,7 +1558,6 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	if (EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY)
 		return;
 
-	trace_ext4_es_remove_extent(inode, lblk, len);
 	es_debug("remove [%u/%u) from extent status tree of inode %lu\n",
 		 lblk, len, inode->i_ino);
 
@@ -1570,16 +1577,21 @@ void ext4_es_remove_extent(struct inode *inode, ext4_lblk_t lblk,
 	 */
 	write_lock(&EXT4_I(inode)->i_es_lock);
 	err = __es_remove_extent(inode, lblk, end, &reserved, es);
+	if (err)
+		goto error;
 	/* Free preallocated extent if it didn't get used. */
 	if (es) {
 		if (!es->es_len)
 			__es_free_extent(es);
 		es = NULL;
 	}
+	ext4_es_inc_seq(inode);
+error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err)
 		goto retry;
 
+	trace_ext4_es_remove_extent(inode, lblk, len);
 	ext4_es_print_tree(inode);
 	ext4_da_release_space(inode, reserved);
 }
@@ -2140,8 +2152,6 @@ void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
 	newes.es_lblk = lblk;
 	newes.es_len = len;
 	ext4_es_store_pblock_status(&newes, ~0, EXTENT_STATUS_DELAYED);
-	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
-					    end_allocated);
 
 	ext4_es_insert_extent_check(inode, &newes);
 
@@ -2196,11 +2206,14 @@ void ext4_es_insert_delayed_extent(struct inode *inode, ext4_lblk_t lblk,
 			pr2 = NULL;
 		}
 	}
+	ext4_es_inc_seq(inode);
 error:
 	write_unlock(&EXT4_I(inode)->i_es_lock);
 	if (err1 || err2 || err3 < 0)
 		goto retry;
 
+	trace_ext4_es_insert_delayed_extent(inode, &newes, lclu_allocated,
+					    end_allocated);
 	ext4_es_print_tree(inode);
 	ext4_print_pending_tree(inode);
 	return;
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 699c15db28a8..30682df3eeef 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1397,6 +1397,7 @@ static struct inode *ext4_alloc_inode(struct super_block *sb)
 	ei->i_es_all_nr = 0;
 	ei->i_es_shk_nr = 0;
 	ei->i_es_shrink_lblk = 0;
+	ei->i_es_seq = 0;
 	ei->i_reserved_data_blocks = 0;
 	spin_lock_init(&(ei->i_block_reservation_lock));
 	ext4_init_pending_tree(&ei->i_pending_tree);
diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h
index a374e7ea7e57..6a0754d38acf 100644
--- a/include/trace/events/ext4.h
+++ b/include/trace/events/ext4.h
@@ -2210,7 +2210,8 @@ DECLARE_EVENT_CLASS(ext4__es_extent,
 		__field(	ext4_lblk_t,	lblk		)
 		__field(	ext4_lblk_t,	len		)
 		__field(	ext4_fsblk_t,	pblk		)
-		__field(	char, status	)
+		__field(	char,		status		)
+		__field(	u64,		seq		)
 	),
 
 	TP_fast_assign(
@@ -2220,13 +2221,15 @@ DECLARE_EVENT_CLASS(ext4__es_extent,
 		__entry->len	= es->es_len;
 		__entry->pblk	= ext4_es_show_pblock(es);
 		__entry->status	= ext4_es_status(es);
+		__entry->seq	= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s",
+	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->lblk, __entry->len,
-		  __entry->pblk, show_extent_status(__entry->status))
+		  __entry->pblk, show_extent_status(__entry->status),
+		  __entry->seq)
 );
 
 DEFINE_EVENT(ext4__es_extent, ext4_es_insert_extent,
@@ -2251,6 +2254,7 @@ TRACE_EVENT(ext4_es_remove_extent,
 		__field(	ino_t,	ino			)
 		__field(	loff_t,	lblk			)
 		__field(	loff_t,	len			)
+		__field(	u64,	seq			)
 	),
 
 	TP_fast_assign(
@@ -2258,12 +2262,13 @@ TRACE_EVENT(ext4_es_remove_extent,
 		__entry->ino	= inode->i_ino;
 		__entry->lblk	= lblk;
 		__entry->len	= len;
+		__entry->seq	= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%lld/%lld)",
+	TP_printk("dev %d,%d ino %lu es [%lld/%lld) seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
-		  __entry->lblk, __entry->len)
+		  __entry->lblk, __entry->len, __entry->seq)
 );
 
 TRACE_EVENT(ext4_es_find_extent_range_enter,
@@ -2523,6 +2528,7 @@ TRACE_EVENT(ext4_es_insert_delayed_extent,
 		__field(	char,		status		)
 		__field(	bool,		lclu_allocated	)
 		__field(	bool,		end_allocated	)
+		__field(	u64,		seq		)
 	),
 
 	TP_fast_assign(
@@ -2534,15 +2540,16 @@ TRACE_EVENT(ext4_es_insert_delayed_extent,
 		__entry->status		= ext4_es_status(es);
 		__entry->lclu_allocated	= lclu_allocated;
 		__entry->end_allocated	= end_allocated;
+		__entry->seq		= EXT4_I(inode)->i_es_seq;
 	),
 
-	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s "
-		  "allocated %d %d",
+	TP_printk("dev %d,%d ino %lu es [%u/%u) mapped %llu status %s allocated %d %d seq %llu",
 		  MAJOR(__entry->dev), MINOR(__entry->dev),
 		  (unsigned long) __entry->ino,
 		  __entry->lblk, __entry->len,
 		  __entry->pblk, show_extent_status(__entry->status),
-		  __entry->lclu_allocated, __entry->end_allocated)
+		  __entry->lclu_allocated, __entry->end_allocated,
+		  __entry->seq)
 );
 
 /* fsmap traces */
-- 
2.46.1

From nobody Thu Oct 2 03:28:36 2025
Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC9CE2459E5; Tue, 23 Sep 2025 01:29:32 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590976; cv=none; b=YmO7njHdJ2pntBjSmVaR/AWqWFZcT7FCYgiyuwC0xgr1+clFWf9sHgsc0DfTjENfQZCndspe4Cd6eDIirqHMDz7J0Kq+wHcgqnFER2t4jPufD1yE1dg50UJ1Ls25rg93dDrUz9CHkn4LIaoiAYeDP6Sb6YmnRLxlA7klnjDUeG8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590976; c=relaxed/simple; bh=uwXA7214rTZLSUx7YTuJcKawCH8JK9347t+S5f/2ZvY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=d/bLAIfp3f2N/HoE4Q14ot4eSnxvXGb8dSm2tzE5B6BsHw4SVei3lRQQSMLhduLjrI8dX3apfZ8YMiEv3ieGAL6KB0CIosXK1TXMZjdB6L7/HIfsBTBb9JzRW1yBdEkOR+H+EcFYAgY4dSN+EnWp8AJi8p3lJ3w3d01TnmxL5f4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com
Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv6GkczKHLxL; Tue, 23 Sep 2025 09:29:19 +0800 (CST)
Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 5C16F1A1026; Tue, 23 Sep 2025 09:29:24 +0800 (CST)
Received: from huaweicloud.com
(unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S8; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 04/13] ext4: make ext4_es_lookup_extent() pass out the extent seq counter Date: Tue, 23 Sep 2025 09:27:14 +0800 Message-ID: <20250923012724.2378858-5-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S8 X-Coremail-Antispam: 1UD129KBjvJXoWxGw4rJFyUAry8Cr13CFy5CFg_yoWrZw15p3 9xAr4UGw4fZw1v9ayxKF47Zr15K3WYkrW7Cr93Kw1rKa4rXrySyF10yFW2yFyFgrWIqwn0 vF40kw1UJa1fKFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI 
42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU
X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

When querying extents in the extent status tree, we should hold the
data_sem if we want to obtain the sequence number as a valid cookie at
the same time. However, ext4_map_blocks() currently calls
ext4_es_lookup_extent() without holding data_sem. Therefore, we should
acquire i_es_lock instead, which also ensures that the sequence cookie
and the extent remain consistent. Consequently, make
ext4_es_lookup_extent() pass out the sequence number when necessary.

Signed-off-by: Zhang Yi
---
 fs/ext4/extents.c        | 2 +-
 fs/ext4/extents_status.c | 6 ++++--
 fs/ext4/extents_status.h | 2 +-
 fs/ext4/inode.c          | 8 ++++----
 4 files changed, 10 insertions(+), 8 deletions(-)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index ca5499e9412b..c7d219e6c6d8 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -2213,7 +2213,7 @@ static int ext4_fill_es_cache_info(struct inode *inode,
 	while (block <= end) {
 		next = 0;
 		flags = 0;
-		if (!ext4_es_lookup_extent(inode, block, &next, &es))
+		if (!ext4_es_lookup_extent(inode, block, &next, &es, NULL))
 			break;
 		if (ext4_es_is_unwritten(&es))
 			flags |= FIEMAP_EXTENT_UNWRITTEN;
diff --git a/fs/ext4/extents_status.c b/fs/ext4/extents_status.c
index 62886e18e2a3..9bf2f48d8ffe 100644
--- a/fs/ext4/extents_status.c
+++ b/fs/ext4/extents_status.c
@@ -1035,8 +1035,8 @@ void ext4_es_cache_extent(struct inode *inode, ext4_lblk_t lblk,
  * Return: 1 on found, 0 on not
  */
 int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
-			  ext4_lblk_t *next_lblk,
-			  struct extent_status *es)
+			  ext4_lblk_t *next_lblk, struct extent_status *es,
+			  u64 *pseq)
 {
 	struct ext4_es_tree *tree;
 	struct ext4_es_stats *stats;
@@ -1095,6 +1095,8 @@ int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 			} else
 				*next_lblk = 0;
 		}
+		if (pseq)
+			*pseq = EXT4_I(inode)->i_es_seq;
 	} else {
 		percpu_counter_inc(&stats->es_stats_cache_misses);
 	}
diff --git a/fs/ext4/extents_status.h b/fs/ext4/extents_status.h
index 8f9c008d11e8..f3396cf32b44 100644
--- a/fs/ext4/extents_status.h
+++ b/fs/ext4/extents_status.h
@@ -148,7 +148,7 @@ extern void ext4_es_find_extent_range(struct inode *inode,
 				      struct extent_status *es);
 extern int ext4_es_lookup_extent(struct inode *inode, ext4_lblk_t lblk,
 				 ext4_lblk_t *next_lblk,
-				 struct extent_status *es);
+				 struct extent_status *es, u64 *pseq);
 extern bool ext4_es_scan_range(struct inode *inode,
 			       int (*matching_fn)(struct extent_status *es),
 			       ext4_lblk_t lblk, ext4_lblk_t end);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 5b7a15db4953..c7fac4b89c88 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -649,7 +649,7 @@ static int ext4_map_create_blocks(handle_t *handle, struct inode *inode,
 	 * extent status tree.
 	 */
 	if (flags & EXT4_GET_BLOCKS_PRE_IO &&
-	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	    ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es))
 			return retval;
 	}
@@ -723,7 +723,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *inode,
 		ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) {
 			map->m_pblk = ext4_es_pblock(&es) +
 					map->m_lblk - es.es_lblk;
@@ -1908,7 +1908,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	ext4_check_map_extents_env(inode);
 
 	/* Lookup extent status tree firstly */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
@@ -1961,7 +1961,7 @@ static int ext4_da_map_blocks(struct inode *inode, struct ext4_map_blocks *map)
 	 * is held in write mode, before inserting a new da entry in
 	 * the extent status tree.
 	 */
-	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es)) {
+	if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) {
 		map->m_len = min_t(unsigned int, map->m_len,
 				   es.es_len - (map->m_lblk - es.es_lblk));
 
-- 
2.46.1

From nobody Thu Oct 2 03:28:36 2025
Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE6EF2459D4; Tue, 23 Sep 2025 01:29:32 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; cv=none; b=md9gX5d/f1LaSw455JGT7SNkNt+2hV2YbMvLECB8f2fht7FZ3CdqfOnXPujMdhbQC7nMJ9iat101xRosC56AWWzrPBGRw6jZ02+xXLdqe+86kgAbBC4yvAJjE62QXTJ2qxytUKalEw1ypojxuzonOvWTDc2/H5JWBWmpbCnM67I=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; c=relaxed/simple; bh=9+5OxevCijthUzY/16DUyhS/Q/mMCPWDXDkf74vdWao=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=K/OcSQRy6zccmxsOXoPHy/SVeGc477K2VI8D1qSgXywiR0t+hVzghWmfvYW7ohqYbyOCCZfKYOiMQACGENigrllQxAcqyI7SAOJtdbrGrc90gRdKbdRN38jL4vjba3VkE9ZX3mUhbMaZf4LJuA1QJK2Srj/m0L1gq93QXots+7Q=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com
Received: from mail.maildlp.com (unknown [172.19.163.216]) by
dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv6nQCzKHMhJ; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 6CBBC1A1AE9; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S9; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 05/13] ext4: pass out extent seq counter when mapping blocks Date: Tue, 23 Sep 2025 09:27:15 +0800 Message-ID: <20250923012724.2378858-6-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S9 X-Coremail-Antispam: 1UD129KBjvJXoWxXryrtr4xKrWrGw15CryDJrb_yoW5KF18pr ZrAr1rGr4UWw1q9F4SyF4UZF1a93W5KrW7J397WryFya4fJrn3tF1jyF1ayF98KrWfX3WF qFW5K34UCa1fGa7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U 
M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi When creating or querying mapping blocks using the ext4_map_blocks() and ext4_map_{query|create}_blocks() helpers, also pass out the extent sequence number of the block mapping info through the ext4_map_blocks structure. This sequence number can later serve as a valid cookie within iomap infrastructure and the move extents procedure. Signed-off-by: Zhang Yi --- fs/ext4/ext4.h | 1 + fs/ext4/inode.c | 24 ++++++++++++++++-------- 2 files changed, 17 insertions(+), 8 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 7b37a661dd37..7f452895ec09 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -260,6 +260,7 @@ struct ext4_map_blocks { ext4_lblk_t m_lblk; unsigned int m_len; unsigned int m_flags; + u64 m_seq; }; =20 /* diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index c7fac4b89c88..d005a4f3f4b3 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -550,10 +550,13 @@ static int ext4_map_query_blocks(handle_t *handle, st= ruct inode *inode, retval =3D ext4_ext_map_blocks(handle, inode, map, flags); else retval =3D ext4_ind_map_blocks(handle, inode, map, flags); - - if (retval <=3D 0) + if (retval < 0) return retval; =20 + /* A hole? 
*/ + if (retval =3D=3D 0) + goto out; + if (unlikely(retval !=3D map->m_len)) { ext4_warning(inode->i_sb, "ES len assertion failed for inode " @@ -573,11 +576,13 @@ static int ext4_map_query_blocks(handle_t *handle, st= ruct inode *inode, EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN; ext4_es_insert_extent(inode, map->m_lblk, map->m_len, map->m_pblk, status, false); - return retval; + } else { + retval =3D ext4_map_query_blocks_next_in_leaf(handle, inode, map, + orig_mlen); } - - return ext4_map_query_blocks_next_in_leaf(handle, inode, map, - orig_mlen); +out: + map->m_seq =3D READ_ONCE(EXT4_I(inode)->i_es_seq); + return retval; } =20 static int ext4_map_create_blocks(handle_t *handle, struct inode *inode, @@ -649,7 +654,7 @@ static int ext4_map_create_blocks(handle_t *handle, str= uct inode *inode, * extent status tree. */ if (flags & EXT4_GET_BLOCKS_PRE_IO && - ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) { + ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) { if (ext4_es_is_written(&es)) return retval; } @@ -658,6 +663,7 @@ static int ext4_map_create_blocks(handle_t *handle, str= uct inode *inode, EXTENT_STATUS_UNWRITTEN : EXTENT_STATUS_WRITTEN; ext4_es_insert_extent(inode, map->m_lblk, map->m_len, map->m_pblk, status, flags & EXT4_GET_BLOCKS_DELALLOC_RESERVE); + map->m_seq =3D READ_ONCE(EXT4_I(inode)->i_es_seq); =20 return retval; } @@ -723,7 +729,7 @@ int ext4_map_blocks(handle_t *handle, struct inode *ino= de, ext4_check_map_extents_env(inode); =20 /* Lookup extent status tree firstly */ - if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, NULL)) { + if (ext4_es_lookup_extent(inode, map->m_lblk, NULL, &es, &map->m_seq)) { if (ext4_es_is_written(&es) || ext4_es_is_unwritten(&es)) { map->m_pblk =3D ext4_es_pblock(&es) + map->m_lblk - es.es_lblk; @@ -1979,6 +1985,8 @@ static int ext4_da_map_blocks(struct inode *inode, st= ruct ext4_map_blocks *map) =20 map->m_flags |=3D EXT4_MAP_DELAYED; retval =3D 
ext4_insert_delayed_blocks(inode, map->m_lblk, map->m_len); + if (!retval) + map->m_seq =3D READ_ONCE(EXT4_I(inode)->i_es_seq); up_write(&EXT4_I(inode)->i_data_sem); =20 return retval; --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C35852405FD; Tue, 23 Sep 2025 01:29:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590973; cv=none; b=oniG6qOw79P/ddEyj7mnX/3Wd9qq2aC9WAQjW3h1NQvJ2OuCvCBaWOd1V6pl4QHxleifPoHYY6EkoqHB4RH2vxVluanquiNNB0B8y3SlXdKGihaZJ0UbtDifrqT88ksil1B/qeNXi3rilZBJm5gj2urmW/Ky05xJ9+GWg5o3x3U= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590973; c=relaxed/simple; bh=xpHNl9uCs96ADDOmeKTxWjgPzyvdhOzL7M7eU8voDVc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=o3FafHVC+GuTlfuzspor+U7fKsfJWvkIGJRRbNBMRfXEkfv05YTxzFpazZBwZ2i9IF++diRjErbw8NzC41Gnh9OyRzmGDlLRfZJaut5eQAfR8UOhDJiEV//ZXx33J9pK8rvWkmzgqUNKZklD53D0pvhgEFznfaeTbFm82gCEwDU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv2KlpzYQtpy; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 776221A1326; Tue, 23 Sep 2025 09:29:24 
+0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S10; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 06/13] ext4: use EXT4_B_TO_LBLK() in mext_check_arguments() Date: Tue, 23 Sep 2025 09:27:16 +0800 Message-ID: <20250923012724.2378858-7-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S10 X-Coremail-Antispam: 1UD129KBjvJXoW7Ar1fCFy3CrW5ur1rZry8Grg_yoW8Gr4xp3 WIyFs5C3yqqa4Y9w409F1Iv348Ka1xGr47XrWfJr4UWay0kFyFgF1UKan8AFyjqrWkJ34r ZFn2kr17X345G3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 
42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Switch to using EXT4_B_TO_LBLK() to calculate the EOF position of the origin and donor inodes, instead of using open-coded calculations. Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 0f4b7c89edd3..6175906c7119 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -461,12 +461,6 @@ mext_check_arguments(struct inode *orig_inode, __u64 donor_start, __u64 *len) { __u64 orig_eof, donor_eof; - unsigned int blkbits =3D orig_inode->i_blkbits; - unsigned int blocksize =3D 1 << blkbits; - - orig_eof =3D (i_size_read(orig_inode) + blocksize - 1) >> blkbits; - donor_eof =3D (i_size_read(donor_inode) + blocksize - 1) >> blkbits; - =20 if (donor_inode->i_mode & (S_ISUID|S_ISGID)) { ext4_debug("ext4 move extent: suid or sgid is set" @@ -526,6 +520,9 @@ mext_check_arguments(struct inode *orig_inode, orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } + + orig_eof =3D EXT4_B_TO_LBLK(orig_inode, i_size_read(orig_inode)); + donor_eof =3D EXT4_B_TO_LBLK(donor_inode, i_size_read(donor_inode)); if (orig_eof <=3D orig_start) *len =3D 0; else if (orig_eof < orig_start + *len - 1) --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3C2C2701B4; Tue, 23 Sep 2025 01:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1758590978; cv=none; b=jirTgIbVshynbOfgqg1ZmH/zLUWeYuE+92xPcguzwJnaEHMvNCQa6/bhHTxumWexqNDTLPd10UL4CdsyHI6Xm+VfylZrrNvmkHHjhr/9tv8FGZ3Zj4+1n6LZuu/6mR+qJ1vZWSWzKy08D90qvdUTU6HTeJSUuMMSkn/4QK//jKs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590978; c=relaxed/simple; bh=C2oiyfr9F2erbOeu81kYs9iQQq2Amb2HQt4fHUjbKMs=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Fy/nygvSmRsAGFx27KICcf+GAPm2DwdfaXLdn1kC0Twvs19PEjVQ5+XmZ669+kOsXfIoSK94AWc9oI9LfSipCjBQgXCTYojSbtUlfupxwpqKrKQMNfRw/keOEnrtP/4njKyvFu9teOnY4D9DC1pIZqc5iwfVZmi7u/nQ3dc9UGY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv2txyzYQv3W; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 8DAE01A103D; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S11; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 07/13] ext4: add mext_check_validity() to do basic check Date: Tue, 23 Sep 2025 09:27:17 +0800 Message-ID: <20250923012724.2378858-8-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: 
<20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S11 X-Coremail-Antispam: 1UD129KBjvJXoW3Xr43Ar4xGFyfJFWxKrWktFb_yoW7Xr15pF yxCr13X34jqas0k3yrtFsxXr1Y93WxKr42grZ3Xw48XFWDCF9Igw1DJa1vv3WUtrWDJ3y0 qF42kry7ua17JaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Currently, the basic validation checks during the move extent operation are scattered across __ext4_ioctl() and ext4_move_extents(), which makes the code somewhat disorganized. Introduce a new helper, mext_check_validity(), to handle these checks. This change involves only code relocation without any logical modifications. 
Signed-off-by: Zhang Yi --- fs/ext4/ioctl.c | 10 ----- fs/ext4/move_extent.c | 102 +++++++++++++++++++++++++++--------------- 2 files changed, 65 insertions(+), 47 deletions(-) diff --git a/fs/ext4/ioctl.c b/fs/ext4/ioctl.c index 84e3c73952d7..a0d3a951ae85 100644 --- a/fs/ext4/ioctl.c +++ b/fs/ext4/ioctl.c @@ -1349,16 +1349,6 @@ static long __ext4_ioctl(struct file *filp, unsigned= int cmd, unsigned long arg) if (!(fd_file(donor)->f_mode & FMODE_WRITE)) return -EBADF; =20 - if (ext4_has_feature_bigalloc(sb)) { - ext4_msg(sb, KERN_ERR, - "Online defrag not supported with bigalloc"); - return -EOPNOTSUPP; - } else if (IS_DAX(inode)) { - ext4_msg(sb, KERN_ERR, - "Online defrag not supported with DAX"); - return -EOPNOTSUPP; - } - err =3D mnt_want_write_file(filp); if (err) return err; diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 6175906c7119..92f4cba3516d 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -442,6 +442,68 @@ move_extent_per_page(struct file *o_filp, struct inode= *donor_inode, goto unlock_folios; } =20 +/* + * Check the validity of the basic filesystem environment and the + * inodes' support status. + */ +static int mext_check_validity(struct inode *orig_inode, + struct inode *donor_inode) +{ + struct super_block *sb =3D orig_inode->i_sb; + + if (ext4_has_feature_bigalloc(sb)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with bigalloc"); + return -EOPNOTSUPP; + } + + if (IS_DAX(orig_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with DAX"); + return -EOPNOTSUPP; + } + + /* + * TODO: it's not obvious how to swap blocks for inodes with full + * journaling enabled. 
+ */ + if (ext4_should_journal_data(orig_inode) || + ext4_should_journal_data(donor_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported with data journaling"); + return -EOPNOTSUPP; + } + + if (IS_ENCRYPTED(orig_inode) || IS_ENCRYPTED(donor_inode)) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported for encrypted files"); + return -EOPNOTSUPP; + } + + /* origin and donor should be different inodes */ + if (orig_inode =3D=3D donor_inode) { + ext4_debug("ext4 move extent: The argument files should not be same inod= e [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + /* origin and donor should belong to the same filesystem */ + if (orig_inode->i_sb !=3D donor_inode->i_sb) { + ext4_debug("ext4 move extent: The argument files should be in same FS [i= no:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + /* Regular file check */ + if (!S_ISREG(orig_inode->i_mode) || !S_ISREG(donor_inode->i_mode)) { + ext4_debug("ext4 move extent: The argument files should be regular file = [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); + return -EINVAL; + } + + return 0; +} + /** * mext_check_arguments - Check whether move extent can be done * @@ -567,43 +629,9 @@ ext4_move_extents(struct file *o_filp, struct file *d_= filp, __u64 orig_blk, ext4_lblk_t d_start =3D donor_blk; int ret; =20 - if (orig_inode->i_sb !=3D donor_inode->i_sb) { - ext4_debug("ext4 move extent: The argument files " - "should be in same FS [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* orig and donor should be different inodes */ - if (orig_inode =3D=3D donor_inode) { - ext4_debug("ext4 move extent: The argument files should not " - "be same inode [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* Regular file check */ - if (!S_ISREG(orig_inode->i_mode) || 
!S_ISREG(donor_inode->i_mode)) { - ext4_debug("ext4 move extent: The argument files should be " - "regular file [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EINVAL; - } - - /* TODO: it's not obvious how to swap blocks for inodes with full - journaling enabled */ - if (ext4_should_journal_data(orig_inode) || - ext4_should_journal_data(donor_inode)) { - ext4_msg(orig_inode->i_sb, KERN_ERR, - "Online defrag not supported with data journaling"); - return -EOPNOTSUPP; - } - - if (IS_ENCRYPTED(orig_inode) || IS_ENCRYPTED(donor_inode)) { - ext4_msg(orig_inode->i_sb, KERN_ERR, - "Online defrag not supported for encrypted files"); - return -EOPNOTSUPP; - } + ret =3D mext_check_validity(orig_inode, donor_inode); + if (ret) + return ret; =20 /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ADCA8242D83; Tue, 23 Sep 2025 01:29:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; cv=none; b=gcuT/8YzzOLCi4bBJ5Iqt0hlagT82QyLFrC+sKSuezf9KjuYJwQWdnfIt2tVgO4Hi4wl4VTe9flDl7PWc19LFHBCm7tmyuQ+56EE4JqAiziQX2SCLYD4VEBHIOn8qdXBLCTLUJR+gFXlyhyS4jmD8ShpaK9HLgSQsr6jz5Gry7E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; c=relaxed/simple; bh=nKaAm1F8OEOdDyqecoIvhawyqBA/Ra5HiNQqGIZ0qOM=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Ywy5kXTmXwp6er2s3T28WNEVgU7iLr422Kq6RM1G2ivvuHXKSx8u/aNsVIkEaFinTA/9KdQrEqXvYjOSlTNLq8+5dG/DupAdYocK816s5eARxi8mQVQnamTWCL0K+JDkbBJt3wzFGu+UOJBAp7Zt7b/vTZaIWPqIkjTHs8trddU= 
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sw1XVxzKHMjF; Tue, 23 Sep 2025 09:29:20 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id AA96B1A1331; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S12; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 08/13] ext4: refactor mext_check_arguments() Date: Tue, 23 Sep 2025 09:27:18 +0800 Message-ID: <20250923012724.2378858-9-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S12 X-Coremail-Antispam: 1UD129KBjvJXoW3Wr4xGF45Gw4ruw4ruryxAFb_yoW3GFW8pF yxCry5Xw4vgayFg3yvyrsrXw1Fk3W3Gr47XrZ7Xw18uFykAry2ga4UJa1vqF9xJrWUJ34a vF40yrnruw1rJaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 
rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi When moving extents, mext_check_validity() performs some basic file system and file checks. However, some essential checks that need to be performed after acquiring the i_rwsem are still scattered in mext_check_arguments(). Move those checks into mext_check_validity() and make it execute entirely under the i_rwsem to make the checks clearer. Furthermore, rename mext_check_arguments() to mext_check_adjust_range(), as it only performs checks and length adjustments on the move extent range. Finally, also change the print message for the non-extent file check to be consistent with the other unsupported checks. 
Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 99 +++++++++++++++++++------------------------ 1 file changed, 44 insertions(+), 55 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 92f4cba3516d..580d77e51a4c 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -480,6 +480,14 @@ static int mext_check_validity(struct inode *orig_inod= e, return -EOPNOTSUPP; } =20 + /* Ext4 move extent supports only extent based file */ + if (!(ext4_test_inode_flag(orig_inode, EXT4_INODE_EXTENTS)) || + !(ext4_test_inode_flag(donor_inode, EXT4_INODE_EXTENTS))) { + ext4_msg(sb, KERN_ERR, + "Online defrag not supported for non-extent files"); + return -EOPNOTSUPP; + } + /* origin and donor should be different inodes */ if (orig_inode =3D=3D donor_inode) { ext4_debug("ext4 move extent: The argument files should not be same inod= e [ino:orig %lu, donor %lu]\n", @@ -501,60 +509,28 @@ static int mext_check_validity(struct inode *orig_ino= de, return -EINVAL; } =20 - return 0; -} - -/** - * mext_check_arguments - Check whether move extent can be done - * - * @orig_inode: original inode - * @donor_inode: donor inode - * @orig_start: logical start offset in block for orig - * @donor_start: logical start offset in block for donor - * @len: the number of blocks to be moved - * - * Check the arguments of ext4_move_extents() whether the files can be - * exchanged with each other. - * Return 0 on success, or a negative error value on failure. 
- */ -static int -mext_check_arguments(struct inode *orig_inode, - struct inode *donor_inode, __u64 orig_start, - __u64 donor_start, __u64 *len) -{ - __u64 orig_eof, donor_eof; - if (donor_inode->i_mode & (S_ISUID|S_ISGID)) { - ext4_debug("ext4 move extent: suid or sgid is set" - " to donor file [ino:orig %lu, donor %lu]\n", + ext4_debug("ext4 move extent: suid or sgid is set to donor file [ino:ori= g %lu, donor %lu]\n", orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 - if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode)) + if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode)) { + ext4_debug("ext4 move extent: donor should not be immutable or append fi= le [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EPERM; + } =20 /* Ext4 move extent does not support swap files */ if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) { ext4_debug("ext4 move extent: The argument files should not be swap file= s [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); + orig_inode->i_ino, donor_inode->i_ino); return -ETXTBSY; } =20 if (ext4_is_quota_file(orig_inode) || ext4_is_quota_file(donor_inode)) { ext4_debug("ext4 move extent: The argument files should not be quota fil= es [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); - return -EOPNOTSUPP; - } - - /* Ext4 move extent supports only extent based file */ - if (!(ext4_test_inode_flag(orig_inode, EXT4_INODE_EXTENTS))) { - ext4_debug("ext4 move extent: orig file is not extents " - "based file [ino:orig %lu]\n", orig_inode->i_ino); - return -EOPNOTSUPP; - } else if (!(ext4_test_inode_flag(donor_inode, EXT4_INODE_EXTENTS))) { - ext4_debug("ext4 move extent: donor file is not extents " - "based file [ino:donor %lu]\n", donor_inode->i_ino); + orig_inode->i_ino, donor_inode->i_ino); return -EOPNOTSUPP; } =20 @@ -563,12 +539,25 @@ mext_check_arguments(struct inode *orig_inode, return -EINVAL; } =20 + return 0; +} + +/* + * Check the moving 
range of ext4_move_extents() whether the files can be + * exchanged with each other, and adjust the length to fit within the file + * size. Return 0 on success, or a negative error value on failure. + */ +static int mext_check_adjust_range(struct inode *orig_inode, + struct inode *donor_inode, __u64 orig_start, + __u64 donor_start, __u64 *len) +{ + __u64 orig_eof, donor_eof; + /* Start offset should be same */ if ((orig_start & ~(PAGE_MASK >> orig_inode->i_blkbits)) !=3D (donor_start & ~(PAGE_MASK >> orig_inode->i_blkbits))) { - ext4_debug("ext4 move extent: orig and donor's start " - "offsets are not aligned [ino:orig %lu, donor %lu]\n", - orig_inode->i_ino, donor_inode->i_ino); + ext4_debug("ext4 move extent: orig and donor's start offsets are not ali= gned [ino:orig %lu, donor %lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -577,9 +566,9 @@ mext_check_arguments(struct inode *orig_inode, (*len > EXT_MAX_BLOCKS) || (donor_start + *len >=3D EXT_MAX_BLOCKS) || (orig_start + *len >=3D EXT_MAX_BLOCKS)) { - ext4_debug("ext4 move extent: Can't handle over [%u] blocks " - "[ino:orig %lu, donor %lu]\n", EXT_MAX_BLOCKS, - orig_inode->i_ino, donor_inode->i_ino); + ext4_debug("ext4 move extent: Can't handle over [%u] blocks [ino:orig %l= u, donor %lu]\n", + EXT_MAX_BLOCKS, + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -594,9 +583,8 @@ mext_check_arguments(struct inode *orig_inode, else if (donor_eof < donor_start + *len - 1) *len =3D donor_eof - donor_start; if (!*len) { - ext4_debug("ext4 move extent: len should not be 0 " - "[ino:orig %lu, donor %lu]\n", orig_inode->i_ino, - donor_inode->i_ino); + ext4_debug("ext4 move extent: len should not be 0 [ino:orig %lu, donor %= lu]\n", + orig_inode->i_ino, donor_inode->i_ino); return -EINVAL; } =20 @@ -629,22 +617,22 @@ ext4_move_extents(struct file *o_filp, struct file *d= _filp, __u64 orig_blk, ext4_lblk_t d_start =3D donor_blk; int ret; =20 - ret =3D 
mext_check_validity(orig_inode, donor_inode); - if (ret) - return ret; - /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); =20 + ret =3D mext_check_validity(orig_inode, donor_inode); + if (ret) + goto unlock; + /* Wait for all existing dio workers */ inode_dio_wait(orig_inode); inode_dio_wait(donor_inode); =20 /* Protect extent tree against block allocations via delalloc */ ext4_double_down_write_data_sem(orig_inode, donor_inode); - /* Check the filesystem environment whether move_extent can be done */ - ret =3D mext_check_arguments(orig_inode, donor_inode, orig_blk, - donor_blk, &len); + /* Check and adjust the specified move_extent range. */ + ret =3D mext_check_adjust_range(orig_inode, donor_inode, orig_blk, + donor_blk, &len); if (ret) goto out; o_end =3D o_start + len; @@ -725,6 +713,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_f= ilp, __u64 orig_blk, =20 ext4_free_ext_path(path); ext4_double_up_write_data_sem(orig_inode, donor_inode); +unlock: unlock_two_nondirectories(orig_inode, donor_inode); =20 return ret; --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE0DC242D99; Tue, 23 Sep 2025 01:29:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; cv=none; b=OAtWIOhhlBnvDzGp3ogjY/clKbCaq8TfEF3RyYI828y5iBcGoJiZHEfcyH5P4sXySUUCh1U8AoJ7/XD/PwbYrhUEv8jhHgLgjFKzzAYZnNvhOoKcMz/ifojSl0RtWBy1mY3cRcATDMvT24xPG1mZ7MkKhaVEB9V1/2Dzm2xw+xU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590974; c=relaxed/simple; bh=KQix/TKDown7JMEAEnuylVtCKTjCskt0gkwq3aN7Dt0=; 
h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=k/njah2j8mk9WPp7HmF+iipRfbpuUmwW/lk41S5sPO4EAIG5qBNdV2+lB0GAmOwBYDPtIPOn917g1z9ZeRbXKMVm5YIydphQ9j6SsyxQigr2DBsLUIFk5JsNEQ+0gl6V5dNWktbwc5JTwNTUDGbIsRflN6wrFKLohMM/nBNvNJI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sw1f3rzKHMhN; Tue, 23 Sep 2025 09:29:20 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id B3A0E1A1AF5; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S13; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 09/13] ext4: rename mext_page_mkuptodate() to mext_folio_mkuptodate() Date: Tue, 23 Sep 2025 09:27:19 +0800 Message-ID: <20250923012724.2378858-10-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S13 X-Coremail-Antispam: 
1UD129KBjvJXoW7Jr1DtF17ur4kWw1Dtw45GFg_yoW8Jr4rpF y7Ca9xtrW8Zw1xuwn7tFnrZr43t347Kr4UWFWfGw1SkFy3tFy0gF1Uta15AFWFgrW8Jw4r uF4fKr1jgayUK3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi mext_page_mkuptodate() no longer works on a single page, so rename it to mext_folio_mkuptodate(). 
Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 580d77e51a4c..5faa55109570 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -165,7 +165,7 @@ mext_folio_double_lock(struct inode *inode1, struct ino= de *inode2, } =20 /* Force folio buffers uptodate w/o dropping folio's lock */ -static int mext_page_mkuptodate(struct folio *folio, size_t from, size_t t= o) +static int mext_folio_mkuptodate(struct folio *folio, size_t from, size_t = to) { struct inode *inode =3D folio->mapping->host; sector_t block; @@ -358,7 +358,7 @@ move_extent_per_page(struct file *o_filp, struct inode = *donor_inode, data_copy: from =3D offset_in_folio(folio[0], orig_blk_offset << orig_inode->i_blkbits); - *err =3D mext_page_mkuptodate(folio[0], from, from + replaced_size); + *err =3D mext_folio_mkuptodate(folio[0], from, from + replaced_size); if (*err) goto unlock_folios; =20 --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C4B8C2701CF; Tue, 23 Sep 2025 01:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590978; cv=none; b=o9oRynXAY87y4Hn0Lax7rBUtshH/06ZYnm+bnSozeW0PL+g8auXzzATOF9/aXGB+TyfS9GVeY8BC40nsA+l2HoinhWg/a6fIkkOFvi9tm3Za2gmjw08G8XCS7eadsFo8vF2IAUr6y9O3mmpHnmc3IVQMOyJ4pxCeEv8Omd84ZgA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590978; c=relaxed/simple; bh=pE5jLeNab5P/CHV7cctJJvtK+tfdOJYGsBtA8ilytsw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=DjSETd1KYFIW7ArEEVv5PnqPzW2d4M7lD/wJIzGW3cQGo8Na5NlJenFaDfu8IQdbwtIv6JKeCvLxwovGU8h1g7h3o6S0aM52F2Iw996Chgj/fzSI6m/WfvO4JXpo/r4BpaCywi1JzlB/oOhOdaar2RTS4AKfn8VRGCXJF0YD5bI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv4s6hzYQv4N; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id D00DE1A1041; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S14; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 10/13] ext4: introduce mext_move_extent() Date: Tue, 23 Sep 2025 09:27:20 +0800 Message-ID: <20250923012724.2378858-11-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S14 X-Coremail-Antispam: 1UD129KBjvJXoW3tFyDCF1fJrW3Wr18uw1rJFb_yoWDWF4DpF 
W2krn8JrWDGayI9r4Iyw48Zr1fKayxGr47AFWfW343ZFyUtry0gas5K3WUZFyrKrWxJFyF qF4Fyry7WayUAaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi

When moving extents, the current move_extent_per_page() process can only move extents of length PAGE_SIZE at a time, which is highly inefficient. Especially when the file is not severely fragmented, this results in a large number of unnecessary extent split and merge operations. Moreover, since the ext4 file system now supports large folios, using PAGE_SIZE as the processing unit is no longer practical.

Therefore, introduce a new move extents method, mext_move_extent(). It moves one extent of the origin inode at a time, but never more than the size of a folio.
The parameters for the move are passed through the new mext_data structure, which holds the origin inode, the donor inode, the mapping of the origin inode extent to be moved, and the starting logical block of the donor inode.

The move process is similar to move_extent_per_page() and falls into one of three types: MEXT_SKIP_EXTENT, MEXT_MOVE_EXTENT, and MEXT_COPY_DATA. MEXT_SKIP_EXTENT indicates that the corresponding area of the donor file is a hole or a delalloc extent, meaning no actual space is allocated, so the move is skipped. MEXT_MOVE_EXTENT indicates that the corresponding areas of both the origin and donor files are unwritten, so no data needs to be copied; only the extents are swapped. MEXT_COPY_DATA covers the remaining cases, in which the areas contain data that must be copied. The data copying is performed in three steps: first, the data from the original location is read into the page cache; then, the extents are swapped and the buffers of the folio are re-mapped to the new physical blocks; finally, the page cache is marked dirty and the range is attached to the transaction, so that the data is written to disk before the metadata is persisted.

One important point to note is that the folio lock and i_data_sem are held only during the moving process. Therefore, before moving an extent, it is necessary to check, while holding the folio lock, whether the sequence cookie of the area to be moved has changed. If a change is detected, concurrent write-back may have occurred in the meantime, and the type of the extent to be moved can no longer be trusted; for example, it may have changed from unwritten to written. In such cases, return -ESTALE, and the caller should look up the extent of the origin file again and retry the move.
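
For readers who want to reason about the classification and the staleness check outside the kernel, they can be modeled in userspace roughly as follows. This is only an illustrative sketch under stated assumptions: struct map_model, classify_move(), the MAP_* flags, and the enum names are hypothetical stand-ins rather than the kernel definitions, and no locking, journaling, or I/O is modeled.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-ins for the mapping flags; not the kernel values. */
#define MAP_MAPPED    (1u << 0)
#define MAP_UNWRITTEN (1u << 1)

enum move_type { SKIP_EXTENT, MOVE_EXTENT, COPY_DATA };

/* Simplified model of a block mapping (assumption, not ext4_map_blocks). */
struct map_model {
	uint32_t len;   /* length of the mapping in blocks */
	uint32_t flags; /* MAP_* flags */
	uint64_t seq;   /* sequence cookie sampled when the mapping was looked up */
};

/*
 * Model of the decision taken once both folios are locked.
 * cur_seq is the origin file's current sequence cookie.
 * Returns 0 and fills *type, or -ESTALE if the origin mapping is outdated.
 */
static int classify_move(struct map_model *orig, const struct map_model *donor,
			 uint64_t cur_seq, enum move_type *type)
{
	/* The mapping changed since lookup: the caller must look it up again. */
	if (orig->seq != cur_seq)
		return -ESTALE;

	/* Never move more blocks than the donor mapping covers. */
	if (donor->len < orig->len)
		orig->len = donor->len;

	if (!(donor->flags & (MAP_MAPPED | MAP_UNWRITTEN)))
		*type = SKIP_EXTENT;   /* donor range has no blocks allocated */
	else if ((orig->flags & MAP_UNWRITTEN) &&
		 (donor->flags & MAP_UNWRITTEN))
		*type = MOVE_EXTENT;   /* both unwritten: swap extents only */
	else
		*type = COPY_DATA;     /* data must be preserved across the swap */

	return 0;
}
```

A caller in this model would treat -ESTALE as "look up the origin mapping again and call classify_move() once more", which mirrors the retry behaviour described above.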
Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 216 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 216 insertions(+) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 5faa55109570..4edb9a378db7 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -13,6 +13,13 @@ #include "ext4.h" #include "ext4_extents.h" =20 +struct mext_data { + struct inode *orig_inode; /* Origin file inode */ + struct inode *donor_inode; /* Donor file inode */ + struct ext4_map_blocks orig_map;/* Origin file's move mapping */ + ext4_lblk_t donor_lblk; /* Start block of the donor file */ +}; + /** * get_ext_path() - Find an extent path for designated logical block numbe= r. * @inode: inode to be searched @@ -164,6 +171,14 @@ mext_folio_double_lock(struct inode *inode1, struct in= ode *inode2, return 0; } =20 +static void mext_folio_double_unlock(struct folio *folio[2]) +{ + folio_unlock(folio[0]); + folio_put(folio[0]); + folio_unlock(folio[1]); + folio_put(folio[1]); +} + /* Force folio buffers uptodate w/o dropping folio's lock */ static int mext_folio_mkuptodate(struct folio *folio, size_t from, size_t = to) { @@ -238,6 +253,207 @@ static int mext_folio_mkuptodate(struct folio *folio,= size_t from, size_t to) return 0; } =20 +enum mext_move_type {MEXT_SKIP_EXTENT, MEXT_MOVE_EXTENT, MEXT_COPY_DATA}; + +/* + * Start to move extent between the origin inode and the donor inode, + * hold one folio for each inode and check the candidate moving extent + * mapping status again. 
+ */ +static int mext_move_begin(struct mext_data *mext, struct folio *folio[2], + enum mext_move_type *move_type) +{ + struct inode *orig_inode =3D mext->orig_inode; + struct inode *donor_inode =3D mext->donor_inode; + unsigned int blkbits =3D orig_inode->i_blkbits; + struct ext4_map_blocks donor_map =3D {0}; + loff_t orig_pos, donor_pos; + size_t move_len; + int ret; + + orig_pos =3D ((loff_t)mext->orig_map.m_lblk) << blkbits; + donor_pos =3D ((loff_t)mext->donor_lblk) << blkbits; + ret =3D mext_folio_double_lock(orig_inode, donor_inode, + orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT, folio); + if (ret) + return ret; + + /* + * Check the origin inode's mapping information again under the + * folio lock, as we do not hold the i_data_sem at all times, and + * it may change during the concurrent write-back operation. + */ + if (mext->orig_map.m_seq !=3D READ_ONCE(EXT4_I(orig_inode)->i_es_seq)) { + ret =3D -ESTALE; + goto error; + } + + /* Adjust the moving length according to the minor folios length. */ + move_len =3D umin(folio_pos(folio[0]) + folio_size(folio[0]) - orig_pos, + folio_pos(folio[1]) + folio_size(folio[1]) - donor_pos); + move_len >>=3D blkbits; + if (move_len < mext->orig_map.m_len) + mext->orig_map.m_len =3D move_len; + + donor_map.m_lblk =3D mext->donor_lblk; + donor_map.m_len =3D mext->orig_map.m_len; + donor_map.m_flags =3D 0; + ret =3D ext4_map_blocks(NULL, donor_inode, &donor_map, 0); + if (ret < 0) + goto error; + + /* Adjust the moving length according to the donor mapping length. */ + mext->orig_map.m_len =3D donor_map.m_len; + + /* Skip moving if the donor range is a hole or a delalloc extent. */ + if (!(donor_map.m_flags & (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN))) + *move_type =3D MEXT_SKIP_EXTENT; + /* If both mapping ranges are unwritten, no need to copy data. 
*/ + else if ((mext->orig_map.m_flags & EXT4_MAP_UNWRITTEN) && + (donor_map.m_flags & EXT4_MAP_UNWRITTEN)) + *move_type =3D MEXT_MOVE_EXTENT; + else + *move_type =3D MEXT_COPY_DATA; + + return 0; +error: + mext_folio_double_unlock(folio); + return ret; +} + +/* + * Re-create the new moved mapping buffers of the original inode and commit + * the entire written range. + */ +static int mext_folio_mkwrite(struct inode *inode, struct folio *folio, + size_t from, size_t to) +{ + unsigned int blocksize =3D i_blocksize(inode); + struct buffer_head *bh, *head; + size_t block_start, block_end; + sector_t block; + int ret; + + head =3D folio_buffers(folio); + if (!head) + head =3D create_empty_buffers(folio, blocksize, 0); + + block =3D folio_pos(folio) >> inode->i_blkbits; + block_end =3D 0; + bh =3D head; + do { + block_start =3D block_end; + block_end =3D block_start + blocksize; + if (block_end <=3D from || block_start >=3D to) + continue; + + ret =3D ext4_get_block(inode, block, bh, 0); + if (ret) + return ret; + } while (block++, (bh =3D bh->b_this_page) !=3D head); + + block_commit_write(folio, from, to); + return 0; +} + +/* + * Save the data in original inode extent blocks and replace one folio size + * aligned original inode extent with one or one partial donor inode exten= t, + * and then write out the saved data in new original inode blocks. Pass out + * the replaced block count through m_len. Return 0 on success, and an err= or + * code otherwise. 
+ */ +static __used int mext_move_extent(struct mext_data *mext, u64 *m_len) +{ + struct inode *orig_inode =3D mext->orig_inode; + struct inode *donor_inode =3D mext->donor_inode; + struct ext4_map_blocks *orig_map =3D &mext->orig_map; + unsigned int blkbits =3D orig_inode->i_blkbits; + struct folio *folio[2] =3D {NULL, NULL}; + loff_t from, length; + enum mext_move_type move_type =3D 0; + handle_t *handle; + u64 r_len =3D 0; + unsigned int credits; + int ret, ret2; + + *m_len =3D 0; + credits =3D ext4_chunk_trans_extent(orig_inode, 0) * 2; + handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, credits); + if (IS_ERR(handle)) + return PTR_ERR(handle); + + ret =3D mext_move_begin(mext, folio, &move_type); + if (ret) + goto stop_handle; + + if (move_type =3D=3D MEXT_SKIP_EXTENT) + goto unlock; + + /* + * Copy the data. First, read the original inode data into the page + * cache. Then, release the existing mapping relationships and swap + * the extent. Finally, re-establish the new mapping relationships + * and dirty the page cache. 
+ */ + if (move_type =3D=3D MEXT_COPY_DATA) { + from =3D offset_in_folio(folio[0], + ((loff_t)orig_map->m_lblk) << blkbits); + length =3D ((loff_t)orig_map->m_len) << blkbits; + + ret =3D mext_folio_mkuptodate(folio[0], from, from + length); + if (ret) + goto unlock; + } + + if (!filemap_release_folio(folio[0], 0) || + !filemap_release_folio(folio[1], 0)) { + ret =3D -EBUSY; + goto unlock; + } + + /* Move extent */ + ext4_double_down_write_data_sem(orig_inode, donor_inode); + *m_len =3D ext4_swap_extents(handle, orig_inode, donor_inode, + orig_map->m_lblk, mext->donor_lblk, + orig_map->m_len, 1, &ret); + ext4_double_up_write_data_sem(orig_inode, donor_inode); + if (ret) + goto unlock; + + if (move_type =3D=3D MEXT_MOVE_EXTENT) + goto unlock; + + /* Copy data */ + length =3D (*m_len) << blkbits; + ret =3D mext_folio_mkwrite(orig_inode, folio[0], from, from + length); + if (ret) + goto repair_branches; + /* + * Even in case of data=3Dwriteback it is reasonable to pin + * inode to transaction, to prevent unexpected data loss. 
+ */ + ret =3D ext4_jbd2_inode_add_write(handle, orig_inode, + ((loff_t)orig_map->m_lblk) << blkbits, length); +unlock: + mext_folio_double_unlock(folio); +stop_handle: + ext4_journal_stop(handle); + return ret; + +repair_branches: + r_len =3D ext4_swap_extents(handle, donor_inode, orig_inode, + mext->donor_lblk, orig_map->m_lblk, + *m_len, 0, &ret2); + if (ret2 || r_len !=3D *m_len) { + ext4_error_inode_block(orig_inode, (sector_t)(orig_map->m_lblk), + EIO, "Unable to copy data block, data will be lost!"); + ret =3D -EIO; + } + *m_len =3D 0; + goto unlock; +} + /** * move_extent_per_page - Move extent data per page * --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AE3952459C5; Tue, 23 Sep 2025 01:29:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590975; cv=none; b=NW4ht3d/j8hZZD0uEtxQxqMrxZDcNLJLfwemqODcnzHOyy4dkVQH4w9ecpPi1SM4jox9XLAaLtvJYIjMQMocZ3NwwFWKj8JEWage5zn4G56y37mAY0Frb7d5Y5H6Rlfil7nPsuAO46aCLVia6lQ8o4PnMvSAAJwruNbRuixOaqQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590975; c=relaxed/simple; bh=LrKo2ZPNwXvpg8tQQTd6GkgsTqMFGqJZhV3pr2vi1vw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=j4Fz6UCO4XxZNpoAh0whSSMns1Ywdbr4tDWBhvxqzo7fNZed6SqJSvGoWjoCF74/TCkkIbGtzyluCEJBDjp0b1MpE1IT6sh6hGrVpTVhEp5yNOl7lLSv31PdPYjJQZGwr9aJqhg+RD3uV3tDlE0EcHoIGOBBhNLYuEu2/xuzoaI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none 
dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sw36fXzKHMjk; Tue, 23 Sep 2025 09:29:20 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id E40671A1AFF; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S15; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 11/13] ext4: switch to using the new extent movement method Date: Tue, 23 Sep 2025 09:27:21 +0800 Message-ID: <20250923012724.2378858-12-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S15 X-Coremail-Antispam: 1UD129KBjvAXoWfGw1UJry7Wr4UtFW5XF4DCFg_yoW8Wr4fXo WfCF4jqwn5Wr9Ig3ykKw10yFyUXan7Jw4rJrWrursrWFy3X3W5C39xG3Z7Ja43Xa1rKr15 Xa4xJ3WYyrZ7trn3n29KB7ZKAUJUUUU8529EdanIXcx71UUUUU7v73VFW2AGmfu7bjvjm3 AaLaJ3UjIYCTnIWjp_UUUOV7AC8VAFwI0_Wr0E3s1l1xkIjI8I6I8E6xAIw20EY4v20xva j40_Wr0E3s1l1IIY67AEw4v_Jr0_Jr4l82xGYIkIc2x26280x7IE14v26r126s0DM28Irc Ia0xkI8VCY1x0267AKxVW5JVCq3wA2ocxC64kIII0Yj41l84x0c7CEw4AK67xGY2AK021l 84ACjcxK6xIIjxv20xvE14v26w1j6s0DM28EF7xvwVC0I7IYx2IY6xkF7I0E14v26r4UJV 
WxJr1l84ACjcxK6I8E87Iv67AKxVW0oVCq3wA2z4x0Y4vEx4A2jsIEc7CjxVAFwI0_GcCE 3s1le2I262IYc4CY6c8Ij28IcVAaY2xG8wAqx4xG64xvF2IEw4CE5I8CrVC2j2WlYx0E2I x0cI8IcVAFwI0_JrI_JrylYx0Ex4A2jsIE14v26r1j6r4UMcvjeVCFs4IE7xkEbVWUJVW8 JwACjcxG0xvY0x0EwIxGrwACjI8F5VA0II8E6IAqYI8I648v4I1lFIxGxcIEc7CjxVA2Y2 ka0xkIwI1lc7CjxVAaw2AFwI0_Jw0_GFyl42xK82IYc2Ij64vIr41l4I8I3I0E4IkC6x0Y z7v_Jr0_Gr1lx2IqxVAqx4xG67AKxVWUJVWUGwC20s026x8GjcxK67AKxVWUGVWUWwC2zV AF1VAY17CE14v26r1q6r43MIIYrxkI7VAKI48JMIIF0xvE2Ix0cI8IcVAFwI0_Gr0_Xr1l IxAIcVC0I7IYx2IY6xkF7I0E14v26r4UJVWxJr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r 1xMIIF0xvEx4A2jsIE14v26r4j6F4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr1j6F4UJbIY CTnIWIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi

Now that we have mext_move_extent(), we can switch to this new interface and remove move_extent_per_page(). First, after acquiring the i_rwsem, we can directly use ext4_map_blocks() to obtain a contiguous extent from the original inode as the extent to be moved. It is safe to get the mapping information from the extent status tree without accessing the on-disk extent tree, because mext_move_extent() will check the sequence cookie under the folio lock. Then, after populating the mext_data structure, we call mext_move_extent() to move the extent. Finally, the length of the extent is adjusted in mext.orig_map.m_len and the actual length moved is returned through m_len.

Signed-off-by: Zhang Yi
--- fs/ext4/move_extent.c | 386 +++++------------------------------------- 1 file changed, 42 insertions(+), 344 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 4edb9a378db7..b478631e243c 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -20,29 +20,6 @@ struct mext_data { ext4_lblk_t donor_lblk; /* Start block of the donor file */ }; =20 -/** - * get_ext_path() - Find an extent path for designated logical block numbe= r.
- * @inode: inode to be searched - * @lblock: logical block number to find an extent path - * @path: pointer to an extent path - * - * ext4_find_extent wrapper. Return an extent path pointer on success, - * or an error pointer on failure. - */ -static inline struct ext4_ext_path * -get_ext_path(struct inode *inode, ext4_lblk_t lblock, - struct ext4_ext_path *path) -{ - path =3D ext4_find_extent(inode, lblock, path, EXT4_EX_NOCACHE); - if (IS_ERR(path)) - return path; - if (path[ext_depth(inode)].p_ext =3D=3D NULL) { - ext4_free_ext_path(path); - return ERR_PTR(-ENODATA); - } - return path; -} - /** * ext4_double_down_write_data_sem() - write lock two inodes's i_data_sem * @first: inode to be locked @@ -59,7 +36,6 @@ ext4_double_down_write_data_sem(struct inode *first, stru= ct inode *second) } else { down_write(&EXT4_I(second)->i_data_sem); down_write_nested(&EXT4_I(first)->i_data_sem, I_DATA_SEM_OTHER); - } } =20 @@ -78,42 +54,6 @@ ext4_double_up_write_data_sem(struct inode *orig_inode, up_write(&EXT4_I(donor_inode)->i_data_sem); } =20 -/** - * mext_check_coverage - Check that all extents in range has the same type - * - * @inode: inode in question - * @from: block offset of inode - * @count: block count to be checked - * @unwritten: extents expected to be unwritten - * @err: pointer to save error value - * - * Return 1 if all extents in range has expected type, and zero otherwise. 
- */ -static int -mext_check_coverage(struct inode *inode, ext4_lblk_t from, ext4_lblk_t cou= nt, - int unwritten, int *err) -{ - struct ext4_ext_path *path =3D NULL; - struct ext4_extent *ext; - int ret =3D 0; - ext4_lblk_t last =3D from + count; - while (from < last) { - path =3D get_ext_path(inode, from, path); - if (IS_ERR(path)) { - *err =3D PTR_ERR(path); - return ret; - } - ext =3D path[ext_depth(inode)].p_ext; - if (unwritten !=3D ext4_ext_is_unwritten(ext)) - goto out; - from +=3D ext4_ext_get_actual_len(ext); - } - ret =3D 1; -out: - ext4_free_ext_path(path); - return ret; -} - /** * mext_folio_double_lock - Grab and lock folio on both @inode1 and @inode2 * @@ -363,7 +303,7 @@ static int mext_folio_mkwrite(struct inode *inode, stru= ct folio *folio, * the replaced block count through m_len. Return 0 on success, and an err= or * code otherwise. */ -static __used int mext_move_extent(struct mext_data *mext, u64 *m_len) +static int mext_move_extent(struct mext_data *mext, u64 *m_len) { struct inode *orig_inode =3D mext->orig_inode; struct inode *donor_inode =3D mext->donor_inode; @@ -454,210 +394,6 @@ static __used int mext_move_extent(struct mext_data *= mext, u64 *m_len) goto unlock; } =20 -/** - * move_extent_per_page - Move extent data per page - * - * @o_filp: file structure of original file - * @donor_inode: donor inode - * @orig_page_offset: page index on original file - * @donor_page_offset: page index on donor file - * @data_offset_in_page: block index where data swapping starts - * @block_len_in_page: the number of blocks to be swapped - * @unwritten: orig extent is unwritten or not - * @err: pointer to save return value - * - * Save the data in original inode blocks and replace original inode exten= ts - * with donor inode extents by calling ext4_swap_extents(). - * Finally, write out the saved data in new original inode blocks. Return - * replaced block count. 
- */ -static int -move_extent_per_page(struct file *o_filp, struct inode *donor_inode, - pgoff_t orig_page_offset, pgoff_t donor_page_offset, - int data_offset_in_page, - int block_len_in_page, int unwritten, int *err) -{ - struct inode *orig_inode =3D file_inode(o_filp); - struct folio *folio[2] =3D {NULL, NULL}; - handle_t *handle; - ext4_lblk_t orig_blk_offset, donor_blk_offset; - unsigned long blocksize =3D orig_inode->i_sb->s_blocksize; - unsigned int tmp_data_size, data_size, replaced_size; - int i, err2, jblocks, retries =3D 0; - int replaced_count =3D 0; - int from; - int blocks_per_page =3D PAGE_SIZE >> orig_inode->i_blkbits; - struct super_block *sb =3D orig_inode->i_sb; - struct buffer_head *bh =3D NULL; - - /* - * It needs twice the amount of ordinary journal buffers because - * inode and donor_inode may change each different metadata blocks. - */ -again: - *err =3D 0; - jblocks =3D ext4_meta_trans_blocks(orig_inode, block_len_in_page, - block_len_in_page) * 2; - handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, jblocks); - if (IS_ERR(handle)) { - *err =3D PTR_ERR(handle); - return 0; - } - - orig_blk_offset =3D orig_page_offset * blocks_per_page + - data_offset_in_page; - - donor_blk_offset =3D donor_page_offset * blocks_per_page + - data_offset_in_page; - - /* Calculate data_size */ - if ((orig_blk_offset + block_len_in_page - 1) =3D=3D - ((orig_inode->i_size - 1) >> orig_inode->i_blkbits)) { - /* Replace the last block */ - tmp_data_size =3D orig_inode->i_size & (blocksize - 1); - /* - * If data_size equal zero, it shows data_size is multiples of - * blocksize. So we set appropriate value. 
- */ - if (tmp_data_size =3D=3D 0) - tmp_data_size =3D blocksize; - - data_size =3D tmp_data_size + - ((block_len_in_page - 1) << orig_inode->i_blkbits); - } else - data_size =3D block_len_in_page << orig_inode->i_blkbits; - - replaced_size =3D data_size; - - *err =3D mext_folio_double_lock(orig_inode, donor_inode, orig_page_offset, - donor_page_offset, folio); - if (unlikely(*err < 0)) - goto stop_journal; - /* - * If orig extent was unwritten it can become initialized - * at any time after i_data_sem was dropped, in order to - * serialize with delalloc we have recheck extent while we - * hold page's lock, if it is still the case data copy is not - * necessary, just swap data blocks between orig and donor. - */ - if (unwritten) { - ext4_double_down_write_data_sem(orig_inode, donor_inode); - /* If any of extents in range became initialized we have to - * fallback to data copying */ - unwritten =3D mext_check_coverage(orig_inode, orig_blk_offset, - block_len_in_page, 1, err); - if (*err) - goto drop_data_sem; - - unwritten &=3D mext_check_coverage(donor_inode, donor_blk_offset, - block_len_in_page, 1, err); - if (*err) - goto drop_data_sem; - - if (!unwritten) { - ext4_double_up_write_data_sem(orig_inode, donor_inode); - goto data_copy; - } - if (!filemap_release_folio(folio[0], 0) || - !filemap_release_folio(folio[1], 0)) { - *err =3D -EBUSY; - goto drop_data_sem; - } - replaced_count =3D ext4_swap_extents(handle, orig_inode, - donor_inode, orig_blk_offset, - donor_blk_offset, - block_len_in_page, 1, err); - drop_data_sem: - ext4_double_up_write_data_sem(orig_inode, donor_inode); - goto unlock_folios; - } -data_copy: - from =3D offset_in_folio(folio[0], - orig_blk_offset << orig_inode->i_blkbits); - *err =3D mext_folio_mkuptodate(folio[0], from, from + replaced_size); - if (*err) - goto unlock_folios; - - /* At this point all buffers in range are uptodate, old mapping layout - * is no longer required, try to drop it now. 
*/ - if (!filemap_release_folio(folio[0], 0) || - !filemap_release_folio(folio[1], 0)) { - *err =3D -EBUSY; - goto unlock_folios; - } - ext4_double_down_write_data_sem(orig_inode, donor_inode); - replaced_count =3D ext4_swap_extents(handle, orig_inode, donor_inode, - orig_blk_offset, donor_blk_offset, - block_len_in_page, 1, err); - ext4_double_up_write_data_sem(orig_inode, donor_inode); - if (*err) { - if (replaced_count) { - block_len_in_page =3D replaced_count; - replaced_size =3D - block_len_in_page << orig_inode->i_blkbits; - } else - goto unlock_folios; - } - /* Perform all necessary steps similar write_begin()/write_end() - * but keeping in mind that i_size will not change */ - bh =3D folio_buffers(folio[0]); - if (!bh) - bh =3D create_empty_buffers(folio[0], - 1 << orig_inode->i_blkbits, 0); - for (i =3D 0; i < from >> orig_inode->i_blkbits; i++) - bh =3D bh->b_this_page; - for (i =3D 0; i < block_len_in_page; i++) { - *err =3D ext4_get_block(orig_inode, orig_blk_offset + i, bh, 0); - if (*err < 0) - goto repair_branches; - bh =3D bh->b_this_page; - } - - block_commit_write(folio[0], from, from + replaced_size); - - /* Even in case of data=3Dwriteback it is reasonable to pin - * inode to transaction, to prevent unexpected data loss */ - *err =3D ext4_jbd2_inode_add_write(handle, orig_inode, - (loff_t)orig_page_offset << PAGE_SHIFT, replaced_size); - -unlock_folios: - folio_unlock(folio[0]); - folio_put(folio[0]); - folio_unlock(folio[1]); - folio_put(folio[1]); -stop_journal: - ext4_journal_stop(handle); - if (*err =3D=3D -ENOSPC && - ext4_should_retry_alloc(sb, &retries)) - goto again; - /* Buffer was busy because probably is pinned to journal transaction, - * force transaction commit may help to free it. */ - if (*err =3D=3D -EBUSY && retries++ < 4 && EXT4_SB(sb)->s_journal && - jbd2_journal_force_commit_nested(EXT4_SB(sb)->s_journal)) - goto again; - return replaced_count; - -repair_branches: - /* - * This should never ever happen! 
- * Extents are swapped already, but we are not able to copy data. - * Try to swap extents to it's original places - */ - ext4_double_down_write_data_sem(orig_inode, donor_inode); - replaced_count =3D ext4_swap_extents(handle, donor_inode, orig_inode, - orig_blk_offset, donor_blk_offset, - block_len_in_page, 0, &err2); - ext4_double_up_write_data_sem(orig_inode, donor_inode); - if (replaced_count !=3D block_len_in_page) { - ext4_error_inode_block(orig_inode, (sector_t)(orig_blk_offset), - EIO, "Unable to copy data block," - " data will be lost."); - *err =3D -EIO; - } - replaced_count =3D 0; - goto unlock_folios; -} - /* * Check the validity of the basic filesystem environment and the * inodes' support status. @@ -819,106 +555,72 @@ static int mext_check_adjust_range(struct inode *ori= g_inode, * * This function returns 0 and moved block length is set in moved_len * if succeed, otherwise returns error value. - * */ -int -ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk, - __u64 donor_blk, __u64 len, __u64 *moved_len) +int ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig= _blk, + __u64 donor_blk, __u64 len, __u64 *moved_len) { struct inode *orig_inode =3D file_inode(o_filp); struct inode *donor_inode =3D file_inode(d_filp); - struct ext4_ext_path *path =3D NULL; - int blocks_per_page =3D PAGE_SIZE >> orig_inode->i_blkbits; - ext4_lblk_t o_end, o_start =3D orig_blk; - ext4_lblk_t d_start =3D donor_blk; + struct mext_data mext; + struct super_block *sb =3D orig_inode->i_sb; + struct ext4_sb_info *sbi =3D EXT4_SB(sb); + int retries =3D 0; + u64 m_len; int ret; =20 + *moved_len =3D 0; + /* Protect orig and donor inodes against a truncate */ lock_two_nondirectories(orig_inode, donor_inode); =20 ret =3D mext_check_validity(orig_inode, donor_inode); if (ret) - goto unlock; + goto out; =20 /* Wait for all existing dio workers */ inode_dio_wait(orig_inode); inode_dio_wait(donor_inode); =20 - /* Protect extent tree against block 
allocations via delalloc */ - ext4_double_down_write_data_sem(orig_inode, donor_inode); /* Check and adjust the specified move_extent range. */ ret =3D mext_check_adjust_range(orig_inode, donor_inode, orig_blk, donor_blk, &len); if (ret) goto out; - o_end =3D o_start + len; =20 - *moved_len =3D 0; - while (o_start < o_end) { - struct ext4_extent *ex; - ext4_lblk_t cur_blk, next_blk; - pgoff_t orig_page_index, donor_page_index; - int offset_in_page; - int unwritten, cur_len; - - path =3D get_ext_path(orig_inode, o_start, path); - if (IS_ERR(path)) { - ret =3D PTR_ERR(path); + mext.orig_inode =3D orig_inode; + mext.donor_inode =3D donor_inode; + while (len) { + mext.orig_map.m_lblk =3D orig_blk; + mext.orig_map.m_len =3D len; + mext.orig_map.m_flags =3D 0; + mext.donor_lblk =3D donor_blk; + + ret =3D ext4_map_blocks(NULL, orig_inode, &mext.orig_map, 0); + if (ret < 0) goto out; - } - ex =3D path[path->p_depth].p_ext; - cur_blk =3D le32_to_cpu(ex->ee_block); - cur_len =3D ext4_ext_get_actual_len(ex); - /* Check hole before the start pos */ - if (cur_blk + cur_len - 1 < o_start) { - next_blk =3D ext4_ext_next_allocated_block(path); - if (next_blk =3D=3D EXT_MAX_BLOCKS) { - ret =3D -ENODATA; - goto out; - } - d_start +=3D next_blk - o_start; - o_start =3D next_blk; - continue; - /* Check hole after the start pos */ - } else if (cur_blk > o_start) { - /* Skip hole */ - d_start +=3D cur_blk - o_start; - o_start =3D cur_blk; - /* Extent inside requested range ?*/ - if (cur_blk >=3D o_end) + + /* Skip moving if it is a hole or a delalloc extent. 
*/ + if (mext.orig_map.m_flags & + (EXT4_MAP_MAPPED | EXT4_MAP_UNWRITTEN)) { + ret =3D mext_move_extent(&mext, &m_len); + if (ret =3D=3D -ESTALE) + continue; + if (ret =3D=3D -ENOSPC && + ext4_should_retry_alloc(sb, &retries)) + continue; + if (ret =3D=3D -EBUSY && + sbi->s_journal && retries++ < 4 && + jbd2_journal_force_commit_nested(sbi->s_journal)) + continue; + if (ret) goto out; - } else { /* in_range(o_start, o_blk, o_len) */ - cur_len +=3D cur_blk - o_start; + + *moved_len +=3D m_len; + retries =3D 0; } - unwritten =3D ext4_ext_is_unwritten(ex); - if (o_end - o_start < cur_len) - cur_len =3D o_end - o_start; - - orig_page_index =3D o_start >> (PAGE_SHIFT - - orig_inode->i_blkbits); - donor_page_index =3D d_start >> (PAGE_SHIFT - - donor_inode->i_blkbits); - offset_in_page =3D o_start % blocks_per_page; - if (cur_len > blocks_per_page - offset_in_page) - cur_len =3D blocks_per_page - offset_in_page; - /* - * Up semaphore to avoid following problems: - * a. transaction deadlock among ext4_journal_start, - * ->write_begin via pagefault, and jbd2_journal_commit - * b. 
racing with ->read_folio, ->write_begin, and - * ext4_get_block in move_extent_per_page - */ - ext4_double_up_write_data_sem(orig_inode, donor_inode); - /* Swap original branches with new branches */ - *moved_len +=3D move_extent_per_page(o_filp, donor_inode, - orig_page_index, donor_page_index, - offset_in_page, cur_len, - unwritten, &ret); - ext4_double_down_write_data_sem(orig_inode, donor_inode); - if (ret < 0) - break; - o_start +=3D cur_len; - d_start +=3D cur_len; + orig_blk +=3D mext.orig_map.m_len; + donor_blk +=3D mext.orig_map.m_len; + len -=3D mext.orig_map.m_len; } =20 out: @@ -927,10 +629,6 @@ ext4_move_extents(struct file *o_filp, struct file *d_= filp, __u64 orig_blk, ext4_discard_preallocations(donor_inode); } =20 - ext4_free_ext_path(path); - ext4_double_up_write_data_sem(orig_inode, donor_inode); -unlock: unlock_two_nondirectories(orig_inode, donor_inode); - return ret; } --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9ECF526F296; Tue, 23 Sep 2025 01:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590977; cv=none; b=mBYnBR78xSovoQXnnkaWQ7TsUks/2zZehjSN4aVwdhUPn/26EsDkglrTSrLP+76wPPIERUZb4VmtqpqwsGIgarg1NkobDnl9tpbI2zgh7r7Rlw3NzyRqH593lLptmtrjFAAPu9LlC7vkN+NFNV9qz9y+S6rXsw6aM+dibAtncuA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590977; c=relaxed/simple; bh=CyYQn+cOs12tN7/PlpInkSiOd/5MvYS7xR1XrqzTHRc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=Y/prZgVsCCfz+QayFwxSY+v75d58EaPQzBjMqb8ZasFAoL4ftuJ7VW9+AlZ4u+utR8wLIIfLHCFr2xmBds7OynImUbWU5sq6tWSPkZXhJHwCoeWXhSxOD+n3XPDObfjpWKEAqXSAK+HHnXnIX3QiOiJ0B+vG6ANtJ1ZhCGhnZtw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv5lxLzYQv4h; Tue, 23 Sep 2025 09:29:19 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id F0DD21A1B01; Tue, 23 Sep 2025 09:29:24 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S16; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 12/13] ext4: add large folios support for moving extents Date: Tue, 23 Sep 2025 09:27:22 +0800 Message-ID: <20250923012724.2378858-13-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S16 X-Coremail-Antispam: 1UD129KBjvJXoWxGF1xWrWrJr1rGFWUKrWruFg_yoW5Xw43pF 
1xKan3tFWkX34I9ry0qay7Zr15Ka4xtr4UWF4fJw1SyFyqvryIgr1jy3WxZFyrtrW8ArWF qF4SkryUWa1Dt3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Pass the moving extent length into mext_folio_double_lock() so that it can acquire a higher-order folio if the length exceeds PAGE_SIZE. This can speed up extent moving when the extent is larger than one page. Additionally, remove the unnecessary comments from mext_folio_double_lock(). 
Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 27 ++++++++++----------------- 1 file changed, 10 insertions(+), 17 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index b478631e243c..c15294ce2aab 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -54,23 +54,14 @@ ext4_double_up_write_data_sem(struct inode *orig_inode, up_write(&EXT4_I(donor_inode)->i_data_sem); } =20 -/** - * mext_folio_double_lock - Grab and lock folio on both @inode1 and @inode2 - * - * @inode1: the inode structure - * @inode2: the inode structure - * @index1: folio index - * @index2: folio index - * @folio: result folio vector - * - * Grab two locked folio for inode's by inode order - */ -static int -mext_folio_double_lock(struct inode *inode1, struct inode *inode2, - pgoff_t index1, pgoff_t index2, struct folio *folio[2]) +/* Grab and lock folio on both @inode1 and @inode2 by inode order. */ +static int mext_folio_double_lock(struct inode *inode1, struct inode *inod= e2, + pgoff_t index1, pgoff_t index2, size_t len, + struct folio *folio[2]) { struct address_space *mapping[2]; unsigned int flags; + fgf_t fgp_flags =3D FGP_WRITEBEGIN; =20 BUG_ON(!inode1 || !inode2); if (inode1 < inode2) { @@ -83,14 +74,15 @@ mext_folio_double_lock(struct inode *inode1, struct ino= de *inode2, } =20 flags =3D memalloc_nofs_save(); - folio[0] =3D __filemap_get_folio(mapping[0], index1, FGP_WRITEBEGIN, + fgp_flags |=3D fgf_set_order(len); + folio[0] =3D __filemap_get_folio(mapping[0], index1, fgp_flags, mapping_gfp_mask(mapping[0])); if (IS_ERR(folio[0])) { memalloc_nofs_restore(flags); return PTR_ERR(folio[0]); } =20 - folio[1] =3D __filemap_get_folio(mapping[1], index2, FGP_WRITEBEGIN, + folio[1] =3D __filemap_get_folio(mapping[1], index2, fgp_flags, mapping_gfp_mask(mapping[1])); memalloc_nofs_restore(flags); if (IS_ERR(folio[1])) { @@ -214,7 +206,8 @@ static int mext_move_begin(struct mext_data *mext, stru= ct folio *folio[2], orig_pos =3D 
((loff_t)mext->orig_map.m_lblk) << blkbits; donor_pos =3D ((loff_t)mext->donor_lblk) << blkbits; ret =3D mext_folio_double_lock(orig_inode, donor_inode, - orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT, folio); + orig_pos >> PAGE_SHIFT, donor_pos >> PAGE_SHIFT, + mext->orig_map.m_len << blkbits, folio); if (ret) return ret; =20 --=20 2.46.1 From nobody Thu Oct 2 03:28:36 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C3C9D2701CC; Tue, 23 Sep 2025 01:29:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590978; cv=none; b=TACv1PdBCL8qIdvpwD2Fq+rKjXe2BMGt4irLTSAKA+oEYvBKKXnE+afRgcb1xiZQVmKnjU7NPD1G/9kwjYIZjmDneDUgsXktlRUM31/ogtevzbs1twXuHVSDjUsjVqBowUD9BXALzUNEHcnX2sTT4i2H+N2FySGoaoYMIdki1Q0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1758590978; c=relaxed/simple; bh=iYYIRGCB7NPJVlkdCLHGL2lRgfqtqHxpKMJU6D74P5E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Vum/uILT4r8N64c/RwzZXlsky3/GYRsGG+IVqpgQC04NCM6JtS6lZALNw09cGZX80ZHI8eZ5+y2y53jFvt0rw2fy83VPugkb93WIvZb6GJkfON1pr4tFFWBMkqWthzsY9Egchk+sP83/KrDBnsNN36A1S23cK06/DuRAOUkO/C0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4cW2Sv6LR2zYQv3t; Tue, 23 Sep 2025 09:29:19 +0800 
(CST) Received: from mail02.huawei.com (unknown [10.116.40.128]) by mail.maildlp.com (Postfix) with ESMTP id 0F02C1A104C; Tue, 23 Sep 2025 09:29:25 +0800 (CST) Received: from huaweicloud.com (unknown [10.50.85.155]) by APP4 (Coremail) with SMTP id gCh0CgAXKWHq99FoGYYGAg--.10941S17; Tue, 23 Sep 2025 09:29:24 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH 13/13] ext4: add two trace points for moving extents Date: Tue, 23 Sep 2025 09:27:23 +0800 Message-ID: <20250923012724.2378858-14-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> References: <20250923012724.2378858-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: gCh0CgAXKWHq99FoGYYGAg--.10941S17 X-Coremail-Antispam: 1UD129KBjvJXoWxCFyDAF4DAF1kWFykWFykZrb_yoWruw15pF n7AFy5K3ykXaya934xCw48Zr45ua4IkrWUKrySg343XayxtrnFgr4kta1jyF9YyrW8Kryf XFWjyryDKa45W3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE 
bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVW8JVW5JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Gr0_Cr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi To facilitate tracking the length, type, and outcome of the move extent, add a trace point at both the entry and exit of mext_move_extent(). Signed-off-by: Zhang Yi --- fs/ext4/move_extent.c | 14 ++++++- include/trace/events/ext4.h | 74 +++++++++++++++++++++++++++++++++++++ 2 files changed, 86 insertions(+), 2 deletions(-) diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index c15294ce2aab..3ea616b0e929 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -13,6 +13,8 @@ #include "ext4.h" #include "ext4_extents.h" =20 +#include + struct mext_data { struct inode *orig_inode; /* Origin file inode */ struct inode *donor_inode; /* Donor file inode */ @@ -311,10 +313,14 @@ static int mext_move_extent(struct mext_data *mext, u= 64 *m_len) int ret, ret2; =20 *m_len =3D 0; + trace_ext4_move_extent_enter(orig_inode, orig_map, donor_inode, + mext->donor_lblk); credits =3D ext4_chunk_trans_extent(orig_inode, 0) * 2; handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, credits); - if (IS_ERR(handle)) - return PTR_ERR(handle); + if (IS_ERR(handle)) { + ret =3D PTR_ERR(handle); + goto out; + } =20 ret =3D mext_move_begin(mext, folio, &move_type); if (ret) @@ -372,6 +378,10 @@ static int mext_move_extent(struct mext_data *mext, u6= 4 *m_len) mext_folio_double_unlock(folio); stop_handle: ext4_journal_stop(handle); +out: + trace_ext4_move_extent_exit(orig_inode, orig_map->m_lblk, donor_inode, + mext->donor_lblk, orig_map->m_len, *m_len, + move_type, ret); return ret; =20 repair_branches: diff --git a/include/trace/events/ext4.h 
b/include/trace/events/ext4.h index 6a0754d38acf..a05bdd48e16e 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -3016,6 +3016,80 @@ TRACE_EVENT(ext4_update_sb, __entry->fsblk, __entry->flags) ); =20 +TRACE_EVENT(ext4_move_extent_enter, + TP_PROTO(struct inode *orig_inode, struct ext4_map_blocks *orig_map, + struct inode *donor_inode, ext4_lblk_t donor_lblk), + + TP_ARGS(orig_inode, orig_map, donor_inode, donor_lblk), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, orig_ino) + __field(ext4_lblk_t, orig_lblk) + __field(unsigned int, orig_flags) + __field(ino_t, donor_ino) + __field(ext4_lblk_t, donor_lblk) + __field(unsigned int, len) + ), + + TP_fast_assign( + __entry->dev =3D orig_inode->i_sb->s_dev; + __entry->orig_ino =3D orig_inode->i_ino; + __entry->orig_lblk =3D orig_map->m_lblk; + __entry->orig_flags =3D orig_map->m_flags; + __entry->donor_ino =3D donor_inode->i_ino; + __entry->donor_lblk =3D donor_lblk; + __entry->len =3D orig_map->m_len; + ), + + TP_printk("dev %d,%d origin ino %lu lblk %u flags %s donor ino %lu lblk %= u len %u", + MAJOR(__entry->dev), MINOR(__entry->dev), + (unsigned long) __entry->orig_ino, __entry->orig_lblk, + show_mflags(__entry->orig_flags), + (unsigned long) __entry->donor_ino, __entry->donor_lblk, + __entry->len) +); + +TRACE_EVENT(ext4_move_extent_exit, + TP_PROTO(struct inode *orig_inode, ext4_lblk_t orig_lblk, + struct inode *donor_inode, ext4_lblk_t donor_lblk, + unsigned int m_len, u64 move_len, int move_type, int ret), + + TP_ARGS(orig_inode, orig_lblk, donor_inode, donor_lblk, m_len, + move_len, move_type, ret), + + TP_STRUCT__entry( + __field(dev_t, dev) + __field(ino_t, orig_ino) + __field(ext4_lblk_t, orig_lblk) + __field(ino_t, donor_ino) + __field(ext4_lblk_t, donor_lblk) + __field(unsigned int, m_len) + __field(u64, move_len) + __field(int, move_type) + __field(int, ret) + ), + + TP_fast_assign( + __entry->dev =3D orig_inode->i_sb->s_dev; + __entry->orig_ino =3D 
orig_inode->i_ino; + __entry->orig_lblk =3D orig_lblk; + __entry->donor_ino =3D donor_inode->i_ino; + __entry->donor_lblk =3D donor_lblk; + __entry->m_len =3D m_len; + __entry->move_len =3D move_len; + __entry->move_type =3D move_type; + __entry->ret =3D ret; + ), + + TP_printk("dev %d,%d origin ino %lu lblk %u donor ino %lu lblk %u m_len %= u, move_len %llu type %d ret %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + (unsigned long) __entry->orig_ino, __entry->orig_lblk, + (unsigned long) __entry->donor_ino, __entry->donor_lblk, + __entry->m_len, __entry->move_len, __entry->move_type, + __entry->ret) +); + #endif /* _TRACE_EXT4_H */ =20 /* This part must be outside protection */ --=20 2.46.1