From nobody Mon Dec 15 21:27:09 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1C5DA757EA; Tue, 1 Jul 2025 13:20:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376063; cv=none; b=YCUfkngB/ttLr54L9yzI0i4IRUY4QFh/z4ojf2b/Klzasu7ldF9VlzKTQTDunUFPm7iFIv3474n37L//g1C6oKiwwhjomiVSsB5cE7E8tXbOYtZ4IQxCtDOiR5LPx0OpjPKzlFvv/7fVIygOp0JNPRdkLQ1FFoa+iO8GprLSw3Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376063; c=relaxed/simple; bh=OV5rV1yeBFvh08AXAiy+qNBlEV2EciJ3wZtfVB5PD6Y=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=An8DPJ+AhTHlm4OXzc1zC1Fh5zsxuHbKwCdlgh+A66GSoLgTgo1mfY2jW0hnRB7XlLTzFJOe8318YvUau95VwBQiUUK0k0gTuvIqxPWyxigHZUzAiF7KWEU6lzk3RaIlPKtFbg/vQW9Cpj/hQwnWsY4PWzQdQ7orAjnxYfvKBh8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bWkDq0q3lzKHN4f; Tue, 1 Jul 2025 21:20:59 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 7ED431A121F; Tue, 1 Jul 2025 21:20:57 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgAXeCWu4GNoXmJGAQ--.26904S5; Tue, 01 Jul 2025 21:20:57 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 01/10] ext4: process folios writeback in bytes Date: Tue, 1 Jul 2025 21:06:26 +0800 Message-ID: <20250701130635.4079595-2-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgAXeCWu4GNoXmJGAQ--.26904S5 X-Coremail-Antispam: 1UD129KBjvJXoW3XF1rWF4DZFWxuFWkGr4UArb_yoWfGw17pF WUKF909r4kX3yjgFn3ZFZrZr10k34xAr48tFy3WanIqF1Ykr18KFyjqFyqvFy5KrZ2vrWx XF4Yyry8WF1xJFJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 
Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0JUfKs8UUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Since ext4 supports large folios, processing writebacks in pages is no longer appropriate, it can be modified to process writebacks in bytes. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 70 +++++++++++++++++++------------------ include/trace/events/ext4.h | 13 ++++--- 2 files changed, 42 insertions(+), 41 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index be9a4cba35fd..ba81df0d87dd 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1667,11 +1667,12 @@ struct mpage_da_data { unsigned int can_map:1; /* Can writepages call map blocks? */ =20 /* These are internal state of ext4_do_writepages() */ - pgoff_t first_page; /* The first page to write */ - pgoff_t next_page; /* Current page to examine */ - pgoff_t last_page; /* Last page to examine */ + loff_t start_pos; /* The start pos to write */ + loff_t next_pos; /* Current pos to examine */ + loff_t end_pos; /* Last pos to examine */ + /* - * Extent to map - this can be after first_page because that can be + * Extent to map - this can be after start_pos because that can be * fully mapped. We somewhat abuse m_flags to store whether the extent * is delalloc or unwritten. */ @@ -1691,38 +1692,38 @@ static void mpage_release_unused_pages(struct mpage= _da_data *mpd, struct inode *inode =3D mpd->inode; struct address_space *mapping =3D inode->i_mapping; =20 - /* This is necessary when next_page =3D=3D 0. */ - if (mpd->first_page >=3D mpd->next_page) + /* This is necessary when next_pos =3D=3D 0. 
*/ + if (mpd->start_pos >=3D mpd->next_pos) return; =20 mpd->scanned_until_end =3D 0; - index =3D mpd->first_page; - end =3D mpd->next_page - 1; if (invalidate) { ext4_lblk_t start, last; - start =3D index << (PAGE_SHIFT - inode->i_blkbits); - last =3D end << (PAGE_SHIFT - inode->i_blkbits); + start =3D EXT4_B_TO_LBLK(inode, mpd->start_pos); + last =3D mpd->next_pos >> inode->i_blkbits; =20 /* * avoid racing with extent status tree scans made by * ext4_insert_delayed_block() */ down_write(&EXT4_I(inode)->i_data_sem); - ext4_es_remove_extent(inode, start, last - start + 1); + ext4_es_remove_extent(inode, start, last - start); up_write(&EXT4_I(inode)->i_data_sem); } =20 folio_batch_init(&fbatch); - while (index <=3D end) { - nr =3D filemap_get_folios(mapping, &index, end, &fbatch); + index =3D mpd->start_pos >> PAGE_SHIFT; + end =3D mpd->next_pos >> PAGE_SHIFT; + while (index < end) { + nr =3D filemap_get_folios(mapping, &index, end - 1, &fbatch); if (nr =3D=3D 0) break; for (i =3D 0; i < nr; i++) { struct folio *folio =3D fbatch.folios[i]; =20 - if (folio->index < mpd->first_page) + if (folio_pos(folio) < mpd->start_pos) continue; - if (folio_next_index(folio) - 1 > end) + if (folio_next_index(folio) > end) continue; BUG_ON(!folio_test_locked(folio)); BUG_ON(folio_test_writeback(folio)); @@ -2024,7 +2025,7 @@ int ext4_da_get_block_prep(struct inode *inode, secto= r_t iblock, =20 static void mpage_folio_done(struct mpage_da_data *mpd, struct folio *foli= o) { - mpd->first_page +=3D folio_nr_pages(folio); + mpd->start_pos +=3D folio_size(folio); folio_unlock(folio); } =20 @@ -2034,7 +2035,7 @@ static int mpage_submit_folio(struct mpage_da_data *m= pd, struct folio *folio) loff_t size; int err; =20 - BUG_ON(folio->index !=3D mpd->first_page); + WARN_ON_ONCE(folio_pos(folio) !=3D mpd->start_pos); folio_clear_dirty_for_io(folio); /* * We have to be very careful here! Nothing protects writeback path @@ -2446,7 +2447,7 @@ static int mpage_map_and_submit_extent(handle_t *hand= le, * Update on-disk size after IO is submitted. Races with * truncate are avoided by checking i_size under i_data_sem. */ - disksize =3D ((loff_t)mpd->first_page) << PAGE_SHIFT; + disksize =3D mpd->start_pos; if (disksize > READ_ONCE(EXT4_I(inode)->i_disksize)) { int err2; loff_t i_size; @@ -2549,8 +2550,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) struct address_space *mapping =3D mpd->inode->i_mapping; struct folio_batch fbatch; unsigned int nr_folios; - pgoff_t index =3D mpd->first_page; - pgoff_t end =3D mpd->last_page; + pgoff_t index =3D mpd->start_pos >> PAGE_SHIFT; + pgoff_t end =3D mpd->end_pos >> PAGE_SHIFT; xa_mark_t tag; int i, err =3D 0; int blkbits =3D mpd->inode->i_blkbits; @@ -2565,7 +2566,7 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) tag =3D PAGECACHE_TAG_DIRTY; =20 mpd->map.m_len =3D 0; - mpd->next_page =3D index; + mpd->next_pos =3D mpd->start_pos; if (ext4_should_journal_data(mpd->inode)) { handle =3D ext4_journal_start(mpd->inode, EXT4_HT_WRITE_PAGE, bpp); @@ -2596,7 +2597,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) goto out; =20 /* If we can't merge this page, we are done. 
*/ - if (mpd->map.m_len > 0 && mpd->next_page !=3D folio->index) + if (mpd->map.m_len > 0 && + mpd->next_pos !=3D folio_pos(folio)) goto out; =20 if (handle) { @@ -2642,8 +2644,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) } =20 if (mpd->map.m_len =3D=3D 0) - mpd->first_page =3D folio->index; - mpd->next_page =3D folio_next_index(folio); + mpd->start_pos =3D folio_pos(folio); + mpd->next_pos =3D folio_pos(folio) + folio_size(folio); /* * Writeout when we cannot modify metadata is simple. * Just submit the page. For data=3Djournal mode we @@ -2786,18 +2788,18 @@ static int ext4_do_writepages(struct mpage_da_data = *mpd) writeback_index =3D mapping->writeback_index; if (writeback_index) cycled =3D 0; - mpd->first_page =3D writeback_index; - mpd->last_page =3D -1; + mpd->start_pos =3D writeback_index << PAGE_SHIFT; + mpd->end_pos =3D -1; } else { - mpd->first_page =3D wbc->range_start >> PAGE_SHIFT; - mpd->last_page =3D wbc->range_end >> PAGE_SHIFT; + mpd->start_pos =3D wbc->range_start; + mpd->end_pos =3D wbc->range_end; } =20 ext4_io_submit_init(&mpd->io_submit, wbc); retry: if (wbc->sync_mode =3D=3D WB_SYNC_ALL || wbc->tagged_writepages) - tag_pages_for_writeback(mapping, mpd->first_page, - mpd->last_page); + tag_pages_for_writeback(mapping, mpd->start_pos >> PAGE_SHIFT, + mpd->end_pos >> PAGE_SHIFT); blk_start_plug(&plug); =20 /* @@ -2857,7 +2859,7 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } mpd->do_map =3D 1; =20 - trace_ext4_da_write_pages(inode, mpd->first_page, wbc); + trace_ext4_da_write_pages(inode, mpd->start_pos, wbc); ret =3D mpage_prepare_extent_to_map(mpd); if (!ret && mpd->map.m_len) ret =3D mpage_map_and_submit_extent(handle, mpd, @@ -2914,8 +2916,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) blk_finish_plug(&plug); if (!ret && !cycled && wbc->nr_to_write > 0) { cycled =3D 1; - mpd->last_page =3D writeback_index - 1; - mpd->first_page =3D 0; + mpd->end_pos =3D (writeback_index << PAGE_SHIFT) - 1; + mpd->start_pos =3D 0; goto retry; } =20 @@ -2925,7 +2927,7 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) * Set the writeback_index so that range_cyclic * mode will write it back later */ - mapping->writeback_index =3D mpd->first_page; + mapping->writeback_index =3D mpd->start_pos >> PAGE_SHIFT; =20 out_writepages: trace_ext4_writepages_result(inode, wbc, ret, diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index 156908641e68..62d52997b5c6 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -483,15 +483,15 @@ TRACE_EVENT(ext4_writepages, ); =20 TRACE_EVENT(ext4_da_write_pages, - TP_PROTO(struct inode *inode, pgoff_t first_page, + TP_PROTO(struct inode *inode, loff_t start_pos, struct writeback_control *wbc), =20 - TP_ARGS(inode, first_page, wbc), + TP_ARGS(inode, start_pos, wbc), =20 TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) - __field( pgoff_t, first_page ) + __field( loff_t, start_pos ) __field( long, nr_to_write ) __field( int, sync_mode ) ), @@ -499,15 +499,14 @@ TRACE_EVENT(ext4_da_write_pages, TP_fast_assign( __entry->dev =3D inode->i_sb->s_dev; __entry->ino =3D inode->i_ino; - __entry->first_page =3D first_page; + __entry->start_pos =3D start_pos; __entry->nr_to_write =3D wbc->nr_to_write; __entry->sync_mode =3D wbc->sync_mode; ), =20 - TP_printk("dev %d,%d ino %lu first_page %lu nr_to_write %ld " - "sync_mode %d", + TP_printk("dev %d,%d ino %lu start_pos 0x%llx nr_to_write %ld sync_mode %= d", MAJOR(__entry->dev), 
MINOR(__entry->dev), - (unsigned long) __entry->ino, __entry->first_page, + (unsigned long) __entry->ino, __entry->start_pos, __entry->nr_to_write, __entry->sync_mode) ); =20 --=20 2.46.1 From nobody Mon Dec 15 21:27:09 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1CA3C26FDBD; Tue, 1 Jul 2025 13:20:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376062; cv=none; b=cgk/ldwrWMyP3ArmMmeq6ngF9r75MBroTxPK0HksViWMPTMQmHgo27Db7MR51QyejVVvhUp6TEGcYSVaEw8pq3OWyOtH8ZUMPcAwPL/E+wrxYg7wjprw+sdulMzq+F628BneqPsk1+6QuL8Tq+LYfRCAkJv5EQt8enWeIGjMb40= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376062; c=relaxed/simple; bh=24Q448/x+iyAlMSfCXZcBwZ46qGKyWQNu6PFXRxVhMY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gEgM3faS4sEZtE172xxb/HXjl2EH+4P8eKYpXW+stpLN8qQATGm0/HpIkmBgiLX/LjWkvU/OaL/qO1Xpq931R9zUkoDb96BHmPguojVsJf/YB0Q3pli2p0A1IYK1tbwzyGGsvane4xXY8E7pFdKMfqgAcmA/M2D+OYzmiYnw5J0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bWkDq0zBKzYQvNH; Tue, 1 Jul 2025 21:20:59 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 0534D1A0E9A; Tue, 1 Jul 2025 21:20:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgAXeCWu4GNoXmJGAQ--.26904S6; Tue, 01 Jul 2025 21:20:57 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 02/10] ext4: move the calculation of wbc->nr_to_write to mpage_folio_done() Date: Tue, 1 Jul 2025 21:06:27 +0800 Message-ID: <20250701130635.4079595-3-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgAXeCWu4GNoXmJGAQ--.26904S6 X-Coremail-Antispam: 1UD129KBjvJXoW7Kw1UKr4DCFW3Gw1xuw47CFg_yoW8Jw17pF W5Kas7GFWkXr909Fn7WFsxZr1xtas3Gw4UXFW3Kw13XFy5Cr95KFsFq34Y9F4fJrWkJayI qF4xJFy5u3W7AFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw 
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

mpage_folio_done() is a more appropriate place than mpage_submit_folio()
for updating wbc->nr_to_write after a fully mapped folio has been
submitted. This prepares for allowing mpage_submit_folio() to submit a
partially mapped folio that is still being processed.

Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
Reviewed-by: Baokun Li
---
 fs/ext4/inode.c | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index ba81df0d87dd..38db1c186f76 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2026,6 +2026,7 @@ int ext4_da_get_block_prep(struct inode *inode, sector_t iblock,
 static void mpage_folio_done(struct mpage_da_data *mpd, struct folio *folio)
 {
 	mpd->start_pos += folio_size(folio);
+	mpd->wbc->nr_to_write -= folio_nr_pages(folio);
 	folio_unlock(folio);
 }
 
@@ -2056,8 +2057,6 @@ static int mpage_submit_folio(struct mpage_da_data *mpd, struct folio *folio)
 	    !ext4_verity_in_progress(mpd->inode))
 		len = size & (len - 1);
 	err = ext4_bio_write_folio(&mpd->io_submit, folio, len);
-	if (!err)
-		mpd->wbc->nr_to_write -= folio_nr_pages(folio);
 
 	return err;
 }
-- 
2.46.1

From nobody Mon Dec 15 21:27:09 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v3 03/10] ext4: fix stale data if we bail out of the extent mapping loop
Date: Tue, 1 Jul 2025 21:06:28 +0800
Message-ID: <20250701130635.4079595-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

During the process of writing back folios, if mpage_map_and_submit_extent()
exits the extent mapping loop due to an ENOSPC or ENOMEM error, it may
result in stale data or filesystem inconsistency in environments where the
block size is smaller than the folio size.

When mapping a discontinuous folio in mpage_map_and_submit_extent(), some
buffers may have already been mapped. If we exit the mapping loop
prematurely, the folio data within the mapped range will not be written
back, and the file's disk size will not be updated. Once the transaction
that includes this range of extents is committed, this can lead to stale
data or filesystem inconsistency.

Fix this by submitting the partially mapped folio that is currently being
processed.
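To make the stale-data window concrete, the following is a minimal
userspace sketch (plain C, not kernel code; the 1 KB block size, 2 MB folio
and block numbers are assumed values) of the position arithmetic the fix
relies on: the byte position of the first unmapped block must lie inside
the folio but past its start, and the already mapped bytes in front of it
are submitted so a mapped range is never left unwritten.

/*
 * Minimal sketch of the partial-submit arithmetic; all values are
 * hypothetical and only model the byte/block conversions involved.
 */
#include <stdio.h>
#include <stdint.h>

int main(void)
{
	unsigned int blkbits = 10;			/* assumed 1 KB block size */
	uint64_t folio_pos = 4ULL << 20;		/* folio starts at 4 MB */
	uint64_t folio_size = 2ULL << 20;		/* assumed 2 MB large folio */
	uint64_t m_lblk = (folio_pos >> blkbits) + 512;	/* first unmapped block */
	uint64_t pos = m_lblk << blkbits;		/* its byte position */

	/* The position must be inside the folio but not at its start. */
	if (pos <= folio_pos || pos >= folio_pos + folio_size) {
		fprintf(stderr, "nothing partially mapped to submit\n");
		return 1;
	}

	/* Bytes [folio_pos, pos) are mapped and must be written back now. */
	printf("submit [%llu, %llu), keep start_pos at %llu for the retry\n",
	       (unsigned long long)folio_pos, (unsigned long long)pos,
	       (unsigned long long)pos);
	return 0;
}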
Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 38db1c186f76..62f1263d05da 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2361,6 +2361,47 @@ static int mpage_map_one_extent(handle_t *handle, st= ruct mpage_da_data *mpd) return 0; } =20 +/* + * This is used to submit mapped buffers in a single folio that is not ful= ly + * mapped for various reasons, such as insufficient space or journal credi= ts. + */ +static int mpage_submit_buffers(struct mpage_da_data *mpd) +{ + struct inode *inode =3D mpd->inode; + struct folio *folio; + loff_t pos; + int ret; + + folio =3D filemap_get_folio(inode->i_mapping, + mpd->start_pos >> PAGE_SHIFT); + if (IS_ERR(folio)) + return PTR_ERR(folio); + /* + * The mapped position should be within the current processing folio + * but must not be the folio start position. + */ + pos =3D mpd->map.m_lblk << inode->i_blkbits; + if (WARN_ON_ONCE((folio_pos(folio) =3D=3D pos) || + !folio_contains(folio, pos >> PAGE_SHIFT))) + return -EINVAL; + + ret =3D mpage_submit_folio(mpd, folio); + if (ret) + goto out; + /* + * Update start_pos to prevent this folio from being released in + * mpage_release_unused_pages(), it will be reset to the aligned folio + * pos when this folio is written again in the next round. Additionally, + * do not update wbc->nr_to_write here, as it will be updated once the + * entire folio has finished processing. + */ + mpd->start_pos =3D pos; +out: + folio_unlock(folio); + folio_put(folio); + return ret; +} + /* * mpage_map_and_submit_extent - map extent starting at mpd->lblk of length * mpd->len and submit pages underlying it for IO @@ -2411,8 +2452,16 @@ static int mpage_map_and_submit_extent(handle_t *han= dle, */ if ((err =3D=3D -ENOMEM) || (err =3D=3D -ENOSPC && ext4_count_free_clusters(sb))) { - if (progress) + /* + * We may have already allocated extents for + * some bhs inside the folio, issue the + * corresponding data to prevent stale data. 
+ */ + if (progress) { + if (mpage_submit_buffers(mpd)) + goto invalidate_dirty_pages; goto update_disksize; + } return err; } ext4_msg(sb, KERN_CRIT, --=20 2.46.1 From nobody Mon Dec 15 21:27:09 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B6032741D4; Tue, 1 Jul 2025 13:21:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376063; cv=none; b=AdQTcGYfi+w5VBBagCR9oIWOC6NgPKuA5xL/6t+5b5Bu2mh2m/SzhepunThg5jHl/g5N6+WS+3+tY16OhO1bdLotyCrtawmGYO4ex1zwISoU63D6xeAlhM+mF8/0zZjeGghRIBkkrw6H06YirsRmMPTWRFqCljfmat461n8OWao= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376063; c=relaxed/simple; bh=a6cj2y49XC2xiZz1csecTHIMAOij8ZE5LgP/Y8KX5e4=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Xdd8Z2lD0h6LLkxeRUghjMqAgLbZ5WYe8OXYzW/M7vw5vxvb0yZEyJ4zxktViYuqMclZSyv5ut4zncufFkVc1OwGzXpVZ1s8P5aNidXlsx9lG1E5HBycMKjbH0K7XRje/AtQZtMuz6Q7Tr0vUmDk9vkBtosmwJLFg+zVHVdIh1g= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bWkDr1SPNzYQvPS; Tue, 1 Jul 2025 21:21:00 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 15C4C1A0E9A; Tue, 1 Jul 2025 21:20:59 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgAXeCWu4GNoXmJGAQ--.26904S8; Tue, 01 Jul 2025 21:20:58 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 04/10] ext4: refactor the block allocation process of ext4_page_mkwrite() Date: Tue, 1 Jul 2025 21:06:29 +0800 Message-ID: <20250701130635.4079595-5-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgAXeCWu4GNoXmJGAQ--.26904S8 X-Coremail-Antispam: 1UD129KBjvJXoWxAF15KF4DGFW5WFyUXFWkJFb_yoWrur4fpr y3Kr95ur47u34DWFs3WF4DZF13Ka4vgrWUGFyxGr1fZa43trnxKF4rt3WvyF4UtrW3Xan2 qr4UAFyUu3WjgrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 
4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi The block allocation process and error handling in ext4_page_mkwrite() is complex now. Refactor it by introducing a new helper function, ext4_block_page_mkwrite(). It will call ext4_block_write_begin() to allocate blocks instead of directly calling block_page_mkwrite(). Preparing to implement retry logic in a subsequent patch to address situations where the reserved journal credits are insufficient. Additionally, this modification will help prevent potential deadlocks that may occur when waiting for folio writeback while holding the transaction handle. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 95 ++++++++++++++++++++++++++----------------------- 1 file changed, 50 insertions(+), 45 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 62f1263d05da..31731a732df2 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -6605,6 +6605,53 @@ static int ext4_bh_unmapped(handle_t *handle, struct= inode *inode, return !buffer_mapped(bh); } =20 +static int ext4_block_page_mkwrite(struct inode *inode, struct folio *foli= o, + get_block_t get_block) +{ + handle_t *handle; + loff_t size; + unsigned long len; + int ret; + + handle =3D ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, + ext4_writepage_trans_blocks(inode)); + if (IS_ERR(handle)) + return PTR_ERR(handle); + + folio_lock(folio); + size =3D i_size_read(inode); + /* Page got truncated from under us? */ + if (folio->mapping !=3D inode->i_mapping || folio_pos(folio) > size) { + ret =3D -EFAULT; + goto out_error; + } + + len =3D folio_size(folio); + if (folio_pos(folio) + len > size) + len =3D size - folio_pos(folio); + + ret =3D ext4_block_write_begin(handle, folio, 0, len, get_block); + if (ret) + goto out_error; + + if (!ext4_should_journal_data(inode)) { + block_commit_write(folio, 0, len); + folio_mark_dirty(folio); + } else { + ret =3D ext4_journal_folio_buffers(handle, folio, len); + if (ret) + goto out_error; + } + ext4_journal_stop(handle); + folio_wait_stable(folio); + return ret; + +out_error: + folio_unlock(folio); + ext4_journal_stop(handle); + return ret; +} + vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) { struct vm_area_struct *vma =3D vmf->vma; @@ -6616,8 +6663,7 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) struct file *file =3D vma->vm_file; struct inode *inode =3D file_inode(file); struct address_space *mapping =3D inode->i_mapping; - handle_t *handle; - get_block_t *get_block; + get_block_t *get_block =3D ext4_get_block; int retries =3D 0; =20 if (unlikely(IS_IMMUTABLE(inode))) @@ -6685,46 +6731,9 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) /* OK, we need to fill the hole... 
 */
 	if (ext4_should_dioread_nolock(inode))
 		get_block = ext4_get_block_unwritten;
-	else
-		get_block = ext4_get_block;
 retry_alloc:
-	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE,
-				    ext4_writepage_trans_blocks(inode));
-	if (IS_ERR(handle)) {
-		ret = VM_FAULT_SIGBUS;
-		goto out;
-	}
-	/*
-	 * Data journalling can't use block_page_mkwrite() because it
-	 * will set_buffer_dirty() before do_journal_get_write_access()
-	 * thus might hit warning messages for dirty metadata buffers.
-	 */
-	if (!ext4_should_journal_data(inode)) {
-		err = block_page_mkwrite(vma, vmf, get_block);
-	} else {
-		folio_lock(folio);
-		size = i_size_read(inode);
-		/* Page got truncated from under us? */
-		if (folio->mapping != mapping || folio_pos(folio) > size) {
-			ret = VM_FAULT_NOPAGE;
-			goto out_error;
-		}
-
-		len = folio_size(folio);
-		if (folio_pos(folio) + len > size)
-			len = size - folio_pos(folio);
-
-		err = ext4_block_write_begin(handle, folio, 0, len,
-					     ext4_get_block);
-		if (!err) {
-			ret = VM_FAULT_SIGBUS;
-			if (ext4_journal_folio_buffers(handle, folio, len))
-				goto out_error;
-		} else {
-			folio_unlock(folio);
-		}
-	}
-	ext4_journal_stop(handle);
+	/* Start journal and allocate blocks */
+	err = ext4_block_page_mkwrite(inode, folio, get_block);
 	if (err == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
 		goto retry_alloc;
 out_ret:
@@ -6733,8 +6742,4 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 	filemap_invalidate_unlock_shared(mapping);
 	sb_end_pagefault(inode->i_sb);
 	return ret;
-out_error:
-	folio_unlock(folio);
-	ext4_journal_stop(handle);
-	goto out;
 }
-- 
2.46.1

From nobody Mon Dec 15 21:27:09 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v3 05/10] ext4: restart handle if credits are insufficient when allocating blocks
Date: Tue, 1 Jul 2025 21:06:30 +0800
Message-ID: <20250701130635.4079595-6-yi.zhang@huaweicloud.com>
In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

After large folios are supported on ext4, writing back a sufficiently large
and discontinuous folio may consume a significant number of journal
credits, placing considerable strain on the journal. For example, in a 20GB
filesystem with 1K block size and 1MB journal size, writing back a 2MB
folio could require thousands of credits in the worst-case scenario (when
each block is discontinuous and distributed across different block groups),
potentially exceeding the journal size. This issue can also occur in
ext4_write_begin() and ext4_page_mkwrite() when delalloc is not enabled.

Fix this by ensuring that there are sufficient journal credits before
allocating an extent in mpage_map_one_extent() and
ext4_block_write_begin(). If there are not enough credits, return -EAGAIN,
exit the current mapping loop, start a new handle and a new transaction,
and allocate blocks for this folio again in the next iteration.
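As a rough illustration of the retry contract described above, here is a
short standalone C sketch (not kernel code; the handle layout and credit
numbers are made up) of the decision taken before mapping one extent: keep
using the running handle if it still has, or can be extended to, enough
credits, otherwise report -EAGAIN so the caller stops the handle, starts a
new transaction, and processes the folio again.

/*
 * Standalone model of the "ensure credits or return -EAGAIN" step;
 * the struct and numbers below are invented for the example.
 */
#include <stdio.h>

#define EXAMPLE_EAGAIN 11

struct example_handle {
	int credits;		/* credits left in the running handle */
	int extendable;		/* can the running transaction still grow? */
};

static int ensure_extent_credits(struct example_handle *h, int needed)
{
	if (h->credits >= needed)
		return 0;		/* enough credits, keep going */
	if (h->extendable) {
		h->credits = needed;	/* extend the running transaction */
		return 0;
	}
	return -EXAMPLE_EAGAIN;		/* caller must restart the handle */
}

int main(void)
{
	struct example_handle h = { .credits = 4, .extendable = 0 };
	int needed = 8;			/* e.g. credits for mapping one extent */

	if (ensure_extent_credits(&h, needed) == -EXAMPLE_EAGAIN)
		printf("stop handle, start a new transaction, retry this folio\n");
	else
		printf("map the extent using %d reserved credits\n", h.credits);
	return 0;
}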
Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 40 +++++++++++++++++++++++++++++++++++----- 1 file changed, 35 insertions(+), 5 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 31731a732df2..efe778aaf74b 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -877,6 +877,25 @@ static void ext4_update_bh_state(struct buffer_head *b= h, unsigned long flags) } while (unlikely(!try_cmpxchg(&bh->b_state, &old_state, new_state))); } =20 +/* + * Make sure that the current journal transaction has enough credits to map + * one extent. Return -EAGAIN if it cannot extend the current running + * transaction. + */ +static inline int ext4_journal_ensure_extent_credits(handle_t *handle, + struct inode *inode) +{ + int credits; + int ret; + + if (!handle) + return 0; + + credits =3D ext4_chunk_trans_blocks(inode, 1); + ret =3D __ext4_journal_ensure_credits(handle, credits, credits, 0); + return ret <=3D 0 ? ret : -EAGAIN; +} + static int _ext4_get_block(struct inode *inode, sector_t iblock, struct buffer_head *bh, int flags) { @@ -1175,7 +1194,9 @@ int ext4_block_write_begin(handle_t *handle, struct f= olio *folio, clear_buffer_new(bh); if (!buffer_mapped(bh)) { WARN_ON(bh->b_size !=3D blocksize); - err =3D get_block(inode, block, bh, 1); + err =3D ext4_journal_ensure_extent_credits(handle, inode); + if (!err) + err =3D get_block(inode, block, bh, 1); if (err) break; if (buffer_new(bh)) { @@ -1374,8 +1395,9 @@ static int ext4_write_begin(struct file *file, struct= address_space *mapping, ext4_orphan_del(NULL, inode); } =20 - if (ret =3D=3D -ENOSPC && - ext4_should_retry_alloc(inode->i_sb, &retries)) + if (ret =3D=3D -EAGAIN || + (ret =3D=3D -ENOSPC && + ext4_should_retry_alloc(inode->i_sb, &retries))) goto retry_journal; folio_put(folio); return ret; @@ -2323,6 +2345,11 @@ static int mpage_map_one_extent(handle_t *handle, st= ruct mpage_da_data *mpd) int get_blocks_flags; int err, dioread_nolock; =20 + /* Make sure transaction has enough credits for this extent */ + err =3D ext4_journal_ensure_extent_credits(handle, inode); + if (err < 0) + return err; + trace_ext4_da_write_pages_extent(inode, map); /* * Call ext4_map_blocks() to allocate any delayed allocation blocks, or @@ -2450,7 +2477,7 @@ static int mpage_map_and_submit_extent(handle_t *hand= le, * In the case of ENOSPC, if ext4_count_free_blocks() * is non-zero, a commit should free up blocks. */ - if ((err =3D=3D -ENOMEM) || + if ((err =3D=3D -ENOMEM) || (err =3D=3D -EAGAIN) || (err =3D=3D -ENOSPC && ext4_count_free_clusters(sb))) { /* * We may have already allocated extents for @@ -2956,6 +2983,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) ret =3D 0; continue; } + if (ret =3D=3D -EAGAIN) + ret =3D 0; /* Fatal error - ENOMEM, EIO... 
 */
 		if (ret)
 			break;
@@ -6734,7 +6763,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf)
 retry_alloc:
 	/* Start journal and allocate blocks */
 	err = ext4_block_page_mkwrite(inode, folio, get_block);
-	if (err == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries))
+	if (err == -EAGAIN ||
+	    (err == -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)))
 		goto retry_alloc;
 out_ret:
 	ret = vmf_fs_error(err);
-- 
2.46.1

From nobody Mon Dec 15 21:27:10 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v3 06/10] ext4: enhance tracepoints during folio writeback
Date: Tue, 1 Jul 2025 21:06:31 +0800
Message-ID: <20250701130635.4079595-7-yi.zhang@huaweicloud.com>
In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi After mpage_map_and_submit_extent() supports restarting handle if credits are insufficient during allocating blocks, it is more likely to exit the current mapping iteration and continue to process the current processing partially mapped folio again. The existing tracepoints are not sufficient to track this situation, so enhance the tracepoints to track the writeback position and the return value before and after submitting the folios. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 5 ++++- include/trace/events/ext4.h | 42 ++++++++++++++++++++++++++++++++----- 2 files changed, 41 insertions(+), 6 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index efe778aaf74b..79389874d35f 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2934,7 +2934,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } mpd->do_map =3D 1; =20 - trace_ext4_da_write_pages(inode, mpd->start_pos, wbc); + trace_ext4_da_write_folios_start(inode, mpd->start_pos, + mpd->next_pos, wbc); ret =3D mpage_prepare_extent_to_map(mpd); if (!ret && mpd->map.m_len) ret =3D mpage_map_and_submit_extent(handle, mpd, @@ -2972,6 +2973,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } else ext4_put_io_end(mpd->io_submit.io_end); mpd->io_submit.io_end =3D NULL; + trace_ext4_da_write_folios_end(inode, mpd->start_pos, + mpd->next_pos, wbc, ret); =20 if (ret =3D=3D -ENOSPC && sbi->s_journal) { /* diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index 62d52997b5c6..845451077c41 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -482,16 +482,17 @@ TRACE_EVENT(ext4_writepages, (unsigned long) __entry->writeback_index) ); =20 -TRACE_EVENT(ext4_da_write_pages, - TP_PROTO(struct inode *inode, loff_t start_pos, +TRACE_EVENT(ext4_da_write_folios_start, + TP_PROTO(struct inode *inode, loff_t start_pos, loff_t next_pos, struct writeback_control *wbc), =20 - TP_ARGS(inode, start_pos, wbc), + TP_ARGS(inode, start_pos, next_pos, wbc), =20 TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, start_pos ) + __field( loff_t, next_pos ) __field( long, nr_to_write ) __field( int, sync_mode ) ), @@ -500,16 +501,47 @@ TRACE_EVENT(ext4_da_write_pages, __entry->dev =3D inode->i_sb->s_dev; __entry->ino =3D inode->i_ino; __entry->start_pos =3D start_pos; + __entry->next_pos =3D next_pos; __entry->nr_to_write =3D wbc->nr_to_write; __entry->sync_mode =3D wbc->sync_mode; ), =20 - TP_printk("dev %d,%d 
ino %lu start_pos 0x%llx nr_to_write %ld sync_mode %= d", + TP_printk("dev %d,%d ino %lu start_pos 0x%llx next_pos 0x%llx nr_to_write= %ld sync_mode %d", MAJOR(__entry->dev), MINOR(__entry->dev), - (unsigned long) __entry->ino, __entry->start_pos, + (unsigned long) __entry->ino, __entry->start_pos, __entry->next_pos, __entry->nr_to_write, __entry->sync_mode) ); =20 +TRACE_EVENT(ext4_da_write_folios_end, + TP_PROTO(struct inode *inode, loff_t start_pos, loff_t next_pos, + struct writeback_control *wbc, int ret), + + TP_ARGS(inode, start_pos, next_pos, wbc, ret), + + TP_STRUCT__entry( + __field( dev_t, dev ) + __field( ino_t, ino ) + __field( loff_t, start_pos ) + __field( loff_t, next_pos ) + __field( long, nr_to_write ) + __field( int, ret ) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->start_pos =3D start_pos; + __entry->next_pos =3D next_pos; + __entry->nr_to_write =3D wbc->nr_to_write; + __entry->ret =3D ret; + ), + + TP_printk("dev %d,%d ino %lu start_pos 0x%llx next_pos 0x%llx nr_to_write= %ld ret %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + (unsigned long) __entry->ino, __entry->start_pos, __entry->next_pos, + __entry->nr_to_write, __entry->ret) +); + TRACE_EVENT(ext4_da_write_pages_extent, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map), =20 --=20 2.46.1 From nobody Mon Dec 15 21:27:10 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7876B27466E; Tue, 1 Jul 2025 13:21:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376064; cv=none; b=CPsbGJEI2uvdZ3110F9ZaMz2S4wfVm6ftZlJlvBic2pL8pIzaV9Ia6pLVnRarWjkCp8Bo/n8+tBAqGGaGMHr8LwD+5SjvkdfULKhKRTjYmkF78CfkjIBLa1MuXCsCthOrKo/h2Gyx2AGpGEWDD4hgiFbDJe6YKkraQssTf8PHSQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376064; c=relaxed/simple; bh=sUZv+ag0gXVbytB+J9bC9ioCeq4Mgh32HdgczhLjKmI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=gD3OccKtxeRtRscC3OGr+aEUQt6tZcoFoQbrhImCWB5JKnxVt7Aow3pi0k7RZTq0YIKj6PIRun5OULfB3UBtAsBCNtmmmD6lri3lHJErmh8g/TFfnM5hMVkdq6ot0aPS4sfXwcjTePFjdoaY1hr9kIY7fepyTACI8NMVclY75zs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bWkDs5gD9zYQvPv; Tue, 1 Jul 2025 21:21:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id A3B211A0E22; Tue, 1 Jul 2025 21:21:00 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgAXeCWu4GNoXmJGAQ--.26904S11; Tue, 01 Jul 2025 21:21:00 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, 
yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v3 07/10] ext4: correct the reserved credits for extent conversion
Date: Tue, 1 Jul 2025 21:06:32 +0800
Message-ID: <20250701130635.4079595-8-yi.zhang@huaweicloud.com>
In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

Now, we reserve journal credits for converting extents in only one page to
written state when the I/O operation is complete. This is insufficient when
large folios are enabled. Fix this by reserving credits for converting up
to one extent per block in the largest 2MB folio. This calculation only
involves the extent index and leaf blocks, so it should not overestimate
the credits.

Fixes: 7ac67301e82f ("ext4: enable large folio for regular file")
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
Reviewed-by: Baokun Li
---
 fs/ext4/inode.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 79389874d35f..3230734a3014 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2848,12 +2848,12 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
 	mpd->journalled_more_data = 0;
 
 	if (ext4_should_dioread_nolock(inode)) {
+		int bpf = ext4_journal_blocks_per_folio(inode);
 		/*
 		 * We may need to convert up to one extent per block in
-		 * the page and we may dirty the inode.
+		 * the folio and we may dirty the inode.
*/ - rsv_blocks =3D 1 + ext4_chunk_trans_blocks(inode, - PAGE_SIZE >> inode->i_blkbits); + rsv_blocks =3D 1 + ext4_ext_index_trans_blocks(inode, bpf); } =20 if (wbc->range_start =3D=3D 0 && wbc->range_end =3D=3D LLONG_MAX) --=20 2.46.1 From nobody Mon Dec 15 21:27:10 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 68929277026; Tue, 1 Jul 2025 13:21:04 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376066; cv=none; b=SFHzlZoHQ27K+sOWqe93daQh6+sRYhoytFdivf8+7taTiW+BsoPa/hDvsPehm+Z/4aVTplKdB8R7UmsMR7vPIJOhDpnUqDGfe8D0X0IWFARJ2yOY56/N54LrVt4bzlKsBzD13fhnguqHP0EPyd03fWceksVkf8hXr0kpodhI2A0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751376066; c=relaxed/simple; bh=HzQwYCnYv+cE7cxLCWkE45mPBh4oe+LIXmYno0y9hCA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=hUcWOL+5eYT+2ls77z37smTacgaSpLnpKoesCdtZmkUoqzHZ1lmMQKfem4yX4cC+Bwsk7LDQclhsmxISzjASE7qpc2lsV6fk2n5PxhV6Vr4tsCMcQNZN7yLdqJSbGTeCdXWF9qRaTtCKkZR8ZUrOt77dYQM9LCK75cGXMu4JzW4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bWkDt2T3XzYQvPm; Tue, 1 Jul 2025 21:21:02 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 33A101A1347; Tue, 1 Jul 2025 21:21:01 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgAXeCWu4GNoXmJGAQ--.26904S12; Tue, 01 Jul 2025 21:21:00 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 08/10] ext4: reserved credits for one extent during the folio writeback Date: Tue, 1 Jul 2025 21:06:33 +0800 Message-ID: <20250701130635.4079595-9-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgAXeCWu4GNoXmJGAQ--.26904S12 X-Coremail-Antispam: 1UD129KBjvJXoWxZry5KryruFyDAFy7XrWUArb_yoW5Xw4xpr ZxCrWkWry7WFyUuFWxWa18ZF1fWa48CrWUJ39xKFn7Wa98Z34xKFn8KayY9FW5KrWxGa4j vF45C3s8Wa42ya7anT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 
Content-Type: text/plain; charset="utf-8" From: Zhang Yi After ext4 supports large folios, reserving journal credits for one maximum-ordered folio based on the worst-case scenario during the writeback process can easily exceed the maximum transaction credits. Additionally, reserving journal credits for one page is no longer appropriate. Currently, the folio writeback process can either extend the journal credits or initiate a new transaction if the currently reserved journal credits are insufficient. Therefore, it can be modified to reserve credits for only one extent at the outset. In most cases involving continuous mapping, these credits are generally adequate, and we may only need to perform some basic credit expansion. However, in extreme cases where the block size and folio size differ significantly, or when the folios are sufficiently discontinuous, it may be necessary to start a new transaction and resubmit the folios. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/inode.c | 25 ++++++++----------------- 1 file changed, 8 insertions(+), 17 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 3230734a3014..ceaede80d791 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2546,21 +2546,6 @@ static int mpage_map_and_submit_extent(handle_t *han= dle, return err; } =20 -/* - * Calculate the total number of credits to reserve for one writepages - * iteration. This is called from ext4_writepages(). We map an extent of - * up to MAX_WRITEPAGES_EXTENT_LEN blocks and then we go on and finish map= ping - * the last partial page. So in total we can map MAX_WRITEPAGES_EXTENT_LEN= + - * bpp - 1 blocks in bpp different extents. - */ -static int ext4_da_writepages_trans_blocks(struct inode *inode) -{ - int bpp =3D ext4_journal_blocks_per_folio(inode); - - return ext4_meta_trans_blocks(inode, - MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp); -} - static int ext4_journal_folio_buffers(handle_t *handle, struct folio *foli= o, size_t len) { @@ -2917,8 +2902,14 @@ static int ext4_do_writepages(struct mpage_da_data *= mpd) * not supported by delalloc. */ BUG_ON(ext4_should_journal_data(inode)); - needed_blocks =3D ext4_da_writepages_trans_blocks(inode); - + /* + * Calculate the number of credits needed to reserve for one + * extent of up to MAX_WRITEPAGES_EXTENT_LEN blocks. It will + * attempt to extend the transaction or start a new iteration + * if the reserved credits are insufficient.
+ */ + needed_blocks =3D ext4_chunk_trans_blocks(inode, + MAX_WRITEPAGES_EXTENT_LEN); /* start a new transaction */ handle =3D ext4_journal_start_with_reserve(inode, EXT4_HT_WRITE_PAGE, needed_blocks, rsv_blocks); --=20 2.46.1
From nobody Mon Dec 15 21:27:10 2025 From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 09/10] ext4: replace ext4_writepage_trans_blocks() Date: Tue, 1 Jul 2025 21:06:34 +0800 Message-ID: <20250701130635.4079595-10-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8" From: Zhang Yi After ext4 supports large folios, the semantics of reserving credits in pages are no longer applicable. In most scenarios, reserving credits in extents is sufficient. Therefore, introduce ext4_chunk_trans_extent() to replace ext4_writepage_trans_blocks(). move_extent_per_page() is the only remaining location where we are still processing extents in pages. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara --- fs/ext4/ext4.h | 2 +- fs/ext4/extents.c | 6 +++--- fs/ext4/inline.c | 6 +++--- fs/ext4/inode.c | 33 +++++++++++++++------------------ fs/ext4/move_extent.c | 3 ++- fs/ext4/xattr.c | 2 +- 6 files changed, 25 insertions(+), 27 deletions(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 18373de980f2..f705046ba6c6 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -3064,9 +3064,9 @@ extern int ext4_punch_hole(struct file *file, loff_t = offset, loff_t length); extern void ext4_set_inode_flags(struct inode *, bool init); extern int ext4_alloc_da_blocks(struct inode *inode); extern void ext4_set_aops(struct inode *inode); -extern int ext4_writepage_trans_blocks(struct inode *); extern int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode= ); extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks); +extern int ext4_chunk_trans_extent(struct inode *inode, int nrblocks); extern int ext4_meta_trans_blocks(struct inode *inode, int lblocks, int pextents); extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode, diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index b543a46fc809..f0f155458697 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -5171,7 +5171,7 @@ ext4_ext_shift_path_extents(struct ext4_ext_path *pat= h, ext4_lblk_t shift, credits =3D depth + 2; } =20 - restart_credits =3D ext4_writepage_trans_blocks(inode); + restart_credits =3D ext4_chunk_trans_extent(inode, 0); err =3D ext4_datasem_ensure_credits(handle, inode, credits, restart_credits, 0); if (err) { @@ -5431,7 +5431,7 @@ static int ext4_collapse_range(struct file *file, lof= f_t offset, loff_t len) =20 truncate_pagecache(inode, start); =20 - credits =3D ext4_writepage_trans_blocks(inode); + credits =3D ext4_chunk_trans_extent(inode, 0); handle =3D ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits); if (IS_ERR(handle)) return PTR_ERR(handle); @@ -5527,7 +5527,7 @@ static int ext4_insert_range(struct file *file, loff_= t offset, loff_t len) =20 truncate_pagecache(inode, start); =20 - credits =3D ext4_writepage_trans_blocks(inode); + credits =3D ext4_chunk_trans_extent(inode, 0); handle =3D ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits); if (IS_ERR(handle)) return PTR_ERR(handle); diff --git a/fs/ext4/inline.c
b/fs/ext4/inline.c index a1bbcdf40824..d5b32d242495 100644 --- a/fs/ext4/inline.c +++ b/fs/ext4/inline.c @@ -562,7 +562,7 @@ static int ext4_convert_inline_data_to_extent(struct ad= dress_space *mapping, return 0; } =20 - needed_blocks =3D ext4_writepage_trans_blocks(inode); + needed_blocks =3D ext4_chunk_trans_extent(inode, 1); =20 ret =3D ext4_get_inode_loc(inode, &iloc); if (ret) @@ -1864,7 +1864,7 @@ int ext4_inline_data_truncate(struct inode *inode, in= t *has_inline) }; =20 =20 - needed_blocks =3D ext4_writepage_trans_blocks(inode); + needed_blocks =3D ext4_chunk_trans_extent(inode, 1); handle =3D ext4_journal_start(inode, EXT4_HT_INODE, needed_blocks); if (IS_ERR(handle)) return PTR_ERR(handle); @@ -1979,7 +1979,7 @@ int ext4_convert_inline_data(struct inode *inode) return 0; } =20 - needed_blocks =3D ext4_writepage_trans_blocks(inode); + needed_blocks =3D ext4_chunk_trans_extent(inode, 1); =20 iloc.bh =3D NULL; error =3D ext4_get_inode_loc(inode, &iloc); diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index ceaede80d791..572a70b6a934 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1295,7 +1295,8 @@ static int ext4_write_begin(struct file *file, struct= address_space *mapping, * Reserve one block more for addition to orphan list in case * we allocate blocks but write fails for some reason */ - needed_blocks =3D ext4_writepage_trans_blocks(inode) + 1; + needed_blocks =3D ext4_chunk_trans_extent(inode, + ext4_journal_blocks_per_folio(inode)) + 1; index =3D pos >> PAGE_SHIFT; =20 if (ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) { @@ -4462,7 +4463,7 @@ int ext4_punch_hole(struct file *file, loff_t offset,= loff_t length) return ret; =20 if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) - credits =3D ext4_writepage_trans_blocks(inode); + credits =3D ext4_chunk_trans_extent(inode, 2); else credits =3D ext4_blocks_for_truncate(inode); handle =3D ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits); @@ -4611,7 +4612,7 @@ int ext4_truncate(struct inode *inode) } =20 if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)) - credits =3D ext4_writepage_trans_blocks(inode); + credits =3D ext4_chunk_trans_extent(inode, 1); else credits =3D ext4_blocks_for_truncate(inode); =20 @@ -6238,25 +6239,19 @@ int ext4_meta_trans_blocks(struct inode *inode, int= lblocks, int pextents) } =20 /* - * Calculate the total number of credits to reserve to fit - * the modification of a single pages into a single transaction, - * which may include multiple chunks of block allocations. - * - * This could be called via ext4_write_begin() - * - * We need to consider the worse case, when - * one new block per extent. + * Calculate the journal credits for modifying the number of blocks + * in a single extent within one transaction. 'nrblocks' is used only + * for non-extent inodes. For extent type inodes, 'nrblocks' can be + * zero if the exact number of blocks is unknown. 
*/ -int ext4_writepage_trans_blocks(struct inode *inode) +int ext4_chunk_trans_extent(struct inode *inode, int nrblocks) { - int bpp =3D ext4_journal_blocks_per_folio(inode); int ret; =20 - ret =3D ext4_meta_trans_blocks(inode, bpp, bpp); - + ret =3D ext4_meta_trans_blocks(inode, nrblocks, 1); /* Account for data blocks for journalled mode */ if (ext4_should_journal_data(inode)) - ret +=3D bpp; + ret +=3D nrblocks; return ret; } =20 @@ -6634,10 +6629,12 @@ static int ext4_block_page_mkwrite(struct inode *in= ode, struct folio *folio, handle_t *handle; loff_t size; unsigned long len; + int credits; int ret; =20 - handle =3D ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, - ext4_writepage_trans_blocks(inode)); + credits =3D ext4_chunk_trans_extent(inode, + ext4_journal_blocks_per_folio(inode)); + handle =3D ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, credits); if (IS_ERR(handle)) return PTR_ERR(handle); =20 diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c index 1f8493a56e8f..adae3caf175a 100644 --- a/fs/ext4/move_extent.c +++ b/fs/ext4/move_extent.c @@ -280,7 +280,8 @@ move_extent_per_page(struct file *o_filp, struct inode = *donor_inode, */ again: *err =3D 0; - jblocks =3D ext4_writepage_trans_blocks(orig_inode) * 2; + jblocks =3D ext4_meta_trans_blocks(orig_inode, block_len_in_page, + block_len_in_page) * 2; handle =3D ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, jblocks); if (IS_ERR(handle)) { *err =3D PTR_ERR(handle); diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c index 8d15acbacc20..3fb93247330d 100644 --- a/fs/ext4/xattr.c +++ b/fs/ext4/xattr.c @@ -962,7 +962,7 @@ int __ext4_xattr_set_credits(struct super_block *sb, st= ruct inode *inode, * so we need to reserve credits for this eventuality */ if (inode && ext4_has_inline_data(inode)) - credits +=3D ext4_writepage_trans_blocks(inode) + 1; + credits +=3D ext4_chunk_trans_extent(inode, 1) + 1; =20 /* We are done if ea_inode feature is not enabled. 
*/ if (!ext4_has_feature_ea_inode(sb)) --=20 2.46.1
From nobody Mon Dec 15 21:27:10 2025 From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v3 10/10] ext4: fix insufficient credits calculation in ext4_meta_trans_blocks() Date: Tue, 1 Jul 2025 21:06:35 +0800 Message-ID: <20250701130635.4079595-11-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> References: <20250701130635.4079595-1-yi.zhang@huaweicloud.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8" From: Zhang Yi The calculation of journal credits in ext4_meta_trans_blocks() should include pextents, as each extent may be allocated from a different block group and thus may need to update a different bitmap and group descriptor block. Fixes: 0e32d8617012 ("ext4: correct the journal credits calculations of all= ocating blocks") Reported-by: Jan Kara Closes: https://lore.kernel.org/linux-ext4/nhxfuu53wyacsrq7xqgxvgzcggyscu2t= babginahcygvmc45hy@t4fvmyeky33e/ Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Reviewed-by: Baokun Li --- fs/ext4/inode.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 572a70b6a934..a75279cceec4 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -6213,7 +6213,7 @@ int ext4_meta_trans_blocks(struct inode *inode, int l= blocks, int pextents) int ret; =20 /* - * How many index and lead blocks need to touch to map @lblocks + * How many index and leaf blocks need to touch to map @lblocks * logical blocks to @pextents physical extents? */ idxblocks =3D ext4_index_trans_blocks(inode, lblocks, pextents); @@ -6222,7 +6222,7 @@ int ext4_meta_trans_blocks(struct inode *inode, int l= blocks, int pextents) * Now let's see how many group bitmaps and group descriptors need * to account */ - groups =3D idxblocks; + groups =3D idxblocks + pextents; gdpblocks =3D groups; if (groups > ngroups) groups =3D ngroups; --=20 2.46.1
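
To make the accounting corrected in the last hunk easier to follow, here is a small self-contained sketch. It is only a model, not the ext4 code: idxblocks, ngroups and gdb_count are hypothetical stand-ins for the values the real ext4_meta_trans_blocks() derives from the inode and superblock, and the fixed per-transaction overhead (superblock, inode, quota and xattr blocks) is omitted. Only the "groups = idxblocks + pextents" step is taken from the patch.

#include <stdio.h>

/*
 * Simplified model of the group bitmap / group descriptor accounting.
 * Each index/leaf block and, after the fix, each of the pextents
 * physical extents may land in a different block group, so each may
 * dirty one block bitmap and one group descriptor block.
 */
static int meta_trans_blocks_model(int idxblocks, int pextents,
                                   int ngroups, int gdb_count)
{
        int groups = idxblocks + pextents;      /* was: groups = idxblocks */
        int gdpblocks = groups;

        if (groups > ngroups)
                groups = ngroups;               /* at most one bitmap per group */
        if (gdpblocks > gdb_count)
                gdpblocks = gdb_count;          /* at most one GD block each */

        /* index/leaf blocks + block bitmaps + group descriptor blocks */
        return idxblocks + groups + gdpblocks;
}

int main(void)
{
        /* e.g. 2 index/leaf blocks, 4 extents, plenty of groups/GDT blocks */
        printf("metadata credits: %d\n",
               meta_trans_blocks_model(2, 4, 1024, 32));
        return 0;
}

With the old "groups = idxblocks" the example above would reserve only 2 + 2 + 2 = 6 metadata credits even though the four extents may each touch their own bitmap and descriptor; the corrected formula reserves 2 + 6 + 6 = 14.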