From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 307392D8376; Mon, 7 Jul 2025 14:22:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898180; cv=none; b=YgvyCtXy5/OKvabADNSVK1gq4wV+W0AYQexr/y13JqUbp2yPrhH9b6bUtR73bRjHnWXMWrXt4w6iqjxzbmssPH2lpTMzZ3SOQJ+RJW7BNn2e/cXkspuBpoUF2acuQqaolBjcWmfNmwT3hSmurCOJYIJ2VlbjyZTzge1z6vOAG3I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898180; c=relaxed/simple; bh=oBN5eFrvEABn1oFRHoJu9B/rIsmCrQeRltbh5GGJFMQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=tBB/IGAKFKkGZyM4NXWSZAaI5/XFrV63LSYQQWyinmiIyntYN0Tg8w46XyG72mmmDWx8QSbDkjIRCsOnGtKZFo2rmRacwCRqIidPjOBNlqTTdt60m/aReKVVK+Hv7eeUuvQWmW1l/UxznqVl55sc9mWJGKbgBh7mpFZmEOFTqQs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKY1nRLzYQtsh; Mon, 7 Jul 2025 22:22:57 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 11D901A09E9; Mon, 7 Jul 2025 22:22:56 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S5; Mon, 07 Jul 2025 22:22:55 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 01/11] ext4: process folios writeback in bytes Date: Mon, 7 Jul 2025 22:08:04 +0800 Message-ID: <20250707140814.542883-2-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S5 X-Coremail-Antispam: 1UD129KBjvJXoWxtryfuFy5Ar4kWw17tFWxCrg_yoWfXry7pF WUKF909r4kX3yjgFn3ZFZrZr10k34xAr48tFy3WanIqF1Ykr18KFyjqFyqvF15KrZ2vrWx XF4Yyry8WF1xJFJanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jr4l82xGYIkIc2 x26xkF7I0E14v26r4j6ryUM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 
IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0JUfKs8UUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Since ext4 supports large folios, processing writebacks in pages is no longer appropriate, it can be modified to process writebacks in bytes. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 70 +++++++++++++++++++------------------ include/trace/events/ext4.h | 13 ++++--- 2 files changed, 42 insertions(+), 41 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index be9a4cba35fd..39d59274649c 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -1667,11 +1667,12 @@ struct mpage_da_data { unsigned int can_map:1; /* Can writepages call map blocks? */ =20 /* These are internal state of ext4_do_writepages() */ - pgoff_t first_page; /* The first page to write */ - pgoff_t next_page; /* Current page to examine */ - pgoff_t last_page; /* Last page to examine */ + loff_t start_pos; /* The start pos to write */ + loff_t next_pos; /* Current pos to examine */ + loff_t end_pos; /* Last pos to examine */ + /* - * Extent to map - this can be after first_page because that can be + * Extent to map - this can be after start_pos because that can be * fully mapped. We somewhat abuse m_flags to store whether the extent * is delalloc or unwritten. */ @@ -1691,38 +1692,38 @@ static void mpage_release_unused_pages(struct mpage= _da_data *mpd, struct inode *inode =3D mpd->inode; struct address_space *mapping =3D inode->i_mapping; =20 - /* This is necessary when next_page =3D=3D 0. */ - if (mpd->first_page >=3D mpd->next_page) + /* This is necessary when next_pos =3D=3D 0. 
*/ + if (mpd->start_pos >=3D mpd->next_pos) return; =20 mpd->scanned_until_end =3D 0; - index =3D mpd->first_page; - end =3D mpd->next_page - 1; if (invalidate) { ext4_lblk_t start, last; - start =3D index << (PAGE_SHIFT - inode->i_blkbits); - last =3D end << (PAGE_SHIFT - inode->i_blkbits); + start =3D EXT4_B_TO_LBLK(inode, mpd->start_pos); + last =3D mpd->next_pos >> inode->i_blkbits; =20 /* * avoid racing with extent status tree scans made by * ext4_insert_delayed_block() */ down_write(&EXT4_I(inode)->i_data_sem); - ext4_es_remove_extent(inode, start, last - start + 1); + ext4_es_remove_extent(inode, start, last - start); up_write(&EXT4_I(inode)->i_data_sem); } =20 folio_batch_init(&fbatch); - while (index <=3D end) { - nr =3D filemap_get_folios(mapping, &index, end, &fbatch); + index =3D mpd->start_pos >> PAGE_SHIFT; + end =3D mpd->next_pos >> PAGE_SHIFT; + while (index < end) { + nr =3D filemap_get_folios(mapping, &index, end - 1, &fbatch); if (nr =3D=3D 0) break; for (i =3D 0; i < nr; i++) { struct folio *folio =3D fbatch.folios[i]; =20 - if (folio->index < mpd->first_page) + if (folio_pos(folio) < mpd->start_pos) continue; - if (folio_next_index(folio) - 1 > end) + if (folio_next_index(folio) > end) continue; BUG_ON(!folio_test_locked(folio)); BUG_ON(folio_test_writeback(folio)); @@ -2024,7 +2025,7 @@ int ext4_da_get_block_prep(struct inode *inode, secto= r_t iblock, =20 static void mpage_folio_done(struct mpage_da_data *mpd, struct folio *foli= o) { - mpd->first_page +=3D folio_nr_pages(folio); + mpd->start_pos +=3D folio_size(folio); folio_unlock(folio); } =20 @@ -2034,7 +2035,7 @@ static int mpage_submit_folio(struct mpage_da_data *m= pd, struct folio *folio) loff_t size; int err; =20 - BUG_ON(folio->index !=3D mpd->first_page); + WARN_ON_ONCE(folio_pos(folio) !=3D mpd->start_pos); folio_clear_dirty_for_io(folio); /* * We have to be very careful here! Nothing protects writeback path @@ -2446,7 +2447,7 @@ static int mpage_map_and_submit_extent(handle_t *hand= le, * Update on-disk size after IO is submitted. Races with * truncate are avoided by checking i_size under i_data_sem. */ - disksize =3D ((loff_t)mpd->first_page) << PAGE_SHIFT; + disksize =3D mpd->start_pos; if (disksize > READ_ONCE(EXT4_I(inode)->i_disksize)) { int err2; loff_t i_size; @@ -2549,8 +2550,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) struct address_space *mapping =3D mpd->inode->i_mapping; struct folio_batch fbatch; unsigned int nr_folios; - pgoff_t index =3D mpd->first_page; - pgoff_t end =3D mpd->last_page; + pgoff_t index =3D mpd->start_pos >> PAGE_SHIFT; + pgoff_t end =3D mpd->end_pos >> PAGE_SHIFT; xa_mark_t tag; int i, err =3D 0; int blkbits =3D mpd->inode->i_blkbits; @@ -2565,7 +2566,7 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) tag =3D PAGECACHE_TAG_DIRTY; =20 mpd->map.m_len =3D 0; - mpd->next_page =3D index; + mpd->next_pos =3D mpd->start_pos; if (ext4_should_journal_data(mpd->inode)) { handle =3D ext4_journal_start(mpd->inode, EXT4_HT_WRITE_PAGE, bpp); @@ -2596,7 +2597,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) goto out; =20 /* If we can't merge this page, we are done. 
*/ - if (mpd->map.m_len > 0 && mpd->next_page !=3D folio->index) + if (mpd->map.m_len > 0 && + mpd->next_pos !=3D folio_pos(folio)) goto out; =20 if (handle) { @@ -2642,8 +2644,8 @@ static int mpage_prepare_extent_to_map(struct mpage_d= a_data *mpd) } =20 if (mpd->map.m_len =3D=3D 0) - mpd->first_page =3D folio->index; - mpd->next_page =3D folio_next_index(folio); + mpd->start_pos =3D folio_pos(folio); + mpd->next_pos =3D folio_pos(folio) + folio_size(folio); /* * Writeout when we cannot modify metadata is simple. * Just submit the page. For data=3Djournal mode we @@ -2786,18 +2788,18 @@ static int ext4_do_writepages(struct mpage_da_data = *mpd) writeback_index =3D mapping->writeback_index; if (writeback_index) cycled =3D 0; - mpd->first_page =3D writeback_index; - mpd->last_page =3D -1; + mpd->start_pos =3D writeback_index << PAGE_SHIFT; + mpd->end_pos =3D LLONG_MAX; } else { - mpd->first_page =3D wbc->range_start >> PAGE_SHIFT; - mpd->last_page =3D wbc->range_end >> PAGE_SHIFT; + mpd->start_pos =3D wbc->range_start; + mpd->end_pos =3D wbc->range_end; } =20 ext4_io_submit_init(&mpd->io_submit, wbc); retry: if (wbc->sync_mode =3D=3D WB_SYNC_ALL || wbc->tagged_writepages) - tag_pages_for_writeback(mapping, mpd->first_page, - mpd->last_page); + tag_pages_for_writeback(mapping, mpd->start_pos >> PAGE_SHIFT, + mpd->end_pos >> PAGE_SHIFT); blk_start_plug(&plug); =20 /* @@ -2857,7 +2859,7 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } mpd->do_map =3D 1; =20 - trace_ext4_da_write_pages(inode, mpd->first_page, wbc); + trace_ext4_da_write_pages(inode, mpd->start_pos, wbc); ret =3D mpage_prepare_extent_to_map(mpd); if (!ret && mpd->map.m_len) ret =3D mpage_map_and_submit_extent(handle, mpd, @@ -2914,8 +2916,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) blk_finish_plug(&plug); if (!ret && !cycled && wbc->nr_to_write > 0) { cycled =3D 1; - mpd->last_page =3D writeback_index - 1; - mpd->first_page =3D 0; + mpd->end_pos =3D (writeback_index << PAGE_SHIFT) - 1; + mpd->start_pos =3D 0; goto retry; } =20 @@ -2925,7 +2927,7 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) * Set the writeback_index so that range_cyclic * mode will write it back later */ - mapping->writeback_index =3D mpd->first_page; + mapping->writeback_index =3D mpd->start_pos >> PAGE_SHIFT; =20 out_writepages: trace_ext4_writepages_result(inode, wbc, ret, diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index 156908641e68..62d52997b5c6 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -483,15 +483,15 @@ TRACE_EVENT(ext4_writepages, ); =20 TRACE_EVENT(ext4_da_write_pages, - TP_PROTO(struct inode *inode, pgoff_t first_page, + TP_PROTO(struct inode *inode, loff_t start_pos, struct writeback_control *wbc), =20 - TP_ARGS(inode, first_page, wbc), + TP_ARGS(inode, start_pos, wbc), =20 TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) - __field( pgoff_t, first_page ) + __field( loff_t, start_pos ) __field( long, nr_to_write ) __field( int, sync_mode ) ), @@ -499,15 +499,14 @@ TRACE_EVENT(ext4_da_write_pages, TP_fast_assign( __entry->dev =3D inode->i_sb->s_dev; __entry->ino =3D inode->i_ino; - __entry->first_page =3D first_page; + __entry->start_pos =3D start_pos; __entry->nr_to_write =3D wbc->nr_to_write; __entry->sync_mode =3D wbc->sync_mode; ), =20 - TP_printk("dev %d,%d ino %lu first_page %lu nr_to_write %ld " - "sync_mode %d", + TP_printk("dev %d,%d ino %lu start_pos 0x%llx nr_to_write %ld sync_mode %= d", MAJOR(__entry->dev), 
MINOR(__entry->dev), - (unsigned long) __entry->ino, __entry->first_page, + (unsigned long) __entry->ino, __entry->start_pos, __entry->nr_to_write, __entry->sync_mode) ); =20 --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 31EF72D879D; Mon, 7 Jul 2025 14:22:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898180; cv=none; b=jUB8QT8QBPJ7lA5IyOwzWYJl8XVverZvaY/M9vz+KprWT8FLqgQf7/an5pSB2PmC+CyXqrjjy2InsIwcK0SuoGvPtUj9LtdsqJVaC+cyQ4VIzJjF2OSLMeidbFudNNxRnZP4kZ7MQjOAIUjrIrCLfRKvW8wCZhZN4f3W9Yf+S6w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898180; c=relaxed/simple; bh=BYnrg8GfctL0IYNIxCV3Xbhk6Qmu8KmRU6hmRWk3rVg=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=GIkZdxL4aRYIO/BEQ7dkKR7xNCHrgbjY84YE2ObMnel1wJBLXdbkbkzfnE72yfFvCR+MX8Ll07f7NyfWSOy9cC06erGfTIymXwSccrbS/yv39BASQrrhYhrOJbUi/CixC2iK4fdaOVqHczfWdwr8Sg0PKvR6OIjO5dtHEe+R6jw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.93.142]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKZ1Rh3zKHMWF; Mon, 7 Jul 2025 22:22:58 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id A21771A113F; Mon, 7 Jul 2025 22:22:56 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S6; Mon, 07 Jul 2025 22:22:56 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 02/11] ext4: move the calculation of wbc->nr_to_write to mpage_folio_done() Date: Mon, 7 Jul 2025 22:08:05 +0800 Message-ID: <20250707140814.542883-3-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S6 X-Coremail-Antispam: 1UD129KBjvJXoW7Kw1UKr4DCFW3Gw1xuw47CFg_yoW8Jw17pF Z8Ka4kGFW8Zr909Fn7WFsxZr1xta4fGw4UXFW7Kw13XFy5Ar95KF47t34Y9F4ftrWkJ3yI qF48JFy5ua17AFDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUm014x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_Jryl82xGYIkIc2 x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2z4x0 
Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJw A2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq3wAS 0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7IYx2 IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4UM4x0 Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2kIc2 xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkEbVWU JVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67AF67 kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI42IY 6xIIjxv20xvEc7CjxVAFwI0_Gr0_Cr1lIxAIcVCF04k26cxKx2IYs7xG6r1j6r1xMIIF0x vEx4A2jsIE14v26r1j6r4UMIIF0xvEx4A2jsIEc7CjxVAFwI0_Gr0_Gr1UYxBIdaVFxhVj vjDU0xZFpf9x0JUADGOUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi mpage_folio_done() should be a more appropriate place than mpage_submit_folio() for updating the wbc->nr_to_write after we have submitted a fully mapped folio. Preparing to make mpage_submit_folio() allows to submit partially mapped folio that is still under processing. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Reviewed-by: Baokun Li Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 39d59274649c..a88ed7f51afc 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2026,6 +2026,7 @@ int ext4_da_get_block_prep(struct inode *inode, secto= r_t iblock, static void mpage_folio_done(struct mpage_da_data *mpd, struct folio *foli= o) { mpd->start_pos +=3D folio_size(folio); + mpd->wbc->nr_to_write -=3D folio_nr_pages(folio); folio_unlock(folio); } =20 @@ -2056,8 +2057,6 @@ static int mpage_submit_folio(struct mpage_da_data *m= pd, struct folio *folio) !ext4_verity_in_progress(mpd->inode)) len =3D size & (len - 1); err =3D ext4_bio_write_folio(&mpd->io_submit, folio, len); - if (!err) - mpd->wbc->nr_to_write -=3D folio_nr_pages(folio); =20 return err; } --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 77F9E2D8DA8; Mon, 7 Jul 2025 14:22:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898182; cv=none; b=D5NEkk1hz3GIqT0S6hCMZj7BDxM0uMv57RbB3e7xJqfBCgintvwwGeh8H2mlq3Nys+RmTR5dq5zsSML4E/oUCoH7zHeM3oLWLyZ5/mKBqzADjOi7iwzO/o7yAoxxeltlHbOGXE7YKPBflhBrgj0d7lzmXWxqkbpjTdT7sQrFdZU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898182; c=relaxed/simple; bh=/8a14RKx+/vu23OotjhAuMT/bhp+m/jgRNfnpIWwJx8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=UGOmN01ep6a8h8x22UZflmhT9V5cS6RQsAHYIfjY8uZcCx8xIeFUzredwbPSiZtvscKp/iOeSHfZTu5vT6+6BIqNyzjASP+juEil2GAC6kbl6pPkyYnr845aJnHcRvN0XnMZDlwlFLKIVnK7MPn9XIhn6sZJXriMdWUIz/d3WJc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from 
mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKZ5KWZzKHMhd; Mon, 7 Jul 2025 22:22:58 +0800 (CST)
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
 ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org,
 jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
 libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 03/11] ext4: fix stale data if we bail out of the extents mapping loop
Date: Mon, 7 Jul 2025 22:08:06 +0800
Message-ID: <20250707140814.542883-4-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

During the process of writing back folios, if mpage_map_and_submit_extent()
exits the extent mapping loop due to an ENOSPC or ENOMEM error, it may
result in stale data or filesystem inconsistency in environments where the
block size is smaller than the folio size.

When mapping a discontinuous folio in mpage_map_and_submit_extent(), some
buffers may have already been mapped. If we exit the mapping loop
prematurely, the folio data within the mapped range will not be written
back, and the file's disk size will not be updated. Once the transaction
that includes this range of extents is committed, this can lead to stale
data or filesystem inconsistency.

Fix this by submitting the partially mapped folio that is currently being
processed.
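As a back-of-the-envelope illustration of the bookkeeping described above,
here is a minimal user-space sketch (not kernel code and not part of this
patch; the 16K folio, 1K block size, and the number of blocks mapped before
the allocation failure are hypothetical values chosen for illustration):

#include <stdio.h>

int main(void)
{
	const long blksize = 1024;	/* hypothetical 1K block size */
	const long folio_blocks = 16;	/* hypothetical 16K folio */
	const long folio_pos = 0;	/* folio starts at file offset 0 */
	const long mapped = 6;		/* blocks mapped before -ENOSPC */

	/* Only the already-mapped prefix can safely be written out. */
	long submit_len = mapped * blksize;

	/*
	 * The writeback cursor (start_pos) may only advance to the mapped
	 * boundary; the unmapped tail stays dirty and is retried later.
	 */
	long new_start_pos = folio_pos + submit_len;

	printf("submit %ld bytes, keep %ld bytes dirty, start_pos -> %ld\n",
	       submit_len, (folio_blocks - mapped) * blksize, new_start_pos);
	return 0;
}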
Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 51 ++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 50 insertions(+), 1 deletion(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a88ed7f51afc..a59d148b9185 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2361,6 +2361,47 @@ static int mpage_map_one_extent(handle_t *handle, st= ruct mpage_da_data *mpd) return 0; } =20 +/* + * This is used to submit mapped buffers in a single folio that is not ful= ly + * mapped for various reasons, such as insufficient space or journal credi= ts. + */ +static int mpage_submit_partial_folio(struct mpage_da_data *mpd) +{ + struct inode *inode =3D mpd->inode; + struct folio *folio; + loff_t pos; + int ret; + + folio =3D filemap_get_folio(inode->i_mapping, + mpd->start_pos >> PAGE_SHIFT); + if (IS_ERR(folio)) + return PTR_ERR(folio); + /* + * The mapped position should be within the current processing folio + * but must not be the folio start position. + */ + pos =3D ((loff_t)mpd->map.m_lblk) << inode->i_blkbits; + if (WARN_ON_ONCE((folio_pos(folio) =3D=3D pos) || + !folio_contains(folio, pos >> PAGE_SHIFT))) + return -EINVAL; + + ret =3D mpage_submit_folio(mpd, folio); + if (ret) + goto out; + /* + * Update start_pos to prevent this folio from being released in + * mpage_release_unused_pages(), it will be reset to the aligned folio + * pos when this folio is written again in the next round. Additionally, + * do not update wbc->nr_to_write here, as it will be updated once the + * entire folio has finished processing. + */ + mpd->start_pos =3D pos; +out: + folio_unlock(folio); + folio_put(folio); + return ret; +} + /* * mpage_map_and_submit_extent - map extent starting at mpd->lblk of length * mpd->len and submit pages underlying it for IO @@ -2411,8 +2452,16 @@ static int mpage_map_and_submit_extent(handle_t *han= dle, */ if ((err =3D=3D -ENOMEM) || (err =3D=3D -ENOSPC && ext4_count_free_clusters(sb))) { - if (progress) + /* + * We may have already allocated extents for + * some bhs inside the folio, issue the + * corresponding data to prevent stale data. 
+ */ + if (progress) { + if (mpage_submit_partial_folio(mpd)) + goto invalidate_dirty_pages; goto update_disksize; + } return err; } ext4_msg(sb, KERN_CRIT, --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A91622D8DAA; Mon, 7 Jul 2025 14:22:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898181; cv=none; b=cmuThspm3AByi6mGSC4Tbzv4uAdz3YuNAoLL0rog98bwxqlUmbdtSv5adX+I/tTtsecWIkCrqEIOrxqdU3I75R9L8ujZ2HL+dADygEtUiFAztEX+fUb/QP/q6idU0fRNhYWDrAmJtluGsbmqI5csHgF+STqo6pkITLWUfcW2jAk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898181; c=relaxed/simple; bh=XO6d4bPTCkVk1lbDioUFFaw4Np8AxFS9/x8a88uVd08=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=cSDkJ+KQfR2tLAUnfzwhaQ0syqqGQZgxdfcMX1VAZeksSHHuB2uu1cOKN66XakfQipD5S8VyepoJX/xc6Ifaj/aOrgQ65+zuHwXJthmbz1oFzd1vyj+B/8IYiwnpoL8oIL/Bk5t8v1nuQfIs9xl9qvzKy4Fw2sPU1tzdGtg4lVI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKb2NjlzKHMhj; Mon, 7 Jul 2025 22:22:59 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id C4C8E1A1A74; Mon, 7 Jul 2025 22:22:57 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S8; Mon, 07 Jul 2025 22:22:57 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 04/11] ext4: refactor the block allocation process of ext4_page_mkwrite() Date: Mon, 7 Jul 2025 22:08:07 +0800 Message-ID: <20250707140814.542883-5-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S8 X-Coremail-Antispam: 1UD129KBjvJXoWxAF15KF4DGFW5WFyUXFWkJFb_yoWruF4Upr y3Kr95ur47u34DWFs3WF4DZF13Ka4vgrWUGFyxGr1fZ3W3trnxKF4rt3WvyF4UtrW3Xan2 qF4UAFyUu3WjgrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmI14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 
z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUJVWUCwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Cr0_Gr1UMIIF0xvE42xK8VAvwI8IcIk0rVWUJVWUCw CI42IY6I8E87Iv67AKxVWUJVW8JwCI42IY6I8E87Iv6xkF7I0E14v26r4j6r4UJbIYCTnI WIevJa73UjIFyTuYvjfUriihUUUUU X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi The block allocation process and error handling in ext4_page_mkwrite() is complex now. Refactor it by introducing a new helper function, ext4_block_page_mkwrite(). It will call ext4_block_write_begin() to allocate blocks instead of directly calling block_page_mkwrite(). Preparing to implement retry logic in a subsequent patch to address situations where the reserved journal credits are insufficient. Additionally, this modification will help prevent potential deadlocks that may occur when waiting for folio writeback while holding the transaction handle. Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 95 ++++++++++++++++++++++++++----------------------- 1 file changed, 50 insertions(+), 45 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index a59d148b9185..e73d5379b8f0 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -6605,6 +6605,53 @@ static int ext4_bh_unmapped(handle_t *handle, struct= inode *inode, return !buffer_mapped(bh); } =20 +static int ext4_block_page_mkwrite(struct inode *inode, struct folio *foli= o, + get_block_t get_block) +{ + handle_t *handle; + loff_t size; + unsigned long len; + int ret; + + handle =3D ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, + ext4_writepage_trans_blocks(inode)); + if (IS_ERR(handle)) + return PTR_ERR(handle); + + folio_lock(folio); + size =3D i_size_read(inode); + /* Page got truncated from under us? */ + if (folio->mapping !=3D inode->i_mapping || folio_pos(folio) > size) { + ret =3D -EFAULT; + goto out_error; + } + + len =3D folio_size(folio); + if (folio_pos(folio) + len > size) + len =3D size - folio_pos(folio); + + ret =3D ext4_block_write_begin(handle, folio, 0, len, get_block); + if (ret) + goto out_error; + + if (!ext4_should_journal_data(inode)) { + block_commit_write(folio, 0, len); + folio_mark_dirty(folio); + } else { + ret =3D ext4_journal_folio_buffers(handle, folio, len); + if (ret) + goto out_error; + } + ext4_journal_stop(handle); + folio_wait_stable(folio); + return ret; + +out_error: + folio_unlock(folio); + ext4_journal_stop(handle); + return ret; +} + vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) { struct vm_area_struct *vma =3D vmf->vma; @@ -6616,8 +6663,7 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) struct file *file =3D vma->vm_file; struct inode *inode =3D file_inode(file); struct address_space *mapping =3D inode->i_mapping; - handle_t *handle; - get_block_t *get_block; + get_block_t *get_block =3D ext4_get_block; int retries =3D 0; =20 if (unlikely(IS_IMMUTABLE(inode))) @@ -6685,46 +6731,9 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) /* OK, we need to fill the hole... 
*/ if (ext4_should_dioread_nolock(inode)) get_block =3D ext4_get_block_unwritten; - else - get_block =3D ext4_get_block; retry_alloc: - handle =3D ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, - ext4_writepage_trans_blocks(inode)); - if (IS_ERR(handle)) { - ret =3D VM_FAULT_SIGBUS; - goto out; - } - /* - * Data journalling can't use block_page_mkwrite() because it - * will set_buffer_dirty() before do_journal_get_write_access() - * thus might hit warning messages for dirty metadata buffers. - */ - if (!ext4_should_journal_data(inode)) { - err =3D block_page_mkwrite(vma, vmf, get_block); - } else { - folio_lock(folio); - size =3D i_size_read(inode); - /* Page got truncated from under us? */ - if (folio->mapping !=3D mapping || folio_pos(folio) > size) { - ret =3D VM_FAULT_NOPAGE; - goto out_error; - } - - len =3D folio_size(folio); - if (folio_pos(folio) + len > size) - len =3D size - folio_pos(folio); - - err =3D ext4_block_write_begin(handle, folio, 0, len, - ext4_get_block); - if (!err) { - ret =3D VM_FAULT_SIGBUS; - if (ext4_journal_folio_buffers(handle, folio, len)) - goto out_error; - } else { - folio_unlock(folio); - } - } - ext4_journal_stop(handle); + /* Start journal and allocate blocks */ + err =3D ext4_block_page_mkwrite(inode, folio, get_block); if (err =3D=3D -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) goto retry_alloc; out_ret: @@ -6733,8 +6742,4 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) filemap_invalidate_unlock_shared(mapping); sb_end_pagefault(inode->i_sb); return ret; -out_error: - folio_unlock(folio); - ext4_journal_stop(handle); - goto out; } --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout11.his.huawei.com (dggsgout11.his.huawei.com [45.249.212.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1794F2D8DBA; Mon, 7 Jul 2025 14:22:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898182; cv=none; b=lZHELgtqNqZQdezA4riIK3RdANwbFCpekJfcZcbt1m9bkduPMb8BXj3Scn+XhSCqyVBpus/DZJXDhFYcGhKbrr889dNjpxvZ8kc9XX3Ch8dklMUYDG5EYuwAtGJYNsIJjf2ARxICpXtfvuQpXOzahvHTvkAM5yPKVk6+Fe7d8PQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898182; c=relaxed/simple; bh=mSpaELxxjIRKdvL366xC/wHU3Z4XsTkmkVgQr6zu/pc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=VJhBeET/6U6UD67h8QMz4RvnwVV1DCWgr8eZU/KW5lC9b40HJsLgtRjp7sTjrX+bRq86nrYpleMPPiMMrIU2jsfoCsr9yXhGqFTjCHN7JJrmiTpHvqLCfhEJtWDhx3NQySS/4TkfGuR+EbCaLs3tu/TUIMTOfWkknFIGDUbxhuw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=none smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout11.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKb3yGhzYQv6V; Mon, 7 Jul 2025 22:22:59 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 5D1C71A0AF1; Mon, 7 Jul 2025 22:22:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id 
_Ch0CgBnxyQ22GtoNazLAw--.46745S9; Mon, 07 Jul 2025 22:22:58 +0800 (CST)
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
 tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz,
 ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org,
 jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com,
 libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 05/11] ext4: restart handle if credits are insufficient when allocating blocks
Date: Mon, 7 Jul 2025 22:08:08 +0800
Message-ID: <20250707140814.542883-6-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

Now that ext4 supports large folios, writing back a sufficiently large and
discontinuous folio may consume a significant number of journal credits,
placing considerable strain on the journal. For example, in a 20GB
filesystem with a 1K block size and a 1MB journal, writing back a 2MB folio
could require thousands of credits in the worst case (when each block is
discontinuous and distributed across different block groups), potentially
exceeding the journal size. This issue can also occur in ext4_write_begin()
and ext4_page_mkwrite() when delalloc is not enabled.

Fix this by ensuring that there are sufficient journal credits before
allocating an extent in mpage_map_one_extent() and
ext4_block_write_begin(). If there are not enough credits, return -EAGAIN,
exit the current mapping loop, restart a new handle and a new transaction,
and allocate blocks for this folio again in the next iteration.
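As a rough, illustrative calculation of the worst case above (a user-space
sketch, not kernel code; the per-block metadata cost is an assumed ballpark
figure, not the exact ext4 credit formula):

#include <stdio.h>

int main(void)
{
	const long folio_size = 2L * 1024 * 1024;	/* 2M large folio */
	const long block_size = 1024;			/* 1K filesystem blocks */
	const long blocks = folio_size / block_size;	/* 2048 data blocks */

	/*
	 * Assume a handful of journalled metadata blocks (block bitmap,
	 * group descriptor, extent tree blocks, inode) per allocated block
	 * when every block lands in a different block group; 5 is an
	 * illustrative assumption, not a value taken from the ext4 code.
	 */
	const long credits_per_block = 5;

	/* ~10240 credits, while a 1MB journal holds only 1024 1K blocks. */
	printf("worst-case credits for one folio: ~%ld\n",
	       blocks * credits_per_block);
	return 0;
}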
Suggested-by: Jan Kara Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 41 ++++++++++++++++++++++++++++++++++++----- 1 file changed, 36 insertions(+), 5 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index e73d5379b8f0..10d4f86a5c15 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -877,6 +877,26 @@ static void ext4_update_bh_state(struct buffer_head *b= h, unsigned long flags) } while (unlikely(!try_cmpxchg(&bh->b_state, &old_state, new_state))); } =20 +/* + * Make sure that the current journal transaction has enough credits to map + * one extent. Return -EAGAIN if it cannot extend the current running + * transaction. + */ +static inline int ext4_journal_ensure_extent_credits(handle_t *handle, + struct inode *inode) +{ + int credits; + int ret; + + /* Called from ext4_da_write_begin() which has no handle started? */ + if (!handle) + return 0; + + credits =3D ext4_chunk_trans_blocks(inode, 1); + ret =3D __ext4_journal_ensure_credits(handle, credits, credits, 0); + return ret <=3D 0 ? ret : -EAGAIN; +} + static int _ext4_get_block(struct inode *inode, sector_t iblock, struct buffer_head *bh, int flags) { @@ -1175,7 +1195,9 @@ int ext4_block_write_begin(handle_t *handle, struct f= olio *folio, clear_buffer_new(bh); if (!buffer_mapped(bh)) { WARN_ON(bh->b_size !=3D blocksize); - err =3D get_block(inode, block, bh, 1); + err =3D ext4_journal_ensure_extent_credits(handle, inode); + if (!err) + err =3D get_block(inode, block, bh, 1); if (err) break; if (buffer_new(bh)) { @@ -1374,8 +1396,9 @@ static int ext4_write_begin(struct file *file, struct= address_space *mapping, ext4_orphan_del(NULL, inode); } =20 - if (ret =3D=3D -ENOSPC && - ext4_should_retry_alloc(inode->i_sb, &retries)) + if (ret =3D=3D -EAGAIN || + (ret =3D=3D -ENOSPC && + ext4_should_retry_alloc(inode->i_sb, &retries))) goto retry_journal; folio_put(folio); return ret; @@ -2323,6 +2346,11 @@ static int mpage_map_one_extent(handle_t *handle, st= ruct mpage_da_data *mpd) int get_blocks_flags; int err, dioread_nolock; =20 + /* Make sure transaction has enough credits for this extent */ + err =3D ext4_journal_ensure_extent_credits(handle, inode); + if (err < 0) + return err; + trace_ext4_da_write_pages_extent(inode, map); /* * Call ext4_map_blocks() to allocate any delayed allocation blocks, or @@ -2450,7 +2478,7 @@ static int mpage_map_and_submit_extent(handle_t *hand= le, * In the case of ENOSPC, if ext4_count_free_blocks() * is non-zero, a commit should free up blocks. */ - if ((err =3D=3D -ENOMEM) || + if ((err =3D=3D -ENOMEM) || (err =3D=3D -EAGAIN) || (err =3D=3D -ENOSPC && ext4_count_free_clusters(sb))) { /* * We may have already allocated extents for @@ -2956,6 +2984,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) ret =3D 0; continue; } + if (ret =3D=3D -EAGAIN) + ret =3D 0; /* Fatal error - ENOMEM, EIO... 
*/ if (ret) break; @@ -6734,7 +6764,8 @@ vm_fault_t ext4_page_mkwrite(struct vm_fault *vmf) retry_alloc: /* Start journal and allocate blocks */ err =3D ext4_block_page_mkwrite(inode, folio, get_block); - if (err =3D=3D -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)) + if (err =3D=3D -EAGAIN || + (err =3D=3D -ENOSPC && ext4_should_retry_alloc(inode->i_sb, &retries)= )) goto retry_alloc; out_ret: ret =3D vmf_fs_error(err); --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 09A8C2D94B0; Mon, 7 Jul 2025 14:23:00 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898183; cv=none; b=KpoRbIo+G2qNsrYg/TlNgySZ9fQipNhyqi5vhVTMIfFB53wv9M/5+/mgM1jDMmgxosHGoxGzeBx9FB+tUlyIdsGVKTtQXEBKlseHCVw5wzQEQYhTCMmGYM6JjjBMc09vY+3lT/KVlPZYGTeIy1WFvayqSUtdb2r21eExqVXxM6w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898183; c=relaxed/simple; bh=e5AlF55mgUXWP5hKltKYm9ZfVlK26l3HHPO2XHQGZBU=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=YxRL1+OKrQx/FLCmK8r7hweMOfBLFOaYj0k8X6nLbKCMlmo2lh0ppiMAdC/rXN9PCqA9D/OCgo5urosae1UfdRtmVcs12sOzHIXLQ/xNuHDAvzH15gSpXkuihqAqUvMmDw/2r1bq7dIg/o7lD9i/3HI3sSeIRuApY/QzKO8OWOk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKc3LWDzKHMlf; Mon, 7 Jul 2025 22:23:00 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id E6B8B1A1A82; Mon, 7 Jul 2025 22:22:58 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S10; Mon, 07 Jul 2025 22:22:58 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 06/11] ext4: enhance tracepoints during the folios writeback Date: Mon, 7 Jul 2025 22:08:09 +0800 Message-ID: <20250707140814.542883-7-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S10 X-Coremail-Antispam: 1UD129KBjvJXoWxuF17Aw47Xr17Xr45Zw1fWFg_yoWrAry7pF WqkF95Wrs7Zw4Y93WfZa1UZr4FvFykur47tr13WFyDXw1xAr1kKa17KryqyFyjyrZ2kryI qF4qk3sxC3WxWrDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 
9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi After mpage_map_and_submit_extent() supports restarting handle if credits are insufficient during allocating blocks, it is more likely to exit the current mapping iteration and continue to process the current processing partially mapped folio again. The existing tracepoints are not sufficient to track this situation, so enhance the tracepoints to track the writeback position and the return value before and after submitting the folios. Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 5 ++++- include/trace/events/ext4.h | 42 ++++++++++++++++++++++++++++++++----- 2 files changed, 41 insertions(+), 6 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 10d4f86a5c15..51effbad90e5 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2935,7 +2935,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } mpd->do_map =3D 1; =20 - trace_ext4_da_write_pages(inode, mpd->start_pos, wbc); + trace_ext4_da_write_folios_start(inode, mpd->start_pos, + mpd->next_pos, wbc); ret =3D mpage_prepare_extent_to_map(mpd); if (!ret && mpd->map.m_len) ret =3D mpage_map_and_submit_extent(handle, mpd, @@ -2973,6 +2974,8 @@ static int ext4_do_writepages(struct mpage_da_data *m= pd) } else ext4_put_io_end(mpd->io_submit.io_end); mpd->io_submit.io_end =3D NULL; + trace_ext4_da_write_folios_end(inode, mpd->start_pos, + mpd->next_pos, wbc, ret); =20 if (ret =3D=3D -ENOSPC && sbi->s_journal) { /* diff --git a/include/trace/events/ext4.h b/include/trace/events/ext4.h index 62d52997b5c6..845451077c41 100644 --- a/include/trace/events/ext4.h +++ b/include/trace/events/ext4.h @@ -482,16 +482,17 @@ TRACE_EVENT(ext4_writepages, (unsigned long) __entry->writeback_index) ); =20 -TRACE_EVENT(ext4_da_write_pages, - TP_PROTO(struct inode *inode, loff_t start_pos, +TRACE_EVENT(ext4_da_write_folios_start, + TP_PROTO(struct inode *inode, loff_t start_pos, loff_t next_pos, struct writeback_control *wbc), =20 - TP_ARGS(inode, start_pos, wbc), + TP_ARGS(inode, start_pos, next_pos, wbc), =20 TP_STRUCT__entry( __field( dev_t, dev ) __field( ino_t, ino ) __field( loff_t, start_pos ) + __field( loff_t, next_pos ) __field( long, nr_to_write ) __field( int, sync_mode ) ), @@ -500,16 +501,47 @@ TRACE_EVENT(ext4_da_write_pages, __entry->dev =3D inode->i_sb->s_dev; __entry->ino =3D inode->i_ino; __entry->start_pos =3D start_pos; + __entry->next_pos =3D next_pos; __entry->nr_to_write =3D wbc->nr_to_write; __entry->sync_mode =3D 
wbc->sync_mode; ), =20 - TP_printk("dev %d,%d ino %lu start_pos 0x%llx nr_to_write %ld sync_mode %= d", + TP_printk("dev %d,%d ino %lu start_pos 0x%llx next_pos 0x%llx nr_to_write= %ld sync_mode %d", MAJOR(__entry->dev), MINOR(__entry->dev), - (unsigned long) __entry->ino, __entry->start_pos, + (unsigned long) __entry->ino, __entry->start_pos, __entry->next_pos, __entry->nr_to_write, __entry->sync_mode) ); =20 +TRACE_EVENT(ext4_da_write_folios_end, + TP_PROTO(struct inode *inode, loff_t start_pos, loff_t next_pos, + struct writeback_control *wbc, int ret), + + TP_ARGS(inode, start_pos, next_pos, wbc, ret), + + TP_STRUCT__entry( + __field( dev_t, dev ) + __field( ino_t, ino ) + __field( loff_t, start_pos ) + __field( loff_t, next_pos ) + __field( long, nr_to_write ) + __field( int, ret ) + ), + + TP_fast_assign( + __entry->dev =3D inode->i_sb->s_dev; + __entry->ino =3D inode->i_ino; + __entry->start_pos =3D start_pos; + __entry->next_pos =3D next_pos; + __entry->nr_to_write =3D wbc->nr_to_write; + __entry->ret =3D ret; + ), + + TP_printk("dev %d,%d ino %lu start_pos 0x%llx next_pos 0x%llx nr_to_write= %ld ret %d", + MAJOR(__entry->dev), MINOR(__entry->dev), + (unsigned long) __entry->ino, __entry->start_pos, __entry->next_pos, + __entry->nr_to_write, __entry->ret) +); + TRACE_EVENT(ext4_da_write_pages_extent, TP_PROTO(struct inode *inode, struct ext4_map_blocks *map), =20 --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 92A082D979B; Mon, 7 Jul 2025 14:23:01 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898183; cv=none; b=r/WEJqJqtyDaqaxDV5K1plOTxIIvoaf2cSNrB0Kchl3WTBAQS4XuAGLB2Q/f7t3NVRtKgwQh0kBbITaE/fvBPKCXRgm23K5x/asGpGlOZECavKmORuZg7/Y7T2DNKypPzvT9h5bibSaAG3TK8n1f/WyYnsUxkPHna0t4e/eYKRU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898183; c=relaxed/simple; bh=Tlr4jkPGncwYUzYVHVlanVU+MejPUHbRbaB44ARJM34=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kvO7Q57TPJp8r7SHlfbox7pCEmLXv5n+UU6QMIHfrSZSL1zJb7WYUEShBp8ZfAdV16n4Jir6XDhxNufD3CDpC0A7+KW3gksqT4BwFn7gWJdOwjhdbK3BI0nI8se8eAwRUZ1kBI8DejiQWTPtN4kDpnWOQPW58KHBXXghTPnVhPM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.216]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKd0VX9zKHMkt; Mon, 7 Jul 2025 22:23:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 8492D1A1A87; Mon, 7 Jul 2025 22:22:59 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S11; Mon, 07 Jul 2025 22:22:59 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, 
ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 07/11] ext4: correct the reserved credits for extent conversion Date: Mon, 7 Jul 2025 22:08:10 +0800 Message-ID: <20250707140814.542883-8-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S11 X-Coremail-Antispam: 1UD129KBjvJXoW7Ar4kAw4fJFyDJFyxZr1fZwb_yoW8Gw45pF nxGFykWF18u348uana93W7AF1fCayxC3yUXF4fCw1UXa98GryxKr1qgw1rtF1UJrWxJrWr ZF47CryUu3W3Z3DanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI kIc2x26xkF7I0E14v26ryj6s0DM28lY4IEw2IIxxk0rwA2F7IY1VAKz4vEj48ve4kI8wA2 z4x0Y4vE2Ix0cI8IcVAFwI0_tr0E3s1l84ACjcxK6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F 4UJwA2z4x0Y4vEx4A2jsIE14v26rxl6s0DM28EF7xvwVC2z280aVCY1x0267AKxVW0oVCq 3wAS0I0E0xvYzxvE52x082IY62kv0487Mc02F40EFcxC0VAKzVAqx4xG6I80ewAv7VC0I7 IYx2IY67AKxVWUGVWUXwAv7VC2z280aVAFwI0_Jr0_Gr1lOx8S6xCaFVCjc4AY6r1j6r4U M4x0Y48IcxkI7VAKI48JM4x0x7Aq67IIx4CEVc8vx2IErcIFxwACI402YVCY1x02628vn2 kIc2xKxwCY1x0262kKe7AKxVWUtVW8ZwCF04k20xvY0x0EwIxGrwCFx2IqxVCFs4IE7xkE bVWUJVW8JwC20s026c02F40E14v26r1j6r18MI8I3I0E7480Y4vE14v26r106r1rMI8E67 AF67kF1VAFwI0_Jw0_GFylIxkGc2Ij64vIr41lIxAIcVC0I7IYx2IY67AKxVWUCVW8JwCI 42IY6xIIjxv20xvEc7CjxVAFwI0_Gr1j6F4UJwCI42IY6xAIw20EY4v20xvaj40_Jr0_JF 4lIxAIcVC2z280aVAFwI0_Jr0_Gr1lIxAIcVC2z280aVCY1x0267AKxVW8Jr0_Cr1UYxBI daVFxhVjvjDU0xZFpf9x0JUWMKtUUUUU= X-CM-SenderInfo: d1lo6xhdqjqx5xdzvxpfor3voofrz/ Content-Type: text/plain; charset="utf-8" From: Zhang Yi Now, we reserve journal credits for converting extents in only one page to written state when the I/O operation is complete. This is insufficient when large folio is enabled. Fix this by reserving credits for converting up to one extent per block in the largest 2MB folio, this calculation should only involve extents index and leaf blocks, so it should not estimate too many credits. Fixes: 7ac67301e82f ("ext4: enable large folio for regular file") Signed-off-by: Zhang Yi Reviewed-by: Jan Kara Reviewed-by: Baokun Li Tested-by: Linux Kernel Functional Testing --- fs/ext4/inode.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c index 51effbad90e5..3ed4bc6c02f8 100644 --- a/fs/ext4/inode.c +++ b/fs/ext4/inode.c @@ -2849,12 +2849,12 @@ static int ext4_do_writepages(struct mpage_da_data = *mpd) mpd->journalled_more_data =3D 0; =20 if (ext4_should_dioread_nolock(inode)) { + int bpf =3D ext4_journal_blocks_per_folio(inode); /* * We may need to convert up to one extent per block in - * the page and we may dirty the inode. + * the folio and we may dirty the inode. 
*/ - rsv_blocks =3D 1 + ext4_chunk_trans_blocks(inode, - PAGE_SIZE >> inode->i_blkbits); + rsv_blocks =3D 1 + ext4_ext_index_trans_blocks(inode, bpf); } =20 if (wbc->range_start =3D=3D 0 && wbc->range_end =3D=3D LLONG_MAX) --=20 2.46.1 From nobody Tue Oct 7 19:22:15 2025 Received: from dggsgout12.his.huawei.com (dggsgout12.his.huawei.com [45.249.212.56]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E69072DA747; Mon, 7 Jul 2025 14:23:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.56 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898185; cv=none; b=frJ8PhMMGt41yAANzCMxsfYwyMDLkVG7o5G351+xUBuPVZW3T/JMV/DT7UfJnDAHpd6or6yW0LY7P54yYRbNbMW+g70blla7zhYjiguDIWtxbZdWScNprkG6bZLuEcDYfuHRBID0apZwqciFXhW4pYPxxuyEK3y7SRudWLCyMSY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1751898185; c=relaxed/simple; bh=R2asvLQhBdfY8dbrbq7YimBkMAdxaJcr/EZrATrWVw8=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ilR+8GM62bPRaEWvOaSJZfYusqt9jr/frWOtvVKDsiX0F3s60nWRz37Q3HP8PZGQnq406ozmIpgzr6C/BvEyjdjlZ2ZKzBneLiFPKXZnbH99+QQPgM5W426METgcVRtoJoPMnyJgZgxaE9ji3NLM/AJgwMAIpBFhLi2OZsyJXzI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com; spf=pass smtp.mailfrom=huaweicloud.com; arc=none smtp.client-ip=45.249.212.56 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=huaweicloud.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huaweicloud.com Received: from mail.maildlp.com (unknown [172.19.163.235]) by dggsgout12.his.huawei.com (SkyGuard) with ESMTPS id 4bbRKd4gWkzKHMZR; Mon, 7 Jul 2025 22:23:01 +0800 (CST) Received: from mail02.huawei.com (unknown [10.116.40.252]) by mail.maildlp.com (Postfix) with ESMTP id 1D58D1A0AE1; Mon, 7 Jul 2025 22:23:00 +0800 (CST) Received: from huaweicloud.com (unknown [10.175.112.188]) by APP3 (Coremail) with SMTP id _Ch0CgBnxyQ22GtoNazLAw--.46745S12; Mon, 07 Jul 2025 22:22:59 +0800 (CST) From: Zhang Yi To: linux-ext4@vger.kernel.org Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu, adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com, sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com, yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com, yukuai3@huawei.com, yangerkun@huawei.com Subject: [PATCH v4 08/11] ext4: reserved credits for one extent during the folio writeback Date: Mon, 7 Jul 2025 22:08:11 +0800 Message-ID: <20250707140814.542883-9-yi.zhang@huaweicloud.com> X-Mailer: git-send-email 2.46.1 In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com> References: <20250707140814.542883-1-yi.zhang@huaweicloud.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CM-TRANSID: _Ch0CgBnxyQ22GtoNazLAw--.46745S12 X-Coremail-Antispam: 1UD129KBjvJXoWxZry5KryruFyDAFy7Cw1Dtrb_yoW5WryDpF W3CrWkWr17WFyUuF4xWa1xZF1fWa48C3yUJr9xKFn7Wa98Z34IgFn8KayY9FW5KrWxGa4j vF45C34Duay2yaDanT9S1TB71UUUUU7qnTZGkaVYY2UrUUUUjbIjqfuFe4nvWSU5nxnvy2 9KBjDU0xBIdaVrnRJUUUmS14x267AKxVWrJVCq3wAFc2x0x2IEx4CE42xK8VAvwI8IcIk0 rVWrJVCq3wAFIxvE14AKwVWUJVWUGwA2048vs2IY020E87I2jVAFwI0_JF0E3s1l82xGYI 
From nobody Tue Oct 7 19:22:15 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com,
	sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com,
	yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com,
	yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 08/11] ext4: reserved credits for one extent during the folio writeback
Date: Mon, 7 Jul 2025 22:08:11 +0800
Message-ID: <20250707140814.542883-9-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

After ext4 supports large folios, reserving journal credits for one
maximum-ordered folio based on the worst-case scenario during the
writeback process can easily exceed the maximum transaction credits.
Additionally, reserving journal credits for one page is also no longer
appropriate.

Currently, the folio writeback process can either extend the journal
credits or initiate a new transaction if the currently reserved journal
credits are insufficient. Therefore, it can be modified to reserve
credits for only one extent at the outset. In most cases involving
continuous mappings, these credits are adequate, and we may only need
to perform some basic credit expansion. However, in extreme cases where
the block size and folio size differ significantly, or when the folios
are sufficiently discontinuous, it may be necessary to start a new
transaction and resubmit the folios.

Suggested-by: Jan Kara
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
Tested-by: Linux Kernel Functional Testing
---
 fs/ext4/inode.c | 25 ++++++++-----------------
 1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 3ed4bc6c02f8..d9d12529b7fc 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -2547,21 +2547,6 @@ static int mpage_map_and_submit_extent(handle_t *handle,
 	return err;
 }
 
-/*
- * Calculate the total number of credits to reserve for one writepages
- * iteration. This is called from ext4_writepages(). We map an extent of
- * up to MAX_WRITEPAGES_EXTENT_LEN blocks and then we go on and finish mapping
- * the last partial page. So in total we can map MAX_WRITEPAGES_EXTENT_LEN +
- * bpp - 1 blocks in bpp different extents.
- */
-static int ext4_da_writepages_trans_blocks(struct inode *inode)
-{
-	int bpp = ext4_journal_blocks_per_folio(inode);
-
-	return ext4_meta_trans_blocks(inode,
-				MAX_WRITEPAGES_EXTENT_LEN + bpp - 1, bpp);
-}
-
 static int ext4_journal_folio_buffers(handle_t *handle, struct folio *folio,
 				      size_t len)
 {
@@ -2918,8 +2903,14 @@ static int ext4_do_writepages(struct mpage_da_data *mpd)
 	 * not supported by delalloc.
 	 */
 	BUG_ON(ext4_should_journal_data(inode));
-	needed_blocks = ext4_da_writepages_trans_blocks(inode);
-
+	/*
+	 * Calculate the number of credits needed to reserve for one
+	 * extent of up to MAX_WRITEPAGES_EXTENT_LEN blocks. It will
+	 * attempt to extend the transaction or start a new iteration
+	 * if the reserved credits are insufficient.
+	 */
+	needed_blocks = ext4_chunk_trans_blocks(inode,
+						MAX_WRITEPAGES_EXTENT_LEN);
 	/* start a new transaction */
 	handle = ext4_journal_start_with_reserve(inode, EXT4_HT_WRITE_PAGE,
 						 needed_blocks, rsv_blocks);
-- 
2.46.1
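The control flow described above (reserve for one extent up front, extend
the handle when a folio needs more, restart when extension fails) can be
sketched in plain userspace C as follows. The per-folio costs, the initial
reservation, the per-transaction cap, and the txn_* helpers are all made up
for illustration and are not ext4 or jbd2 APIs.

/*
 * Toy model of "reserve for one extent, extend or restart on demand".
 * Nothing here is real kernel code; it only mirrors the strategy the
 * commit message describes.
 */
#include <stdbool.h>
#include <stdio.h>

#define FAKE_JOURNAL_MAX_CREDITS 128	/* made-up per-transaction cap */

struct txn {
	int credits;	/* credits currently reserved by this transaction */
};

/* Try to grow the reservation; fail when the made-up cap is hit. */
static bool txn_extend(struct txn *t, int more)
{
	if (t->credits + more > FAKE_JOURNAL_MAX_CREDITS)
		return false;
	t->credits += more;
	return true;
}

int main(void)
{
	/* Made-up per-folio credit costs for one writeback pass. */
	int folio_cost[] = { 8, 8, 40, 8, 120, 8 };
	int base = 16;			/* initial one-extent reservation */
	struct txn t = { .credits = 16 };
	int used = 0, restarts = 0;

	for (unsigned int i = 0; i < sizeof(folio_cost) / sizeof(folio_cost[0]); i++) {
		int need = folio_cost[i];

		if (used + need > t.credits &&
		    !txn_extend(&t, used + need - t.credits)) {
			/* Restart: a fresh transaction sized for this folio. */
			restarts++;
			t.credits = need > base ? need : base;
			used = 0;
		}
		used += need;
	}
	printf("writeback finished with %d restart(s)\n", restarts);
	return 0;
}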
From nobody Tue Oct 7 19:22:15 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com,
	sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com,
	yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com,
	yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 09/11] ext4: replace ext4_writepage_trans_blocks()
Date: Mon, 7 Jul 2025 22:08:12 +0800
Message-ID: <20250707140814.542883-10-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

After ext4 supports large folios, reserving credits in units of pages
no longer applies. In most scenarios, reserving credits in units of
extents is sufficient. Therefore, introduce ext4_chunk_trans_extent()
to replace ext4_writepage_trans_blocks(). move_extent_per_page() is the
only remaining location that still processes extents in units of pages.

Suggested-by: Jan Kara
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
Tested-by: Linux Kernel Functional Testing
---
 fs/ext4/ext4.h        |  2 +-
 fs/ext4/extents.c     |  6 +++---
 fs/ext4/inline.c      |  6 +++---
 fs/ext4/inode.c       | 33 +++++++++++++++------------------
 fs/ext4/move_extent.c |  3 ++-
 fs/ext4/xattr.c       |  2 +-
 6 files changed, 25 insertions(+), 27 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 18373de980f2..f705046ba6c6 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3064,9 +3064,9 @@ extern int ext4_punch_hole(struct file *file, loff_t offset, loff_t length);
 extern void ext4_set_inode_flags(struct inode *, bool init);
 extern int ext4_alloc_da_blocks(struct inode *inode);
 extern void ext4_set_aops(struct inode *inode);
-extern int ext4_writepage_trans_blocks(struct inode *);
 extern int ext4_normal_submit_inode_data_buffers(struct jbd2_inode *jinode);
 extern int ext4_chunk_trans_blocks(struct inode *, int nrblocks);
+extern int ext4_chunk_trans_extent(struct inode *inode, int nrblocks);
 extern int ext4_meta_trans_blocks(struct inode *inode, int lblocks,
 				  int pextents);
 extern int ext4_zero_partial_blocks(handle_t *handle, struct inode *inode,
diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index b543a46fc809..f0f155458697 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -5171,7 +5171,7 @@ ext4_ext_shift_path_extents(struct ext4_ext_path *path, ext4_lblk_t shift,
 		credits = depth + 2;
 	}
 
-	restart_credits = ext4_writepage_trans_blocks(inode);
+	restart_credits = ext4_chunk_trans_extent(inode, 0);
 	err = ext4_datasem_ensure_credits(handle, inode, credits,
 			restart_credits, 0);
 	if (err) {
@@ -5431,7 +5431,7 @@ static int ext4_collapse_range(struct file *file, loff_t offset, loff_t len)
 
 	truncate_pagecache(inode, start);
 
-	credits = ext4_writepage_trans_blocks(inode);
+	credits = ext4_chunk_trans_extent(inode, 0);
 	handle = ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
@@ -5527,7 +5527,7 @@ static int ext4_insert_range(struct file *file, loff_t offset, loff_t len)
 
 	truncate_pagecache(inode, start);
 
-	credits = ext4_writepage_trans_blocks(inode);
+	credits = ext4_chunk_trans_extent(inode, 0);
 	handle = ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
diff --git a/fs/ext4/inline.c b/fs/ext4/inline.c
index a1bbcdf40824..d5b32d242495 100644
--- a/fs/ext4/inline.c
+++ b/fs/ext4/inline.c
@@ -562,7 +562,7 @@ static int ext4_convert_inline_data_to_extent(struct address_space *mapping,
 		return 0;
 	}
 
-	needed_blocks = ext4_writepage_trans_blocks(inode);
+	needed_blocks = ext4_chunk_trans_extent(inode, 1);
 
 	ret = ext4_get_inode_loc(inode, &iloc);
 	if (ret)
@@ -1864,7 +1864,7 @@ int ext4_inline_data_truncate(struct inode *inode, int *has_inline)
 	};
 
 
-	needed_blocks = ext4_writepage_trans_blocks(inode);
+	needed_blocks = ext4_chunk_trans_extent(inode, 1);
 	handle = ext4_journal_start(inode, EXT4_HT_INODE, needed_blocks);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
@@ -1979,7 +1979,7 @@ int ext4_convert_inline_data(struct inode *inode)
 		return 0;
 	}
 
-	needed_blocks = ext4_writepage_trans_blocks(inode);
+	needed_blocks = ext4_chunk_trans_extent(inode, 1);
 
 	iloc.bh = NULL;
 	error = ext4_get_inode_loc(inode, &iloc);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index d9d12529b7fc..85ad14451b26 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -1296,7 +1296,8 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
 	 * Reserve one block more for addition to orphan list in case
 	 * we allocate blocks but write fails for some reason
 	 */
-	needed_blocks = ext4_writepage_trans_blocks(inode) + 1;
+	needed_blocks = ext4_chunk_trans_extent(inode,
+			ext4_journal_blocks_per_folio(inode)) + 1;
 	index = pos >> PAGE_SHIFT;
 
 	if (ext4_test_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA)) {
@@ -4463,7 +4464,7 @@ int ext4_punch_hole(struct file *file, loff_t offset, loff_t length)
 		return ret;
 
 	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
-		credits = ext4_writepage_trans_blocks(inode);
+		credits = ext4_chunk_trans_extent(inode, 2);
 	else
 		credits = ext4_blocks_for_truncate(inode);
 	handle = ext4_journal_start(inode, EXT4_HT_TRUNCATE, credits);
@@ -4612,7 +4613,7 @@ int ext4_truncate(struct inode *inode)
 	}
 
 	if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
-		credits = ext4_writepage_trans_blocks(inode);
+		credits = ext4_chunk_trans_extent(inode, 1);
 	else
 		credits = ext4_blocks_for_truncate(inode);
 
@@ -6239,25 +6240,19 @@ int ext4_meta_trans_blocks(struct inode *inode, int lblocks, int pextents)
 }
 
 /*
- * Calculate the total number of credits to reserve to fit
- * the modification of a single pages into a single transaction,
- * which may include multiple chunks of block allocations.
- *
- * This could be called via ext4_write_begin()
- *
- * We need to consider the worse case, when
- * one new block per extent.
+ * Calculate the journal credits for modifying the number of blocks
+ * in a single extent within one transaction. 'nrblocks' is used only
+ * for non-extent inodes. For extent type inodes, 'nrblocks' can be
+ * zero if the exact number of blocks is unknown.
 */
-int ext4_writepage_trans_blocks(struct inode *inode)
+int ext4_chunk_trans_extent(struct inode *inode, int nrblocks)
 {
-	int bpp = ext4_journal_blocks_per_folio(inode);
 	int ret;
 
-	ret = ext4_meta_trans_blocks(inode, bpp, bpp);
-
+	ret = ext4_meta_trans_blocks(inode, nrblocks, 1);
 	/* Account for data blocks for journalled mode */
 	if (ext4_should_journal_data(inode))
-		ret += bpp;
+		ret += nrblocks;
 	return ret;
 }
 
@@ -6635,10 +6630,12 @@ static int ext4_block_page_mkwrite(struct inode *inode, struct folio *folio,
 	handle_t *handle;
 	loff_t size;
 	unsigned long len;
+	int credits;
 	int ret;
 
-	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE,
-				    ext4_writepage_trans_blocks(inode));
+	credits = ext4_chunk_trans_extent(inode,
+			ext4_journal_blocks_per_folio(inode));
+	handle = ext4_journal_start(inode, EXT4_HT_WRITE_PAGE, credits);
 	if (IS_ERR(handle))
 		return PTR_ERR(handle);
 
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 1f8493a56e8f..adae3caf175a 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -280,7 +280,8 @@ move_extent_per_page(struct file *o_filp, struct inode *donor_inode,
 	 */
 again:
 	*err = 0;
-	jblocks = ext4_writepage_trans_blocks(orig_inode) * 2;
+	jblocks = ext4_meta_trans_blocks(orig_inode, block_len_in_page,
+					 block_len_in_page) * 2;
 	handle = ext4_journal_start(orig_inode, EXT4_HT_MOVE_EXTENTS, jblocks);
 	if (IS_ERR(handle)) {
 		*err = PTR_ERR(handle);
diff --git a/fs/ext4/xattr.c b/fs/ext4/xattr.c
index 8d15acbacc20..3fb93247330d 100644
--- a/fs/ext4/xattr.c
+++ b/fs/ext4/xattr.c
@@ -962,7 +962,7 @@ int __ext4_xattr_set_credits(struct super_block *sb, struct inode *inode,
 	 * so we need to reserve credits for this eventuality
 	 */
 	if (inode && ext4_has_inline_data(inode))
-		credits += ext4_writepage_trans_blocks(inode) + 1;
+		credits += ext4_chunk_trans_extent(inode, 1) + 1;
 
 	/* We are done if ea_inode feature is not enabled. */
 	if (!ext4_has_feature_ea_inode(sb))
-- 
2.46.1
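For reference, here is a hedged userspace sketch of how the new helper's
callers choose 'nrblocks': 0 when only extent metadata changes, 1 for a
single-block change, and blocks-per-folio when a whole folio may be dirtied.
meta_trans_blocks() below is a stand-in stub, not the real
ext4_meta_trans_blocks(), so the printed numbers are only illustrative.

/*
 * Toy model of ext4_chunk_trans_extent()'s shape, built on a made-up
 * cost stub. Not kernel code.
 */
#include <stdbool.h>
#include <stdio.h>

/* Stand-in cost model; NOT the real ext4_meta_trans_blocks(). */
static int meta_trans_blocks(int lblocks, int pextents)
{
	int idxblocks = pextents;		/* pretend: one tree path per extent */
	int groups = idxblocks + pextents;	/* block bitmaps to dirty */
	int gdpblocks = groups;			/* group descriptor blocks */

	(void)lblocks;	/* the real helper also weighs the logical block count */
	return idxblocks + groups + gdpblocks;
}

/* Mirrors the shape of the new helper: one extent, plus data blocks
 * when the inode journals data. */
static int chunk_trans_extent(int nrblocks, bool journal_data)
{
	int ret = meta_trans_blocks(nrblocks, 1);

	if (journal_data)
		ret += nrblocks;
	return ret;
}

int main(void)
{
	printf("metadata only (nrblocks=0):            %d credits\n",
	       chunk_trans_extent(0, false));
	printf("single block  (nrblocks=1):            %d credits\n",
	       chunk_trans_extent(1, false));
	printf("whole folio   (nrblocks=512, journal): %d credits\n",
	       chunk_trans_extent(512, true));
	return 0;
}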
From nobody Tue Oct 7 19:22:15 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com,
	sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com,
	yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com,
	yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 10/11] ext4: fix insufficient credits calculation in ext4_meta_trans_blocks()
Date: Mon, 7 Jul 2025 22:08:13 +0800
Message-ID: <20250707140814.542883-11-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

The calculation of journal credits in ext4_meta_trans_blocks() should
include pextents, as each extent may be allocated from a different
group and thus needs to update a different bitmap and group descriptor
block.

Fixes: 0e32d8617012 ("ext4: correct the journal credits calculations of allocating blocks")
Reported-by: Jan Kara
Closes: https://lore.kernel.org/linux-ext4/nhxfuu53wyacsrq7xqgxvgzcggyscu2tbabginahcygvmc45hy@t4fvmyeky33e/
Signed-off-by: Zhang Yi
Reviewed-by: Jan Kara
Reviewed-by: Baokun Li
Tested-by: Linux Kernel Functional Testing
---
 fs/ext4/inode.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 85ad14451b26..4b679cb6c8bd 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -6214,7 +6214,7 @@ int ext4_meta_trans_blocks(struct inode *inode, int lblocks, int pextents)
 	int ret;
 
 	/*
-	 * How many index and lead blocks need to touch to map @lblocks
+	 * How many index and leaf blocks need to touch to map @lblocks
 	 * logical blocks to @pextents physical extents?
 	 */
 	idxblocks = ext4_index_trans_blocks(inode, lblocks, pextents);
@@ -6223,7 +6223,7 @@ int ext4_meta_trans_blocks(struct inode *inode, int lblocks, int pextents)
 	 * Now let's see how many group bitmaps and group descriptors need
 	 * to account
 	 */
-	groups = idxblocks;
+	groups = idxblocks + pextents;
 	gdpblocks = groups;
 	if (groups > ngroups)
 		groups = ngroups;
-- 
2.46.1
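A small worked example of the fix, assuming (as the hunk context suggests)
that the bitmap and group-descriptor counts are added on top of the
index-block count and ignoring the ngroups/gdb caps; the input numbers are
made up.

/* Illustrative arithmetic only; not the full ext4_meta_trans_blocks(). */
#include <stdio.h>

static int credits(int idxblocks, int pextents, int include_pextents)
{
	/* Each physical extent may sit in a different block group, so it
	 * needs its own bitmap and group descriptor accounted. */
	int groups = include_pextents ? idxblocks + pextents : idxblocks;
	int gdpblocks = groups;

	return idxblocks + groups + gdpblocks;
}

int main(void)
{
	int idxblocks = 3, pextents = 4;

	printf("old   (groups = idxblocks):            %d credits\n",
	       credits(idxblocks, pextents, 0));
	printf("fixed (groups = idxblocks + pextents): %d credits\n",
	       credits(idxblocks, pextents, 1));
	return 0;
}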
From nobody Tue Oct 7 19:22:15 2025
From: Zhang Yi
To: linux-ext4@vger.kernel.org
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org, tytso@mit.edu,
	adilger.kernel@dilger.ca, jack@suse.cz, ojaswin@linux.ibm.com,
	sashal@kernel.org, naresh.kamboju@linaro.org, jiangqi903@gmail.com,
	yi.zhang@huawei.com, yi.zhang@huaweicloud.com, libaokun1@huawei.com,
	yukuai3@huawei.com, yangerkun@huawei.com
Subject: [PATCH v4 11/11] ext4: limit the maximum folio order
Date: Mon, 7 Jul 2025 22:08:14 +0800
Message-ID: <20250707140814.542883-12-yi.zhang@huaweicloud.com>
In-Reply-To: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
References: <20250707140814.542883-1-yi.zhang@huaweicloud.com>
Content-Type: text/plain; charset="utf-8"

From: Zhang Yi

In environments with a page size of 64KB, the maximum size of a folio
can reach up to 128MB. Consequently, during the writeback of folios,
'rsv_blocks' will be overestimated to 1,577, which can put pressure on
the journal space when the journal is small. This can easily exceed the
limit of a single transaction. Besides, an excessively large folio
brings little benefit and instead increases the overhead of traversing
the buffer heads within the folio. Therefore, limit the maximum order
of a folio to 2048 filesystem blocks.
Reported-by: Naresh Kamboju
Reported-by: Joseph Qi
Closes: https://lore.kernel.org/linux-ext4/CA+G9fYsyYQ3ZL4xaSg1-Tt5Evto7Zd+hgNWZEa9cQLbahA1+xg@mail.gmail.com/
Signed-off-by: Zhang Yi
Tested-by: Linux Kernel Functional Testing
---
 fs/ext4/ext4.h   |  2 +-
 fs/ext4/ialloc.c |  3 +--
 fs/ext4/inode.c  | 22 +++++++++++++++++++---
 3 files changed, 21 insertions(+), 6 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index f705046ba6c6..9ac0a7d4fa0c 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -3020,7 +3020,7 @@ int ext4_walk_page_buffers(handle_t *handle,
 				     struct buffer_head *bh));
 int do_journal_get_write_access(handle_t *handle, struct inode *inode,
 				struct buffer_head *bh);
-bool ext4_should_enable_large_folio(struct inode *inode);
+void ext4_set_inode_mapping_order(struct inode *inode);
 #define FALL_BACK_TO_NONDELALLOC 1
 #define CONVERT_INLINE_DATA	 2
 
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index 79aa3df8d019..df4051613b29 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -1335,8 +1335,7 @@ struct inode *__ext4_new_inode(struct mnt_idmap *idmap,
 		}
 	}
 
-	if (ext4_should_enable_large_folio(inode))
-		mapping_set_large_folios(inode->i_mapping);
+	ext4_set_inode_mapping_order(inode);
 
 	ext4_update_inode_fsync_trans(handle, inode, 1);
 
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 4b679cb6c8bd..1bce9ebaedb7 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -5181,7 +5181,7 @@ static int check_igot_inode(struct inode *inode, ext4_iget_flags flags,
 	return -EFSCORRUPTED;
 }
 
-bool ext4_should_enable_large_folio(struct inode *inode)
+static bool ext4_should_enable_large_folio(struct inode *inode)
 {
 	struct super_block *sb = inode->i_sb;
 
@@ -5198,6 +5198,22 @@ bool ext4_should_enable_large_folio(struct inode *inode)
 	return true;
 }
 
+/*
+ * Limit the maximum folio order to 2048 blocks to prevent overestimation
+ * of reserve handle credits during the folio writeback in environments
+ * where the PAGE_SIZE exceeds 4KB.
+ */
+#define EXT4_MAX_PAGECACHE_ORDER(i)		\
+	min(MAX_PAGECACHE_ORDER, (11 + (i)->i_blkbits - PAGE_SHIFT))
+void ext4_set_inode_mapping_order(struct inode *inode)
+{
+	if (!ext4_should_enable_large_folio(inode))
+		return;
+
+	mapping_set_folio_order_range(inode->i_mapping, 0,
+				      EXT4_MAX_PAGECACHE_ORDER(inode));
+}
+
 struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 			  ext4_iget_flags flags, const char *function,
 			  unsigned int line)
@@ -5515,8 +5531,8 @@ struct inode *__ext4_iget(struct super_block *sb, unsigned long ino,
 			ret = -EFSCORRUPTED;
 			goto bad_inode;
 		}
-	if (ext4_should_enable_large_folio(inode))
-		mapping_set_large_folios(inode->i_mapping);
+
+	ext4_set_inode_mapping_order(inode);
 
 	ret = check_igot_inode(inode, flags, function, line);
 	/*
-- 
2.46.1
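A worked example of the new cap, in standalone C: the order is limited to
11 + blkbits - PAGE_SHIFT, which keeps every folio at no more than 2048
filesystem blocks regardless of page size. MAX_PAGECACHE_ORDER is assumed
to be 11 here purely for the printout; on a real kernel it may be smaller
and would cap the order further.

/* Userspace arithmetic only; mirrors the EXT4_MAX_PAGECACHE_ORDER() cap. */
#include <stdio.h>

#define ASSUMED_MAX_PAGECACHE_ORDER 11	/* assumption for this demo */

static int capped_order(int blkbits, int page_shift)
{
	int order = 11 + blkbits - page_shift;

	return order < ASSUMED_MAX_PAGECACHE_ORDER ?
		order : ASSUMED_MAX_PAGECACHE_ORDER;
}

int main(void)
{
	struct { int page_shift, blkbits; } cases[] = {
		{ 12, 12 },	/*  4KB pages,  4KB blocks */
		{ 16, 12 },	/* 64KB pages,  4KB blocks */
		{ 16, 16 },	/* 64KB pages, 64KB blocks */
	};

	for (unsigned int i = 0; i < sizeof(cases) / sizeof(cases[0]); i++) {
		int order = capped_order(cases[i].blkbits, cases[i].page_shift);
		unsigned long folio_bytes = 1UL << (cases[i].page_shift + order);
		unsigned long blocks = folio_bytes >> cases[i].blkbits;

		printf("page %3luKB, block %3luKB -> max order %2d, folio %6luKB (%lu blocks)\n",
		       1UL << (cases[i].page_shift - 10),
		       1UL << (cases[i].blkbits - 10),
		       order, folio_bytes >> 10, blocks);
	}
	return 0;
}

In every case the capped order works out to exactly 2048 blocks per folio,
which is the bound the commit message states.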