From: Jinliang Zheng <alexjlzheng@tencent.com>

With iomap_folio_state, we can identify uptodate states at the block
level, and a read_folio read can correctly handle partially uptodate
folios.

Therefore, when a partial write occurs, accept the block-aligned
partial write instead of rejecting the entire write.

This patchset has been tested with xfstests' generic and xfs groups, and
there are no new failed cases compared to the latest upstream kernel.

Changelog:
V2: use & instead of % for 64-bit variables, to make m68k/xtensa happy:
    m68k-linux-ld: fs/iomap/buffered-io.o: in function `iomap_adjust_read_range':
    >> buffered-io.c:(.text+0xa8a): undefined reference to `__moddi3'
    >> m68k-linux-ld: buffered-io.c:(.text+0xaa8): undefined reference to `__moddi3'
V1: https://lore.kernel.org/linux-fsdevel/20250810044806.3433783-1-alexjlzheng@tencent.com/

Jinliang Zheng (4):
  iomap: make sure iomap_adjust_read_range() are aligned with block_size
  iomap: move iter revert case out of the unwritten branch
  iomap: make iomap_write_end() return the number of written length again
  iomap: don't abandon the whole thing with iomap_folio_state

 fs/iomap/buffered-io.c | 68 +++++++++++++++++++++++++++++-------------
 1 file changed, 47 insertions(+), 21 deletions(-)

-- 
2.49.0
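To illustrate the idea, here is a minimal sketch of block-aligned acceptance
of a short copy, under the assumption that the block size is a power of two.
iomap_trim_to_block() is a hypothetical helper written only for this
illustration, not a function added by the series; i_blocksize() and
struct inode come from <linux/fs.h>.

#include <linux/fs.h>

/*
 * Hypothetical helper, for illustration only: on a short copy, keep the
 * block-aligned prefix instead of discarding everything.  The block
 * size is a power of two, so a mask is enough and avoids a 64-bit
 * modulo (which would pull in __moddi3 on m68k/xtensa, see the V2
 * changelog above).
 */
static inline size_t iomap_trim_to_block(struct inode *inode, size_t copied)
{
	unsigned int block_size = i_blocksize(inode);

	/* Whole blocks are kept; the unaligned tail is retried later. */
	return copied & ~((size_t)block_size - 1);
}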
On Sun, Aug 10, 2025 at 06:15:50PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> With iomap_folio_state, we can identify uptodate states at the block
> level, and a read_folio read can correctly handle partially uptodate
> folios.
>
> Therefore, when a partial write occurs, accept the block-aligned
> partial write instead of rejecting the entire write.

We're not rejecting the entire write, but instead moving on to the
next loop iteration.

> This patchset has been tested with xfstests' generic and xfs groups, and
> there are no new failed cases compared to the latest upstream kernel.

What is the motivation for this series?  Do you see performance
improvements in a workload you care about?
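For context, the behaviour being referred to is the short-copy retry in the
buffered write loop. The following is a minimal sketch loosely modelled on
iomap_write_iter() in fs/iomap/buffered-io.c; commit_write() is an
illustrative stand-in for the real commit helper, the signatures are
abbreviated, and the surrounding setup is omitted.

/*
 * Minimal sketch of the short-copy path, loosely modelled on
 * iomap_write_iter() in fs/iomap/buffered-io.c.  commit_write() is an
 * illustrative stand-in, not the real helper.
 */
static ssize_t write_loop_sketch(struct iov_iter *i, struct folio *folio,
				 size_t offset, size_t chunk)
{
	ssize_t total = 0;

	while (iov_iter_count(i)) {
		size_t bytes = min(chunk, iov_iter_count(i));
		size_t copied, written;

		copied = copy_folio_from_iter_atomic(folio, offset, bytes, i);

		/*
		 * If the short copy cannot be committed, nothing counts
		 * as written and the copied data is handed back ...
		 */
		written = commit_write(folio, offset, bytes, copied) ? copied : 0;
		if (copied != written)
			iov_iter_revert(i, copied - written);

		if (written == 0) {
			/* ... then the chunk is halved and the copy retried. */
			if (chunk > PAGE_SIZE)
				chunk /= 2;
			continue;
		}
		total += written;
		offset += written;
	}
	return total;
}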
On Mon, 11 Aug 2025 03:38:17 -0700, Christoph Hellwig wrote:
> On Sun, Aug 10, 2025 at 06:15:50PM +0800, alexjlzheng@gmail.com wrote:
> > From: Jinliang Zheng <alexjlzheng@tencent.com>
> >
> > With iomap_folio_state, we can identify uptodate states at the block
> > level, and a read_folio read can correctly handle partially uptodate
> > folios.
> >
> > Therefore, when a partial write occurs, accept the block-aligned
> > partial write instead of rejecting the entire write.

Thank you for your reply. :)

> We're not rejecting the entire write, but instead moving on to the
> next loop iteration.

Yes, but the next iteration has to re-copy from the beginning, which
means that everything copied in this iteration is wasted. The purpose
of this patchset is to reduce the number of bytes that need to be
re-copied and the number of discarded copies.

For example, suppose a folio is 2MB, the block size is 4kB, and the
copied bytes are 2MB-3kB.

Without this patchset, we need to re-copy 2MB-3kB bytes in the
following iterations:

|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block |  ...  | block | block |  ...  | block |  folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->|       first time copied
|<-------- 1MB -------->|                          next time we need copy (chunk /= 2)
                        |<-------- 1MB -------->|  next next time we need copy
|<------ 2MB-3kB bytes duplicate copy ---->|

With this patchset, we can accept 2MB-4kB bytes, which is
block-aligned. This means we only need to process the remaining 4kB in
the next iteration:

|<-------------------- 2MB -------------------->|
+-------+-------+-------+-------+-------+-------+
| block |  ...  | block | block |  ...  | block |  folio
+-------+-------+-------+-------+-------+-------+
|<-4kB->|
|<--------------- copied 2MB-3kB --------->|       first time copied
                                        |<-4kB->|  next time we need copy
                                        |<>|       only 1kB bytes duplicate copy

> > This patchset has been tested with xfstests' generic and xfs groups, and
> > there are no new failed cases compared to the latest upstream kernel.
>
> What is the motivation for this series?  Do you see performance
> improvements in a workload you care about?

Partial writes are inherently a relatively unusual situation and do not
account for a significant portion of performance testing. However, in
scenarios with numerous memory errors, this series can significantly
reduce the number of bytes that have to be copied again.

thanks,
Jinliang Zheng :)
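As a quick sanity check of the numbers above, here is a small userspace
snippet (illustrative only, not kernel code) that reproduces the
arithmetic in the example:

#include <stdio.h>

int main(void)
{
	unsigned long block_size = 4096;			/* 4kB blocks   */
	unsigned long copied = 2UL * 1024 * 1024 - 3 * 1024;	/* 2MB-3kB copy */

	/* Keep the block-aligned prefix, as the series proposes. */
	unsigned long accepted = copied & ~(block_size - 1);	/* 2MB-4kB */
	unsigned long duplicate = copied - accepted;		/* 1kB     */

	printf("accepted %lu bytes, only %lu bytes copied twice\n",
	       accepted, duplicate);
	return 0;
}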