From: Jinliang Zheng <alexjlzheng@tencent.com>
In the buffer write path, iomap_set_range_uptodate() is called every
time iomap_write_end() is called. But if folio_test_uptodate() holds, we
know that all blocks in this folio are already uptodate, so there is no
need to enter the state_lock critical section just to execute
bitmap_set() again.

Although state_lock is unlikely to see significant contention, since
callers already hold the folio lock, this patch at least reduces the
number of instructions executed.
Signed-off-by: Jinliang Zheng <alexjlzheng@tencent.com>
---
fs/iomap/buffered-io.c | 3 +++
1 file changed, 3 insertions(+)
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 3729391a18f3..fb4519158f3a 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -71,6 +71,9 @@ static void iomap_set_range_uptodate(struct folio *folio, size_t off,
 	unsigned long flags;
 	bool uptodate = true;
 
+	if (folio_test_uptodate(folio))
+		return;
+
 	if (ifs) {
 		spin_lock_irqsave(&ifs->state_lock, flags);
 		uptodate = ifs_set_range_uptodate(folio, ifs, off, len);
--
2.49.0
On Tue, Jul 01, 2025 at 10:48:47PM +0800, alexjlzheng@gmail.com wrote:
> From: Jinliang Zheng <alexjlzheng@tencent.com>
>
> In the buffer write path, iomap_set_range_uptodate() is called every
> time iomap_write_end() is called. But if folio_test_uptodate() holds, we
> know that all blocks in this folio are already in the uptodate state, so
> there is no need to go deep into the critical section of state_lock to
> execute bitmap_set().
>
> Although state_lock may not have significant lock contention due to
> folio lock, this patch at least reduces the number of instructions.

That means the uptodate bitmap is stale in that case. That would
only matter if we could clear the folio uptodate bit and still
expect the page content to survive. Which sounds dubious and I could
not find anything relevant grepping the tree, but I'm adding the
linux-mm list just in case.
On Thu, 3 Jul 2025 06:52:44 -0700, Christoph Hellwig wrote:
> On Tue, Jul 01, 2025 at 10:48:47PM +0800, alexjlzheng@gmail.com wrote:
> > Although state_lock may not have significant lock contention due to
> > folio lock, this patch at least reduces the number of instructions.
>
> That means the uptodate bitmap is stale in that case. That would
> only matter if we could clear the folio uptodate bit and still
> expect the page content to survive. Which sounds dubious and I could
> not find anything relevant grepping the tree, but I'm adding the
> linux-mm list just in case.

Hi, after days of silence, I re-read this email thread to make sure I
didn't miss something important. I realized that maybe we are not
aligned and I didn't understand your sentence above. Would you mind
explaining your meaning in more detail?

In addition, what I want to say is that once folio_test_uptodate() is
true, all bits in ifs->state are in the uptodate state. So there is no
need to acquire the lock and set it again. This repeated setting happens
in __iomap_write_end().

thanks,
Jinliang Zheng. :)
On Wed, Jul 09, 2025 at 11:30:42AM +0800, Jinliang Zheng wrote:
> In addition, what I want to say is that once folio_test_uptodate() is
> true, all bits in ifs->state are in the uptodate state. So there is no
> need to acquire the lock and set it again. This repeated setting happens
> in __iomap_write_end().

Yes, that seems fine. Can you update the commit message with some of
the insights from this discussion, and with that the patch should be
fine.
On Thu, Jul 03, 2025 at 06:52:44AM -0700, Christoph Hellwig wrote:
> That means the uptodate bitmap is stale in that case. That would
> only matter if we could clear the folio uptodate bit and still
> expect the page content to survive. Which sounds dubious and I could
> not find anything relevant grepping the tree, but I'm adding the
> linux-mm list just in case.

Once a folio is uptodate, there is no route back to !uptodate without
going through the removal of the folio from the page cache. The read()
path relies on this for example; once it has a refcount on the folio,
and has checked the uptodate bit, it will copy the contents to userspace.
On Thu, 3 Jul 2025 18:34:20 +0100, Matthew Wilcox wrote:
> Once a folio is uptodate, there is no route back to !uptodate without
> going through the removal of the folio from the page cache. The read()
> path relies on this for example; once it has a refcount on the folio,
> and has checked the uptodate bit, it will copy the contents to userspace.

I agree, and this aligns with my perspective. Thank you for confirming this.

Jinliang Zheng. :)
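[Editor's note: the invariant discussed above — a "sticky" uptodate flag that
only ever goes from clear to set, so observing it set lets callers skip the
bitmap lock entirely — can be sketched in a standalone userspace analogue.
This is not kernel code: struct state, set_range_uptodate(), and the single
unsigned long standing in for ifs->state are all hypothetical, and the
unlocked flag check is only safe here for the same reason as in the patch,
namely that an outer lock (the folio lock analogue) serializes writers.]

```c
/* Userspace sketch of the monotonic-uptodate fast path.
 * Assumes callers are serialized by an outer lock, mirroring the
 * folio lock held across iomap_set_range_uptodate(). */
#include <pthread.h>
#include <stdbool.h>

struct state {
	pthread_mutex_t lock;	/* analogue of ifs->state_lock */
	bool uptodate;		/* sticky: never cleared once set */
	unsigned long bitmap;	/* analogue of the per-block uptodate bits */
};

static void set_range_uptodate(struct state *s, unsigned long bits)
{
	/* Fast path: once uptodate, every bit is already set, so
	 * taking the lock to bitmap_set() again is pure overhead. */
	if (s->uptodate)
		return;

	pthread_mutex_lock(&s->lock);
	s->bitmap |= bits;
	if (s->bitmap == ~0UL)
		s->uptodate = true;	/* one-way transition */
	pthread_mutex_unlock(&s->lock);
}
```

The correctness of the early return rests entirely on the one-way
transition Matthew describes: nothing ever clears the flag while the
object remains reachable, so a stale bitmap behind a set flag can never
be observed.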
On Tue, Jul 01, 2025 at 10:48:47PM +0800, alexjlzheng@gmail.com wrote:
> @@ -71,6 +71,9 @@ static void iomap_set_range_uptodate(struct folio *folio, size_t off,
>  	unsigned long flags;
>  	bool uptodate = true;
>  
> +	if (folio_test_uptodate(folio))
> +		return;

Looks fine, but how exhaustively have you tested this with heavy IO
workloads? I /think/ it's the case that folios always creep towards
ifs_is_fully_uptodate() == true state and once they've gotten there
never go back. But folio state bugs are tricky to detect once they've
crept in.

--D
On Tue, 1 Jul 2025 11:47:37 -0700, djwong@kernel.org wrote:
> On Tue, Jul 01, 2025 at 10:48:47PM +0800, alexjlzheng@gmail.com wrote:
> > Although state_lock may not have significant lock contention due to
> > folio lock, this patch at least reduces the number of instructions.
>
> Looks fine, but how exhaustively have you tested this with heavy IO
> workloads? I /think/ it's the case that folios always creep towards
> ifs_is_fully_uptodate() == true state and once they've gotten there
> never go back. But folio state bugs are tricky to detect once they've
> crept in.

I tested fio, ltp and xfstests combined for about 30 hours. The command
used for the fio test is:

fio --name=4k-rw \
    --filename=/data2/testfile \
    --size=1G \
    --bs=4096 \
    --ioengine=libaio \
    --iodepth=32 \
    --rw=randrw \
    --direct=0 \
    --buffered=1 \
    --numjobs=16 \
    --runtime=60 \
    --time_based \
    --group_reporting

ltp and xfstests showed no noticeable errors caused by this patch.

thanks,
Jinliang Zheng. :)
On Wed, Jul 02, 2025 at 08:09:12PM +0800, Jinliang Zheng wrote:
> ltp and xfstests showed no noticeable errors caused by this patch.

With what block and page size? I guess it was block size < PAGE_SIZE,
as otherwise you wouldn't want to optimize this path, but just asking
in case.
On Thu, 3 Jul 2025 06:50:24 -0700, Christoph Hellwig wrote:
> On Wed, Jul 02, 2025 at 08:09:12PM +0800, Jinliang Zheng wrote:
> > ltp and xfstests showed no noticeable errors caused by this patch.
>
> With what block and page size? I guess it was block size < PAGE_SIZE,
> as otherwise you wouldn't want to optimize this path, but just asking
> in case.

Hahaha, I really want to try -b size=512, but I don't want to turn off
crc, so I can only choose -b size=1024. By the way, the test was done
on xfs.

thanks,
Jinliang Zheng. :)
On Wed, Jul 02, 2025 at 08:09:12PM +0800, Jinliang Zheng wrote:
> On Tue, 1 Jul 2025 11:47:37 -0700, djwong@kernel.org wrote:
> > Looks fine, but how exhaustively have you tested this with heavy IO
> > workloads? I /think/ it's the case that folios always creep towards
> > ifs_is_fully_uptodate() == true state and once they've gotten there
> > never go back. But folio state bugs are tricky to detect once they've
> > crept in.
>
> I tested fio, ltp and xfstests combined for about 30 hours.
>
> ltp and xfstests showed no noticeable errors caused by this patch.

<nod> I think this is fine then...

Reviewed-by: "Darrick J. Wong" <djwong@kernel.org>

--D