[PATCH v3 09/14] mm/mglru: use the common routine for dirty/writeback reactivation

Kairui Song via B4 Relay posted 14 patches 3 days, 18 hours ago
Posted by Kairui Song via B4 Relay 3 days, 18 hours ago
From: Kairui Song <kasong@tencent.com>

Currently MGLRU moves dirty or writeback folios to the second-oldest
generation instead of reactivating them like the classical LRU does.
This might help reduce LRU contention since it skips the isolation,
but as a result these folios show up at the LRU tail more frequently,
leading to inefficient reclaim.

Besides, the dirty / writeback check after isolation in
shrink_folio_list is more accurate and covers more cases. So drop the
special handling for dirty and writeback folios, use the common
routine instead, and reactivate them like the classical LRU does.

This should in theory improve the scan efficiency. These folios will be
rotated back to the LRU tail once writeback is done, so there is no
risk of hotness inversion, and each reclaim loop now has a higher
success rate. This also prepares for unifying the writeback and
throttling mechanism with the classical LRU: keeping these folios far
from the tail means tail-batch detection follows a pattern similar to
the classical LRU.

The micro-optimization that avoided LRU contention by skipping the
isolation is gone, which should be fine: compared to the IO and
writeback cost, the isolation overhead is trivial.

Reviewed-by: Axel Rasmussen <axelrasmussen@google.com>
Signed-off-by: Kairui Song <kasong@tencent.com>
---
 mm/vmscan.c | 19 -------------------
 1 file changed, 19 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 9f4512a4d35f..2a36cf937061 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -4578,7 +4578,6 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 		       int tier_idx)
 {
 	bool success;
-	bool dirty, writeback;
 	int gen = folio_lru_gen(folio);
 	int type = folio_is_file_lru(folio);
 	int zone = folio_zonenum(folio);
@@ -4628,21 +4627,6 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
 		return true;
 	}
 
-	dirty = folio_test_dirty(folio);
-	writeback = folio_test_writeback(folio);
-	if (type == LRU_GEN_FILE && dirty) {
-		sc->nr.file_taken += delta;
-		if (!writeback)
-			sc->nr.unqueued_dirty += delta;
-	}
-
-	/* waiting for writeback */
-	if (writeback || (type == LRU_GEN_FILE && dirty)) {
-		gen = folio_inc_gen(lruvec, folio, true);
-		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
-		return true;
-	}
-
 	return false;
 }
 
@@ -4664,9 +4648,6 @@ static bool isolate_folio(struct lruvec *lruvec, struct folio *folio, struct sca
 	if (!folio_test_referenced(folio))
 		set_mask_bits(&folio->flags.f, LRU_REFS_MASK, 0);
 
-	/* for shrink_folio_list() */
-	folio_clear_reclaim(folio);
-
 	success = lru_gen_del_folio(lruvec, folio, true);
 	VM_WARN_ON_ONCE_FOLIO(!success, folio);
 

-- 
2.53.0
Re: [PATCH v3 09/14] mm/mglru: use the common routine for dirty/writeback reactivation
Posted by Kairui Song 3 days, 8 hours ago
Forwarding a sashiko review, which does not seem to be a real issue,
but just in case:

> commit e6160f371a1c25f6b49b0a6639582d0636d9905b
> Author: Kairui Song <kasong@tencent.com>
> mm/mglru: use the common routine for dirty/writeback reactivation
> This commit drops MGLRU-specific handling for dirty and writeback folios,
> instead allowing the common shrink_folio_list() routine to process and
> reactivate them like the classical LRU.
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index 9f4512a4d35fc..2a36cf937061b 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -4628,21 +4627,6 @@ static bool sort_folio(struct lruvec *lruvec, struct folio *folio, struct scan_c
> >  		return true;
> >  	}
> >  
> > -	dirty = folio_test_dirty(folio);
> > -	writeback = folio_test_writeback(folio);
> > -	if (type == LRU_GEN_FILE && dirty) {
> > -		sc->nr.file_taken += delta;
> > -		if (!writeback)
> > -			sc->nr.unqueued_dirty += delta;
> > -	}
> > -
> > -	/* waiting for writeback */
> > -	if (writeback || (type == LRU_GEN_FILE && dirty)) {
> > -		gen = folio_inc_gen(lruvec, folio, true);
> > -		list_move(&folio->lru, &lrugen->folios[gen][type][zone]);
> > -		return true;
> > -	}
> Does removing this explicit promotion and instead falling back to 
> shrink_folio_list() cause these folios to retain stale tier bits?
> When dirty and writeback folios are processed by shrink_folio_list(),
> they frequently hit the activate_locked path, which sets PG_active on
> the folio.
> When these folios are returned in a list to evict_folios(), it tries to
> clear stale tier bits by checking if the folio is from the oldest
> generation:
>     if (lru_gen_folio_seq(lruvec, folio, false) == min_seq[type])
> Because PG_active is set, lru_gen_folio_seq() calculates a younger
> generation, so this check evaluates to false and the tier bits
> (LRU_REFS_FLAGS) are not cleared.
> Later, the folios are added back to the LRU via move_folios_to_lru() and
> lru_gen_add_folio(). While lru_gen_add_folio() clears PG_active, it
> intentionally does not clear LRU_REFS_FLAGS.
> Could this sequence leave the folios in a younger generation with stale
> PG_referenced and LRU_REFS_MASK bits, artificially inflating their access
> counts?

Actually I believe the new behavior is better: clearing the ref bits
for dirty folios makes no sense. We even previously had a local commit
to disable resetting the ref bits in folio_inc_gen unless the folio is
being protected by the ref bits (MGLRU's PID).

The access count contributes to the PID protection, refault tracking
and things like PSI, leave these counter untouched should help them to
track the folio hotness info better upon the real reclaim when the
writeback is done.

Indeed, I'd better mention this in the commit message.