After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
attempt to write back filesystem folios through reclaim.

However, shrink_folio_list() still contains some logic for the writeback
control of dirty file folios. The original logic was: for direct reclaim,
or when folio_test_reclaim() is false, or when the PGDAT_DIRTY flag is not
set, dirty file folios were activated directly to avoid being scanned
again; otherwise, we would try to write the dirty file folios back. But
since we can no longer perform writeback on dirty folios, the dirty file
folios still end up being activated.

Additionally, under the original logic, when we went on to try to write
back dirty file folios, we also checked the references flag,
sc->may_writepage, and may_enter_fs(), which could leave dirty file folios
on the inactive list. This is unreasonable: even if these dirty folios are
scanned again, we still cannot clean them.
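
For reference, the checks in question looked roughly like the sketch below
(a simplified paraphrase of the pre-patch shrink_folio_list() flow, not the
verbatim kernel code); every keep_locked exit leaves the dirty file folio
on the inactive list even though reclaim can no longer clean it:

	/*
	 * Sketch of the old flow for a dirty file folio that was not
	 * activated by the kswapd/PG_reclaim/PGDAT_DIRTY check: each
	 * early exit keeps the folio on the inactive list.
	 */
	if (references == FOLIOREF_RECLAIM_CLEAN)
		goto keep_locked;
	if (!may_enter_fs(folio, sc->gfp_mask))
		goto keep_locked;
	if (!sc->may_writepage)
		goto keep_locked;
	/* only then did we go on to attempt pageout() on the folio */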

Therefore, these checks on dirty file folios are redundant and can be
removed. Dirty file folios should be moved directly to the active list to
avoid being scanned again. Since we set the PG_reclaim flag on these dirty
folios, once writeback completes they will be moved back to the tail of
the inactive list so that reclaim can retry them quickly.
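
For context, the PG_reclaim behavior relied on here is handled at
writeback completion; roughly (paraphrasing folio_end_writeback() in
mm/filemap.c, not verbatim), the relevant part is:

	/*
	 * When writeback of a folio marked PG_reclaim finishes, clear
	 * the flag and rotate the folio to the tail of the inactive LRU
	 * so that reclaim can find it again quickly.
	 */
	if (folio_test_reclaim(folio)) {
		folio_clear_reclaim(folio);
		folio_rotate_reclaimable(folio);
	}
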
Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
include/linux/mmzone.h | 4 ----
mm/vmscan.c | 25 +++----------------------
2 files changed, 3 insertions(+), 26 deletions(-)
diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7fb7331c5725..4398e027f450 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -1060,10 +1060,6 @@ struct zone {
} ____cacheline_internodealigned_in_smp;
enum pgdat_flags {
- PGDAT_DIRTY, /* reclaim scanning has recently found
- * many dirty file pages at the tail
- * of the LRU.
- */
PGDAT_WRITEBACK, /* reclaim scanning has recently found
* many pages under writeback
*/
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 65f299e4b8f0..c922bad2b8fd 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1387,21 +1387,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
mapping = folio_mapping(folio);
if (folio_test_dirty(folio)) {
- /*
- * Only kswapd can writeback filesystem folios
- * to avoid risk of stack overflow. But avoid
- * injecting inefficient single-folio I/O into
- * flusher writeback as much as possible: only
- * write folios when we've encountered many
- * dirty folios, and when we've already scanned
- * the rest of the LRU for clean folios and see
- * the same dirty folios again (with the reclaim
- * flag set).
- */
- if (folio_is_file_lru(folio) &&
- (!current_is_kswapd() ||
- !folio_test_reclaim(folio) ||
- !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
+ if (folio_is_file_lru(folio)) {
/*
* Immediately reclaim when written back.
* Similar in principle to folio_deactivate()
@@ -1410,7 +1396,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
*/
node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
nr_pages);
- folio_set_reclaim(folio);
+ if (!folio_test_reclaim(folio))
+ folio_set_reclaim(folio);
goto activate_locked;
}
@@ -6105,11 +6092,6 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
set_bit(PGDAT_WRITEBACK, &pgdat->flags);
- /* Allow kswapd to start writing pages during reclaim.*/
- if (sc->nr.unqueued_dirty &&
- sc->nr.unqueued_dirty == sc->nr.file_taken)
- set_bit(PGDAT_DIRTY, &pgdat->flags);
-
/*
* If kswapd scans pages marked for immediate
* reclaim and under writeback (nr_immediate), it
@@ -6850,7 +6832,6 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
clear_bit(LRUVEC_NODE_CONGESTED, &lruvec->flags);
clear_bit(LRUVEC_CGROUP_CONGESTED, &lruvec->flags);
- clear_bit(PGDAT_DIRTY, &pgdat->flags);
clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
}
--
2.43.7
On Fri 17-10-25 15:53:07, Baolin Wang wrote:
> After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
> attempt to write back filesystem folios through reclaim.
>
> However, shrink_folio_list() still contains some logic for the writeback
> control of dirty file folios. The original logic was: for direct reclaim,
> or when folio_test_reclaim() is false, or when the PGDAT_DIRTY flag is
> not set, dirty file folios were activated directly to avoid being scanned
> again; otherwise, we would try to write the dirty file folios back. But
> since we can no longer perform writeback on dirty folios, the dirty file
> folios still end up being activated.
>
> Additionally, under the original logic, when we went on to try to write
> back dirty file folios, we also checked the references flag,
> sc->may_writepage, and may_enter_fs(), which could leave dirty file
> folios on the inactive list. This is unreasonable: even if these dirty
> folios are scanned again, we still cannot clean them.
>
> Therefore, these checks on dirty file folios are redundant and can be
> removed. Dirty file folios should be moved directly to the active list
> to avoid being scanned again. Since we set the PG_reclaim flag on these
> dirty folios, once writeback completes they will be moved back to the
> tail of the inactive list so that reclaim can retry them quickly.

Is there any actual problem you are trying to address, or is this a code
cleanup? How have you evaluated this change?
> Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
> ---
> include/linux/mmzone.h | 4 ----
> mm/vmscan.c | 25 +++----------------------
> 2 files changed, 3 insertions(+), 26 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7fb7331c5725..4398e027f450 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -1060,10 +1060,6 @@ struct zone {
> } ____cacheline_internodealigned_in_smp;
>
> enum pgdat_flags {
> - PGDAT_DIRTY, /* reclaim scanning has recently found
> - * many dirty file pages at the tail
> - * of the LRU.
> - */
> PGDAT_WRITEBACK, /* reclaim scanning has recently found
> * many pages under writeback
> */
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 65f299e4b8f0..c922bad2b8fd 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1387,21 +1387,7 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
>
> mapping = folio_mapping(folio);
> if (folio_test_dirty(folio)) {
> - /*
> - * Only kswapd can writeback filesystem folios
> - * to avoid risk of stack overflow. But avoid
> - * injecting inefficient single-folio I/O into
> - * flusher writeback as much as possible: only
> - * write folios when we've encountered many
> - * dirty folios, and when we've already scanned
> - * the rest of the LRU for clean folios and see
> - * the same dirty folios again (with the reclaim
> - * flag set).
> - */
> - if (folio_is_file_lru(folio) &&
> - (!current_is_kswapd() ||
> - !folio_test_reclaim(folio) ||
> - !test_bit(PGDAT_DIRTY, &pgdat->flags))) {
> + if (folio_is_file_lru(folio)) {
> /*
> * Immediately reclaim when written back.
> * Similar in principle to folio_deactivate()
> @@ -1410,7 +1396,8 @@ static unsigned int shrink_folio_list(struct list_head *folio_list,
> */
> node_stat_mod_folio(folio, NR_VMSCAN_IMMEDIATE,
> nr_pages);
> - folio_set_reclaim(folio);
> + if (!folio_test_reclaim(folio))
> + folio_set_reclaim(folio);
>
> goto activate_locked;
> }
> @@ -6105,11 +6092,6 @@ static void shrink_node(pg_data_t *pgdat, struct scan_control *sc)
> if (sc->nr.writeback && sc->nr.writeback == sc->nr.taken)
> set_bit(PGDAT_WRITEBACK, &pgdat->flags);
>
> - /* Allow kswapd to start writing pages during reclaim.*/
> - if (sc->nr.unqueued_dirty &&
> - sc->nr.unqueued_dirty == sc->nr.file_taken)
> - set_bit(PGDAT_DIRTY, &pgdat->flags);
> -
> /*
> * If kswapd scans pages marked for immediate
> * reclaim and under writeback (nr_immediate), it
> @@ -6850,7 +6832,6 @@ static void clear_pgdat_congested(pg_data_t *pgdat)
>
> clear_bit(LRUVEC_NODE_CONGESTED, &lruvec->flags);
> clear_bit(LRUVEC_CGROUP_CONGESTED, &lruvec->flags);
> - clear_bit(PGDAT_DIRTY, &pgdat->flags);
> clear_bit(PGDAT_WRITEBACK, &pgdat->flags);
> }
>
> --
> 2.43.7
>
--
Michal Hocko
SUSE Labs
On 2025/10/17 20:02, Michal Hocko wrote:
> On Fri 17-10-25 15:53:07, Baolin Wang wrote:
>> After commit 6b0dfabb3555 ("fs: Remove aops->writepage"), we no longer
>> attempt to write back filesystem folios through reclaim.
>>
>> However, shrink_folio_list() still contains some logic for the writeback
>> control of dirty file folios. The original logic was: for direct reclaim,
>> or when folio_test_reclaim() is false, or when the PGDAT_DIRTY flag is
>> not set, dirty file folios were activated directly to avoid being scanned
>> again; otherwise, we would try to write the dirty file folios back. But
>> since we can no longer perform writeback on dirty folios, the dirty file
>> folios still end up being activated.
>>
>> Additionally, under the original logic, when we went on to try to write
>> back dirty file folios, we also checked the references flag,
>> sc->may_writepage, and may_enter_fs(), which could leave dirty file
>> folios on the inactive list. This is unreasonable: even if these dirty
>> folios are scanned again, we still cannot clean them.
>>
>> Therefore, these checks on dirty file folios are redundant and can be
>> removed. Dirty file folios should be moved directly to the active list
>> to avoid being scanned again. Since we set the PG_reclaim flag on these
>> dirty folios, once writeback completes they will be moved back to the
>> tail of the inactive list so that reclaim can retry them quickly.
>
> Is there any actual problem you are trying to address, or is this a code
> cleanup? How have you evaluated this change?
This is mostly a cleanup patch: since dirty file folios are also activated
in pageout(), there are essentially no significant logical changes.
Moreover, this patch set is a continuation of the previous cleanup work[1]
for dirty file folios, and further cleanup and optimization work for file
folio reclamation is still ongoing.

I ran some evaluations (such as building the kernel in a memcg to reclaim
file folios and running thpcompact to reclaim file folios), and I did not
observe any obvious change in reclaim efficiency.
[1]
https://lore.kernel.org/all/cover.1758166683.git.baolin.wang@linux.alibaba.com/T/#u