[PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()

Hui Zhu posted 2 patches 1 month ago
[PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
Posted by Hui Zhu 1 month ago
From: Hui Zhu <zhuhui@kylinos.cn>

swap_cluster_alloc_table() assumes that the caller holds the following
locks:
ci->lock
percpu_swap_cluster.lock
si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)

There are five call paths leading to swap_cluster_alloc_table():
swap_alloc_hibernation_slot->cluster_alloc_swap_entry
->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table

swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
->isolate_lock_cluster->swap_cluster_alloc_table

swap_alloc_hibernation_slot->cluster_alloc_swap_entry
->swap_reclaim_full_clusters->isolate_lock_cluster
->swap_cluster_alloc_table

swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
->isolate_lock_cluster->swap_cluster_alloc_table

swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
->swap_cluster_alloc_table

Other paths correctly acquire the necessary locks before calling
swap_cluster_alloc_table().
But swap_reclaim_work() doesn't need do that because
swap_reclaim_full_clusters() just isolate si->full_clusters that
the tables of it must be allocated.
Then isolate_lock_cluster() that is called by
swap_reclaim_full_clusters() will never call swap_cluster_alloc_table().

This patch add a VM_WARN_ON to warning if a fill cluster will be handle
by swap_cluster_alloc_table()

Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
---
 mm/swapfile.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 94af29d1de88..3fc2eb30c187 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
 	spin_unlock(&si->lock);
 
 	if (found && !cluster_table_is_alloced(found)) {
+		/* Table of full cluster must be allocated. */
+		VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);
+
 		/* Only an empty free cluster's swap table can be freed. */
 		VM_WARN_ON_ONCE(list != &si->free_clusters);
 		VM_WARN_ON_ONCE(!cluster_is_empty(found));
-- 
2.43.0
Re: [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
Posted by Kairui Song 1 month ago
On Mon, Mar 9, 2026 at 4:06 PM Hui Zhu <hui.zhu@linux.dev> wrote:
>
> From: Hui Zhu <zhuhui@kylinos.cn>
>
> swap_cluster_alloc_table() assumes that the caller holds the following
> locks:
> ci->lock
> percpu_swap_cluster.lock
> si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)
>
> There are five call paths leading to swap_cluster_alloc_table():
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
> ->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
>
> swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
> ->isolate_lock_cluster->swap_cluster_alloc_table
>
> swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
>
> Other paths correctly acquire the necessary locks before calling
> swap_cluster_alloc_table().
> But swap_reclaim_work() doesn't need do that because
> swap_reclaim_full_clusters() just isolate si->full_clusters that
> the tables of it must be allocated.
> Then isolate_lock_cluster() that is called by
> swap_reclaim_full_clusters() will never call swap_cluster_alloc_table().
>
> This patch add a VM_WARN_ON to warning if a fill cluster will be handle
> by swap_cluster_alloc_table()
>
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---
>  mm/swapfile.c | 3 +++
>  1 file changed, 3 insertions(+)

Hi Hui,

Thanks for the quick update.

>
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 94af29d1de88..3fc2eb30c187 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
>         spin_unlock(&si->lock);
>
>         if (found && !cluster_table_is_alloced(found)) {
> +               /* Table of full cluster must be allocated. */
> +               VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);
> +
>                 /* Only an empty free cluster's swap table can be freed. */
>                 VM_WARN_ON_ONCE(list != &si->free_clusters);
>                 VM_WARN_ON_ONCE(!cluster_is_empty(found));

Hmm, now as I see it, we actually already have the "list !=
&si->free_clusters" and cluster_is_empty check, adding more debug
checks seems not that necessary. Or it might be better if you check
ci->flags != CLUSTER_FLAG_FREE to be more strict? I saw YoungJun have
the same idea here. Also you might want the _ONCE version.

The later lockdep check could be helpful, but for chores like this I
think having one patch is enough if there is no strong reason to split
them. These two patches are all hardening the swap table allocation
path with more sanity checks, right?
Re: [PATCH v2 1/2] mm/swap: Add VM_WARN_ON to isolate_lock_cluster()
Posted by YoungJun Park 1 month ago
On Mon, Mar 09, 2026 at 04:05:41PM +0800, Hui Zhu wrote:
> From: Hui Zhu <zhuhui@kylinos.cn>
> 
> swap_cluster_alloc_table() assumes that the caller holds the following
> locks:
> ci->lock
> percpu_swap_cluster.lock
> si->global_cluster_lock (required for non-SWP_SOLIDSTATE devices)
> 
> There are five call paths leading to swap_cluster_alloc_table():
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->alloc_swap_scan_list->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_alloc_slow->cluster_alloc_swap_entry->alloc_swap_scan_list
> ->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_alloc_hibernation_slot->cluster_alloc_swap_entry
> ->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
> 
> swap_alloc_slow->cluster_alloc_swap_entry->swap_reclaim_full_clusters
> ->isolate_lock_cluster->swap_cluster_alloc_table
> 
> swap_reclaim_work->swap_reclaim_full_clusters->isolate_lock_cluster
> ->swap_cluster_alloc_table
> 
> Other paths correctly acquire the necessary locks before calling
> swap_cluster_alloc_table().
> But swap_reclaim_work() doesn't need do that because
> swap_reclaim_full_clusters() just isolate si->full_clusters that
> the tables of it must be allocated.
> Then isolate_lock_cluster() that is called by
> swap_reclaim_full_clusters() will never call swap_cluster_alloc_table().
> 
> This patch add a VM_WARN_ON to warning if a fill cluster will be handle
> by swap_cluster_alloc_table()
> 
> Signed-off-by: Hui Zhu <zhuhui@kylinos.cn>
> ---
>  mm/swapfile.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/mm/swapfile.c b/mm/swapfile.c
> index 94af29d1de88..3fc2eb30c187 100644
> --- a/mm/swapfile.c
> +++ b/mm/swapfile.c
> @@ -596,6 +596,9 @@ static struct swap_cluster_info *isolate_lock_cluster(
>  	spin_unlock(&si->lock);
>  
>  	if (found && !cluster_table_is_alloced(found)) {
> +		/* Table of full cluster must be allocated. */
> +		VM_WARN_ON(ci->flags == CLUSTER_FLAG_FULL);


As Kairui suggested,
Would a VM_WARN to check if `list == si->free_cluster` be good?
Or perhaps `ci->flags != CLUSTER_FLAG_FREE`?

This check only cares about CLUSTER_FLAG_FULL, so it might be better to
cover other flags except the free flag. 

Thanks! 
Youngjun Park