Suppose a folio is under migration, and khugepaged is also trying to
collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
page cache via filemap_lock_folio(), thus taking a reference on the folio
and sleeping on the folio lock, since the lock is held by the migration
path. Migration will then fail in
__folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of
such a race happening (leading to migration failure) by bailing out
if we detect a PMD is marked with a migration entry.
This fixes the migration-shared-anon-thp testcase failure on Apple M3.
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/khugepaged.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4c8d33abfbd8..bc8774f62e86 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -31,6 +31,7 @@ enum scan_result {
SCAN_FAIL,
SCAN_SUCCEED,
SCAN_PMD_NULL,
+ SCAN_PMD_MIGRATION,
SCAN_PMD_NONE,
SCAN_PMD_MAPPED,
SCAN_EXCEED_NONE_PTE,
@@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd)
if (pmd_none(pmde))
return SCAN_PMD_NONE;
+ if (is_pmd_migration_entry(pmde))
+ return SCAN_PMD_MIGRATION;
if (!pmd_present(pmde))
return SCAN_PMD_NULL;
if (pmd_trans_huge(pmde))
@@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
!range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE))
return SCAN_VMA_CHECK;
- /* Fast check before locking page if already PMD-mapped */
+ /*
+ * Fast check before locking folio if already PMD-mapped, or if the
+ * folio is under migration
+ */
result = find_pmd_or_thp_or_none(mm, haddr, &pmd);
- if (result == SCAN_PMD_MAPPED)
+ if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION)
return result;
/*
@@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r)
case SCAN_PAGE_LRU:
case SCAN_DEL_PAGE_LRU:
case SCAN_PAGE_FILLED:
+ case SCAN_PMD_MIGRATION:
return -EAGAIN;
/*
* Other: Trying again likely not to succeed / error intrinsic to
@@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
goto handle_result;
/* Whitelisted set of results where continuing OK */
case SCAN_PMD_NULL:
+ case SCAN_PMD_MIGRATION:
case SCAN_PTE_NON_PRESENT:
case SCAN_PTE_UFFD_WP:
case SCAN_PAGE_RO:
--
2.30.2
On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote:
> Suppose a folio is under migration, and khugepaged is also trying to
> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
> page cache via filemap_lock_folio(), thus taking a reference on the folio
> and sleeping on the folio lock, since the lock is held by the migration
> path. Migration will then fail in
> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of
> such a race happening (leading to migration failure) by bailing out
> if we detect a PMD is marked with a migration entry.
>
> This fixes the migration-shared-anon-thp testcase failure on Apple M3.
Hm is this related to the series at all? Seems somewhat unrelated?
Is there a Fixes, Closes, etc.? Do we need something in stable?
>
> Signed-off-by: Dev Jain <dev.jain@arm.com>
> ---
> mm/khugepaged.c | 12 ++++++++++--
> 1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 4c8d33abfbd8..bc8774f62e86 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -31,6 +31,7 @@ enum scan_result {
> SCAN_FAIL,
> SCAN_SUCCEED,
> SCAN_PMD_NULL,
> + SCAN_PMD_MIGRATION,
> SCAN_PMD_NONE,
> SCAN_PMD_MAPPED,
> SCAN_EXCEED_NONE_PTE,
> @@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd)
>
> if (pmd_none(pmde))
> return SCAN_PMD_NONE;
> + if (is_pmd_migration_entry(pmde))
> + return SCAN_PMD_MIGRATION;
> if (!pmd_present(pmde))
> return SCAN_PMD_NULL;
> if (pmd_trans_huge(pmde))
> @@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
> !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE))
> return SCAN_VMA_CHECK;
>
> - /* Fast check before locking page if already PMD-mapped */
> + /*
> + * Fast check before locking folio if already PMD-mapped, or if the
> + * folio is under migration
> + */
> result = find_pmd_or_thp_or_none(mm, haddr, &pmd);
> - if (result == SCAN_PMD_MAPPED)
> + if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION)
> return result;
>
> /*
> @@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r)
> case SCAN_PAGE_LRU:
> case SCAN_DEL_PAGE_LRU:
> case SCAN_PAGE_FILLED:
> + case SCAN_PMD_MIGRATION:
> return -EAGAIN;
> /*
> * Other: Trying again likely not to succeed / error intrinsic to
> @@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
> goto handle_result;
> /* Whitelisted set of results where continuing OK */
> case SCAN_PMD_NULL:
> + case SCAN_PMD_MIGRATION:
> case SCAN_PTE_NON_PRESENT:
> case SCAN_PTE_UFFD_WP:
> case SCAN_PAGE_RO:
> --
> 2.30.2
>
On 25/06/25 6:58 pm, Lorenzo Stoakes wrote:
> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote:
>> Suppose a folio is under migration, and khugepaged is also trying to
>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
>> page cache via filemap_lock_folio(), thus taking a reference on the folio
>> and sleeping on the folio lock, since the lock is held by the migration
>> path. Migration will then fail in
>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of
>> such a race happening (leading to migration failure) by bailing out
>> if we detect a PMD is marked with a migration entry.
>>
>> This fixes the migration-shared-anon-thp testcase failure on Apple M3.
> Hm is this related to the series at all? Seems somewhat unrelated?
Not related.
>
> Is there a Fixes, Closes, etc.? Do we need something in stable?
We don't need anything. This is an "expected race" in the sense that
both migration and khugepaged collapse are best effort algorithms.
I am just seeing a test failure on my system because my system hits
the race more often. So this patch reduces the window for the race.
For some previous context, you may look at:
https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/
and
https://lore.kernel.org/all/20240830051609.4037834-1-dev.jain@arm.com/
>
>> Signed-off-by: Dev Jain <dev.jain@arm.com>
>> ---
>> mm/khugepaged.c | 12 ++++++++++--
>> 1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
>> index 4c8d33abfbd8..bc8774f62e86 100644
>> --- a/mm/khugepaged.c
>> +++ b/mm/khugepaged.c
>> @@ -31,6 +31,7 @@ enum scan_result {
>> SCAN_FAIL,
>> SCAN_SUCCEED,
>> SCAN_PMD_NULL,
>> + SCAN_PMD_MIGRATION,
>> SCAN_PMD_NONE,
>> SCAN_PMD_MAPPED,
>> SCAN_EXCEED_NONE_PTE,
>> @@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd)
>>
>> if (pmd_none(pmde))
>> return SCAN_PMD_NONE;
>> + if (is_pmd_migration_entry(pmde))
>> + return SCAN_PMD_MIGRATION;
>> if (!pmd_present(pmde))
>> return SCAN_PMD_NULL;
>> if (pmd_trans_huge(pmde))
>> @@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
>> !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE))
>> return SCAN_VMA_CHECK;
>>
>> - /* Fast check before locking page if already PMD-mapped */
>> + /*
>> + * Fast check before locking folio if already PMD-mapped, or if the
>> + * folio is under migration
>> + */
>> result = find_pmd_or_thp_or_none(mm, haddr, &pmd);
>> - if (result == SCAN_PMD_MAPPED)
>> + if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION)
>> return result;
>>
>> /*
>> @@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r)
>> case SCAN_PAGE_LRU:
>> case SCAN_DEL_PAGE_LRU:
>> case SCAN_PAGE_FILLED:
>> + case SCAN_PMD_MIGRATION:
>> return -EAGAIN;
>> /*
>> * Other: Trying again likely not to succeed / error intrinsic to
>> @@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
>> goto handle_result;
>> /* Whitelisted set of results where continuing OK */
>> case SCAN_PMD_NULL:
>> + case SCAN_PMD_MIGRATION:
>> case SCAN_PTE_NON_PRESENT:
>> case SCAN_PTE_UFFD_WP:
>> case SCAN_PAGE_RO:
>> --
>> 2.30.2
>>
On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > Suppose a folio is under migration, and khugepaged is also trying to > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > and sleeping on the folio lock, since the lock is held by the migration > > > path. Migration will then fail in > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > such a race happening (leading to migration failure) by bailing out > > > if we detect a PMD is marked with a migration entry. > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > Hm is this related to the series at all? Seems somewhat unrelated? > > Not related. > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > We don't need anything. This is an "expected race" in the sense that > both migration and khugepaged collapse are best effort algorithms. > I am just seeing a test failure on my system because my system hits > the race more often. So this patch reduces the window for the race. Does it rely on previous patches? If not probably best to send this one separately :)
On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>> Suppose a folio is under migration, and khugepaged is also trying to >>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>> and sleeping on the folio lock, since the lock is held by the migration >>>> path. Migration will then fail in >>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>> such a race happening (leading to migration failure) by bailing out >>>> if we detect a PMD is marked with a migration entry. >>>> >>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>> Hm is this related to the series at all? Seems somewhat unrelated? >> Not related. >> >>> Is there a Fixes, Closes, etc.? Do we need something in stable? >> We don't need anything. This is an "expected race" in the sense that >> both migration and khugepaged collapse are best effort algorithms. >> I am just seeing a test failure on my system because my system hits >> the race more often. So this patch reduces the window for the race. > Does it rely on previous patches? If not probably best to send this one > separately :) To prevent rebasing headaches for others (if any) I thought to send all together. I'll send it separately if still that is the preference.
On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: > > On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > > > Suppose a folio is under migration, and khugepaged is also trying to > > > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > > > and sleeping on the folio lock, since the lock is held by the migration > > > > > path. Migration will then fail in > > > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > > > such a race happening (leading to migration failure) by bailing out > > > > > if we detect a PMD is marked with a migration entry. > > > > > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > > > Hm is this related to the series at all? Seems somewhat unrelated? > > > Not related. > > > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > > We don't need anything. This is an "expected race" in the sense that > > > both migration and khugepaged collapse are best effort algorithms. > > > I am just seeing a test failure on my system because my system hits > > > the race more often. So this patch reduces the window for the race. > > Does it rely on previous patches? If not probably best to send this one > > separately :) > > To prevent rebasing headaches for others (if any) I thought to send all together. > I'll send it separately if still that is the preference. > > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe leave it then. And I can actually finally review it... :)
On Thu, Jun 26, 2025 at 06:02:25AM +0100, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: > > > > On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > > > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > > > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > > > > Suppose a folio is under migration, and khugepaged is also trying to > > > > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > > > > and sleeping on the folio lock, since the lock is held by the migration > > > > > > path. Migration will then fail in > > > > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > > > > such a race happening (leading to migration failure) by bailing out > > > > > > if we detect a PMD is marked with a migration entry. > > > > > > > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > > > > Hm is this related to the series at all? Seems somewhat unrelated? > > > > Not related. > > > > > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > > > We don't need anything. This is an "expected race" in the sense that > > > > both migration and khugepaged collapse are best effort algorithms. > > > > I am just seeing a test failure on my system because my system hits > > > > the race more often. So this patch reduces the window for the race. > > > Does it rely on previous patches? If not probably best to send this one > > > separately :) > > > > To prevent rebasing headaches for others (if any) I thought to send all together. > > I'll send it separately if still that is the preference. > > > > > > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe > leave it then. And I can actually finally review it... :) At the same time you'd have to update cover letter to mention and it's really not good practice to include unrelated patches to a series. So probably best overall to send separately but _wait_ for this series to be merged first :>) then you can based it on mm-new without problems.
On 26/06/25 10:36 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 06:02:25AM +0100, Lorenzo Stoakes wrote: >> On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: >>> On 26/06/25 10:27 am, Lorenzo Stoakes wrote: >>>> On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >>>>> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>>>>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>>>>> Suppose a folio is under migration, and khugepaged is also trying to >>>>>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>>>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>>>>> and sleeping on the folio lock, since the lock is held by the migration >>>>>>> path. Migration will then fail in >>>>>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>>>>> such a race happening (leading to migration failure) by bailing out >>>>>>> if we detect a PMD is marked with a migration entry. >>>>>>> >>>>>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>>>>> Hm is this related to the series at all? Seems somewhat unrelated? >>>>> Not related. >>>>> >>>>>> Is there a Fixes, Closes, etc.? Do we need something in stable? >>>>> We don't need anything. This is an "expected race" in the sense that >>>>> both migration and khugepaged collapse are best effort algorithms. >>>>> I am just seeing a test failure on my system because my system hits >>>>> the race more often. So this patch reduces the window for the race. >>>> Does it rely on previous patches? If not probably best to send this one >>>> separately :) >>> To prevent rebasing headaches for others (if any) I thought to send all together. >>> I'll send it separately if still that is the preference. >>> >>> >> Oh actually would it be a pain to rebase given the previous 2 patches? Maybe >> leave it then. And I can actually finally review it... :) > At the same time you'd have to update cover letter to mention and it's really > not good practice to include unrelated patches to a series. > > So probably best overall to send separately but _wait_ for this series to be > merged first :>) then you can based it on mm-new without problems. Sure.
On 26/06/25 10:32 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: >> On 26/06/25 10:27 am, Lorenzo Stoakes wrote: >>> On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >>>> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>>>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>>>> Suppose a folio is under migration, and khugepaged is also trying to >>>>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>>>> and sleeping on the folio lock, since the lock is held by the migration >>>>>> path. Migration will then fail in >>>>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>>>> such a race happening (leading to migration failure) by bailing out >>>>>> if we detect a PMD is marked with a migration entry. >>>>>> >>>>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>>>> Hm is this related to the series at all? Seems somewhat unrelated? >>>> Not related. >>>> >>>>> Is there a Fixes, Closes, etc.? Do we need something in stable? >>>> We don't need anything. This is an "expected race" in the sense that >>>> both migration and khugepaged collapse are best effort algorithms. >>>> I am just seeing a test failure on my system because my system hits >>>> the race more often. So this patch reduces the window for the race. >>> Does it rely on previous patches? If not probably best to send this one >>> separately :) >> To prevent rebasing headaches for others (if any) I thought to send all together. >> I'll send it separately if still that is the preference. >> >> > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe Didn't take the time to actually check that, more of a "it *may* be a pain so let's send this together". > leave it then. And I can actually finally review it... :)
© 2016 - 2026 Red Hat, Inc.