Suppose a folio is under migration, and khugepaged is also trying to
collapse it. collapse_pte_mapped_thp() will retrieve the folio from the
page cache via filemap_lock_folio(), thus taking a reference on the folio
and sleeping on the folio lock, since the lock is held by the migration
path. Migration will then fail in
__folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of
such a race happening (leading to migration failure) by bailing out
if we detect a PMD is marked with a migration entry.
This fixes the migration-shared-anon-thp testcase failure on Apple M3.
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
mm/khugepaged.c | 12 ++++++++++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 4c8d33abfbd8..bc8774f62e86 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -31,6 +31,7 @@ enum scan_result {
SCAN_FAIL,
SCAN_SUCCEED,
SCAN_PMD_NULL,
+ SCAN_PMD_MIGRATION,
SCAN_PMD_NONE,
SCAN_PMD_MAPPED,
SCAN_EXCEED_NONE_PTE,
@@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd)
if (pmd_none(pmde))
return SCAN_PMD_NONE;
+ if (is_pmd_migration_entry(pmde))
+ return SCAN_PMD_MIGRATION;
if (!pmd_present(pmde))
return SCAN_PMD_NULL;
if (pmd_trans_huge(pmde))
@@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr,
!range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE))
return SCAN_VMA_CHECK;
- /* Fast check before locking page if already PMD-mapped */
+ /*
+ * Fast check before locking folio if already PMD-mapped, or if the
+ * folio is under migration
+ */
result = find_pmd_or_thp_or_none(mm, haddr, &pmd);
- if (result == SCAN_PMD_MAPPED)
+ if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION)
return result;
/*
@@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r)
case SCAN_PAGE_LRU:
case SCAN_DEL_PAGE_LRU:
case SCAN_PAGE_FILLED:
+ case SCAN_PMD_MIGRATION:
return -EAGAIN;
/*
* Other: Trying again likely not to succeed / error intrinsic to
@@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev,
goto handle_result;
/* Whitelisted set of results where continuing OK */
case SCAN_PMD_NULL:
+ case SCAN_PMD_MIGRATION:
case SCAN_PTE_NON_PRESENT:
case SCAN_PTE_UFFD_WP:
case SCAN_PAGE_RO:
--
2.30.2
On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > Suppose a folio is under migration, and khugepaged is also trying to > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > page cache via filemap_lock_folio(), thus taking a reference on the folio > and sleeping on the folio lock, since the lock is held by the migration > path. Migration will then fail in > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > such a race happening (leading to migration failure) by bailing out > if we detect a PMD is marked with a migration entry. > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. Hm is this related to the series at all? Seems somewhat unrelated? Is there a Fixes, Closes, etc.? Do we need something in stable? > > Signed-off-by: Dev Jain <dev.jain@arm.com> > --- > mm/khugepaged.c | 12 ++++++++++-- > 1 file changed, 10 insertions(+), 2 deletions(-) > > diff --git a/mm/khugepaged.c b/mm/khugepaged.c > index 4c8d33abfbd8..bc8774f62e86 100644 > --- a/mm/khugepaged.c > +++ b/mm/khugepaged.c > @@ -31,6 +31,7 @@ enum scan_result { > SCAN_FAIL, > SCAN_SUCCEED, > SCAN_PMD_NULL, > + SCAN_PMD_MIGRATION, > SCAN_PMD_NONE, > SCAN_PMD_MAPPED, > SCAN_EXCEED_NONE_PTE, > @@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd) > > if (pmd_none(pmde)) > return SCAN_PMD_NONE; > + if (is_pmd_migration_entry(pmde)) > + return SCAN_PMD_MIGRATION; > if (!pmd_present(pmde)) > return SCAN_PMD_NULL; > if (pmd_trans_huge(pmde)) > @@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, > !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE)) > return SCAN_VMA_CHECK; > > - /* Fast check before locking page if already PMD-mapped */ > + /* > + * Fast check before locking folio if already PMD-mapped, or if the > + * folio is under migration > + */ > result = find_pmd_or_thp_or_none(mm, haddr, &pmd); > - if (result == SCAN_PMD_MAPPED) > + if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION) > return result; > > /* > @@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r) > case SCAN_PAGE_LRU: > case SCAN_DEL_PAGE_LRU: > case SCAN_PAGE_FILLED: > + case SCAN_PMD_MIGRATION: > return -EAGAIN; > /* > * Other: Trying again likely not to succeed / error intrinsic to > @@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, > goto handle_result; > /* Whitelisted set of results where continuing OK */ > case SCAN_PMD_NULL: > + case SCAN_PMD_MIGRATION: > case SCAN_PTE_NON_PRESENT: > case SCAN_PTE_UFFD_WP: > case SCAN_PAGE_RO: > -- > 2.30.2 >
On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >> Suppose a folio is under migration, and khugepaged is also trying to >> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >> page cache via filemap_lock_folio(), thus taking a reference on the folio >> and sleeping on the folio lock, since the lock is held by the migration >> path. Migration will then fail in >> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >> such a race happening (leading to migration failure) by bailing out >> if we detect a PMD is marked with a migration entry. >> >> This fixes the migration-shared-anon-thp testcase failure on Apple M3. > Hm is this related to the series at all? Seems somewhat unrelated? Not related. > > Is there a Fixes, Closes, etc.? Do we need something in stable? We don't need anything. This is an "expected race" in the sense that both migration and khugepaged collapse are best effort algorithms. I am just seeing a test failure on my system because my system hits the race more often. So this patch reduces the window for the race. For some previous context, you may look at: https://lore.kernel.org/all/20240801081657.1386743-1-dev.jain@arm.com/ and https://lore.kernel.org/all/20240830051609.4037834-1-dev.jain@arm.com/ > >> Signed-off-by: Dev Jain <dev.jain@arm.com> >> --- >> mm/khugepaged.c | 12 ++++++++++-- >> 1 file changed, 10 insertions(+), 2 deletions(-) >> >> diff --git a/mm/khugepaged.c b/mm/khugepaged.c >> index 4c8d33abfbd8..bc8774f62e86 100644 >> --- a/mm/khugepaged.c >> +++ b/mm/khugepaged.c >> @@ -31,6 +31,7 @@ enum scan_result { >> SCAN_FAIL, >> SCAN_SUCCEED, >> SCAN_PMD_NULL, >> + SCAN_PMD_MIGRATION, >> SCAN_PMD_NONE, >> SCAN_PMD_MAPPED, >> SCAN_EXCEED_NONE_PTE, >> @@ -956,6 +957,8 @@ static inline int check_pmd_state(pmd_t *pmd) >> >> if (pmd_none(pmde)) >> return SCAN_PMD_NONE; >> + if (is_pmd_migration_entry(pmde)) >> + return SCAN_PMD_MIGRATION; >> if (!pmd_present(pmde)) >> return SCAN_PMD_NULL; >> if (pmd_trans_huge(pmde)) >> @@ -1518,9 +1521,12 @@ int collapse_pte_mapped_thp(struct mm_struct *mm, unsigned long addr, >> !range_in_vma(vma, haddr, haddr + HPAGE_PMD_SIZE)) >> return SCAN_VMA_CHECK; >> >> - /* Fast check before locking page if already PMD-mapped */ >> + /* >> + * Fast check before locking folio if already PMD-mapped, or if the >> + * folio is under migration >> + */ >> result = find_pmd_or_thp_or_none(mm, haddr, &pmd); >> - if (result == SCAN_PMD_MAPPED) >> + if (result == SCAN_PMD_MAPPED || result == SCAN_PMD_MIGRATION) >> return result; >> >> /* >> @@ -2745,6 +2751,7 @@ static int madvise_collapse_errno(enum scan_result r) >> case SCAN_PAGE_LRU: >> case SCAN_DEL_PAGE_LRU: >> case SCAN_PAGE_FILLED: >> + case SCAN_PMD_MIGRATION: >> return -EAGAIN; >> /* >> * Other: Trying again likely not to succeed / error intrinsic to >> @@ -2834,6 +2841,7 @@ int madvise_collapse(struct vm_area_struct *vma, struct vm_area_struct **prev, >> goto handle_result; >> /* Whitelisted set of results where continuing OK */ >> case SCAN_PMD_NULL: >> + case SCAN_PMD_MIGRATION: >> case SCAN_PTE_NON_PRESENT: >> case SCAN_PTE_UFFD_WP: >> case SCAN_PAGE_RO: >> -- >> 2.30.2 >>
On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > Suppose a folio is under migration, and khugepaged is also trying to > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > and sleeping on the folio lock, since the lock is held by the migration > > > path. Migration will then fail in > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > such a race happening (leading to migration failure) by bailing out > > > if we detect a PMD is marked with a migration entry. > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > Hm is this related to the series at all? Seems somewhat unrelated? > > Not related. > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > We don't need anything. This is an "expected race" in the sense that > both migration and khugepaged collapse are best effort algorithms. > I am just seeing a test failure on my system because my system hits > the race more often. So this patch reduces the window for the race. Does it rely on previous patches? If not probably best to send this one separately :)
On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>> Suppose a folio is under migration, and khugepaged is also trying to >>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>> and sleeping on the folio lock, since the lock is held by the migration >>>> path. Migration will then fail in >>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>> such a race happening (leading to migration failure) by bailing out >>>> if we detect a PMD is marked with a migration entry. >>>> >>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>> Hm is this related to the series at all? Seems somewhat unrelated? >> Not related. >> >>> Is there a Fixes, Closes, etc.? Do we need something in stable? >> We don't need anything. This is an "expected race" in the sense that >> both migration and khugepaged collapse are best effort algorithms. >> I am just seeing a test failure on my system because my system hits >> the race more often. So this patch reduces the window for the race. > Does it rely on previous patches? If not probably best to send this one > separately :) To prevent rebasing headaches for others (if any) I thought to send all together. I'll send it separately if still that is the preference.
On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: > > On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > > > Suppose a folio is under migration, and khugepaged is also trying to > > > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > > > and sleeping on the folio lock, since the lock is held by the migration > > > > > path. Migration will then fail in > > > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > > > such a race happening (leading to migration failure) by bailing out > > > > > if we detect a PMD is marked with a migration entry. > > > > > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > > > Hm is this related to the series at all? Seems somewhat unrelated? > > > Not related. > > > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > > We don't need anything. This is an "expected race" in the sense that > > > both migration and khugepaged collapse are best effort algorithms. > > > I am just seeing a test failure on my system because my system hits > > > the race more often. So this patch reduces the window for the race. > > Does it rely on previous patches? If not probably best to send this one > > separately :) > > To prevent rebasing headaches for others (if any) I thought to send all together. > I'll send it separately if still that is the preference. > > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe leave it then. And I can actually finally review it... :)
On Thu, Jun 26, 2025 at 06:02:25AM +0100, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: > > > > On 26/06/25 10:27 am, Lorenzo Stoakes wrote: > > > On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: > > > > On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: > > > > > On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: > > > > > > Suppose a folio is under migration, and khugepaged is also trying to > > > > > > collapse it. collapse_pte_mapped_thp() will retrieve the folio from the > > > > > > page cache via filemap_lock_folio(), thus taking a reference on the folio > > > > > > and sleeping on the folio lock, since the lock is held by the migration > > > > > > path. Migration will then fail in > > > > > > __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of > > > > > > such a race happening (leading to migration failure) by bailing out > > > > > > if we detect a PMD is marked with a migration entry. > > > > > > > > > > > > This fixes the migration-shared-anon-thp testcase failure on Apple M3. > > > > > Hm is this related to the series at all? Seems somewhat unrelated? > > > > Not related. > > > > > > > > > Is there a Fixes, Closes, etc.? Do we need something in stable? > > > > We don't need anything. This is an "expected race" in the sense that > > > > both migration and khugepaged collapse are best effort algorithms. > > > > I am just seeing a test failure on my system because my system hits > > > > the race more often. So this patch reduces the window for the race. > > > Does it rely on previous patches? If not probably best to send this one > > > separately :) > > > > To prevent rebasing headaches for others (if any) I thought to send all together. > > I'll send it separately if still that is the preference. > > > > > > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe > leave it then. And I can actually finally review it... :) At the same time you'd have to update cover letter to mention and it's really not good practice to include unrelated patches to a series. So probably best overall to send separately but _wait_ for this series to be merged first :>) then you can based it on mm-new without problems.
On 26/06/25 10:36 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 06:02:25AM +0100, Lorenzo Stoakes wrote: >> On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: >>> On 26/06/25 10:27 am, Lorenzo Stoakes wrote: >>>> On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >>>>> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>>>>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>>>>> Suppose a folio is under migration, and khugepaged is also trying to >>>>>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>>>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>>>>> and sleeping on the folio lock, since the lock is held by the migration >>>>>>> path. Migration will then fail in >>>>>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>>>>> such a race happening (leading to migration failure) by bailing out >>>>>>> if we detect a PMD is marked with a migration entry. >>>>>>> >>>>>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>>>>> Hm is this related to the series at all? Seems somewhat unrelated? >>>>> Not related. >>>>> >>>>>> Is there a Fixes, Closes, etc.? Do we need something in stable? >>>>> We don't need anything. This is an "expected race" in the sense that >>>>> both migration and khugepaged collapse are best effort algorithms. >>>>> I am just seeing a test failure on my system because my system hits >>>>> the race more often. So this patch reduces the window for the race. >>>> Does it rely on previous patches? If not probably best to send this one >>>> separately :) >>> To prevent rebasing headaches for others (if any) I thought to send all together. >>> I'll send it separately if still that is the preference. >>> >>> >> Oh actually would it be a pain to rebase given the previous 2 patches? Maybe >> leave it then. And I can actually finally review it... :) > At the same time you'd have to update cover letter to mention and it's really > not good practice to include unrelated patches to a series. > > So probably best overall to send separately but _wait_ for this series to be > merged first :>) then you can based it on mm-new without problems. Sure.
On 26/06/25 10:32 am, Lorenzo Stoakes wrote: > On Thu, Jun 26, 2025 at 10:29:22AM +0530, Dev Jain wrote: >> On 26/06/25 10:27 am, Lorenzo Stoakes wrote: >>> On Thu, Jun 26, 2025 at 09:22:28AM +0530, Dev Jain wrote: >>>> On 25/06/25 6:58 pm, Lorenzo Stoakes wrote: >>>>> On Wed, Jun 25, 2025 at 11:28:06AM +0530, Dev Jain wrote: >>>>>> Suppose a folio is under migration, and khugepaged is also trying to >>>>>> collapse it. collapse_pte_mapped_thp() will retrieve the folio from the >>>>>> page cache via filemap_lock_folio(), thus taking a reference on the folio >>>>>> and sleeping on the folio lock, since the lock is held by the migration >>>>>> path. Migration will then fail in >>>>>> __folio_migrate_mapping -> folio_ref_freeze. Reduce the probability of >>>>>> such a race happening (leading to migration failure) by bailing out >>>>>> if we detect a PMD is marked with a migration entry. >>>>>> >>>>>> This fixes the migration-shared-anon-thp testcase failure on Apple M3. >>>>> Hm is this related to the series at all? Seems somewhat unrelated? >>>> Not related. >>>> >>>>> Is there a Fixes, Closes, etc.? Do we need something in stable? >>>> We don't need anything. This is an "expected race" in the sense that >>>> both migration and khugepaged collapse are best effort algorithms. >>>> I am just seeing a test failure on my system because my system hits >>>> the race more often. So this patch reduces the window for the race. >>> Does it rely on previous patches? If not probably best to send this one >>> separately :) >> To prevent rebasing headaches for others (if any) I thought to send all together. >> I'll send it separately if still that is the preference. >> >> > Oh actually would it be a pain to rebase given the previous 2 patches? Maybe Didn't take the time to actually check that, more of a "it *may* be a pain so let's send this together". > leave it then. And I can actually finally review it... :)
© 2016 - 2025 Red Hat, Inc.