From: Donet Tom
To: Andrew Morton, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Cc: Ritesh Harjani, Baolin Wang, "Aneesh Kumar K . V", Zi Yan,
 David Hildenbrand, shuah Khan, Dev Jain
Subject: [PATCH] mm: migration: shared anonymous migration test is failing
Date: Thu, 19 Dec 2024 06:47:17 -0600
Message-ID: <20241219124717.4907-1-donettom@linux.ibm.com>
X-Mailer: git-send-email 2.43.5

The migration selftest is currently failing for shared anonymous
mappings due to a race condition.

During migration, the source folio's PTE is unmapped by nuking the PTE,
flushing the TLB, and then marking the page for migration (by creating
the swap entries). The issue arises when, immediately after the PTE is
nuked and the TLB is flushed, but before the page is marked for
migration, another thread accesses the page.
This triggers a page fault, and the page fault handler invokes
do_pte_missing() instead of do_swap_page(), as the page is not yet
marked for migration.

In the fault handling path, do_pte_missing() calls __do_fault() ->
shmem_fault() -> shmem_get_folio_gfp() -> filemap_get_entry(). This
eventually calls folio_try_get(), incrementing the reference count of
the folio undergoing migration. The thread then blocks on folio_lock(),
as the migration path holds the lock. This results in the migration
failing in __migrate_folio(), which expects the folio's reference count
to be 2. However, the reference count has been incremented by the fault
handler, leading to the failure.

The issue arises because the page is accessed after the PTE is nuked
and before the page is marked for migration. To address this, we have
updated the logic to first nuke the PTE, then mark the page for
migration, and only then flush the TLB.

With this patch, if the page is accessed immediately after nuking the
PTE, the TLB entry is still valid, so no fault occurs. After marking
the page for migration, flushing the TLB ensures that the next page
fault correctly triggers do_swap_page() and waits for the migration to
complete.

Test result without this patch
==============================
# ./tools/testing/selftests/mm/migration
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN migration.private_anon ...
# OK migration.private_anon
ok 1 migration.private_anon
# RUN migration.shared_anon ...
Didn't migrate 1 pages
# migration.c:175:shared_anon:Expected migrate(ptr, self->n1, self->n2) (-2) == 0 (0)
# shared_anon: Test terminated by assertion
# FAIL migration.shared_anon
not ok 2 migration.shared_anon
# RUN migration.private_anon_thp ...
# OK migration.private_anon_thp
ok 3 migration.private_anon_thp
# FAILED: 2 / 3 tests passed.
# Totals: pass:2 fail:1 xfail:0 xpass:0 skip:0 error:0

Test result with this patch
===========================
# ./tools/testing/selftests/mm/migration
TAP version 13
1..3
# Starting 3 tests from 1 test cases.
# RUN migration.private_anon ...
# OK migration.private_anon
ok 1 migration.private_anon
# RUN migration.shared_anon ...
# OK migration.shared_anon
ok 2 migration.shared_anon
# RUN migration.private_anon_thp ...
# OK migration.private_anon_thp
ok 3 migration.private_anon_thp
# PASSED: 3 / 3 tests passed.
# Totals: pass:3 fail:0 xfail:0 xpass:0 skip:0 error:0

Signed-off-by: Donet Tom
---
 mm/rmap.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index c6c4d4ea29a7..920ae46e977f 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2154,7 +2154,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 					hugetlb_vma_unlock_write(vma);
 			}
 			/* Nuke the hugetlb page table entry */
-			pteval = huge_ptep_clear_flush(vma, address, pvmw.pte);
+			pteval = huge_ptep_get_and_clear(mm, address, pvmw.pte);
 		} else {
 			flush_cache_page(vma, address, pfn);
 			/* Nuke the page table entry. */
@@ -2171,7 +2171,7 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 
 				set_tlb_ubc_flush_pending(mm, pteval, address);
 			} else {
-				pteval = ptep_clear_flush(vma, address, pvmw.pte);
+				pteval = ptep_get_and_clear(mm, address, pvmw.pte);
 			}
 		}
 
@@ -2320,6 +2320,14 @@ static bool try_to_migrate_one(struct folio *folio, struct vm_area_struct *vma,
 			folio_remove_rmap_pte(folio, subpage, vma);
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
+
+		if (!should_defer_flush(mm, flags)) {
+			if (folio_test_hugetlb(folio))
+				flush_hugetlb_page(vma, address);
+			else
+				flush_tlb_page(vma, address);
+		}
+
 		folio_put(folio);
 	}
 
-- 
2.37.2
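
The ordering argument in the commit message can be illustrated with a
small userspace toy model. This is only a sketch under stated
assumptions, not kernel code: every name in it (struct model,
concurrent_access(), simulate_unmap(), the two enums) is hypothetical,
the baseline reference count of 2 is taken from the description above,
and the model assumes that an access through a still-valid TLB entry
does not fault and that the do_swap_page() path takes no extra folio
reference.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: none of these names exist in the kernel. */
enum pte_state { PTE_PRESENT, PTE_NONE, PTE_MIGRATION_ENTRY };
enum unmap_order { FLUSH_BEFORE_MARK, MARK_BEFORE_FLUSH };

struct model {
	enum pte_state pte;
	bool tlb_valid;
	int refcount;	/* migration expects 2, per the commit message */
};

/* Another thread touches the page at this point in time. */
static void concurrent_access(struct model *m)
{
	if (m->tlb_valid)
		return;			/* cached translation, no fault */

	switch (m->pte) {
	case PTE_PRESENT:
		m->tlb_valid = true;	/* plain TLB refill */
		break;
	case PTE_NONE:
		/* Models do_pte_missing(): the lookup takes a folio reference. */
		m->refcount++;
		break;
	case PTE_MIGRATION_ENTRY:
		/* Models do_swap_page(): wait for migration, no extra reference. */
		break;
	}
}

/*
 * Replay the unmap sequence with a concurrent access in the window
 * after the PTE is cleared and again after both steps are done.
 * Returns the reference count that the migration path would observe.
 */
int simulate_unmap(enum unmap_order order)
{
	struct model m = { PTE_PRESENT, true, 2 };

	m.pte = PTE_NONE;			/* nuke the PTE */
	if (order == FLUSH_BEFORE_MARK)
		m.tlb_valid = false;		/* old order: flush right away */
	concurrent_access(&m);			/* the race window */

	m.pte = PTE_MIGRATION_ENTRY;		/* mark the page for migration */
	if (order == MARK_BEFORE_FLUSH)
		m.tlb_valid = false;		/* new order: flush last */
	concurrent_access(&m);

	assert(m.pte == PTE_MIGRATION_ENTRY && !m.tlb_valid);
	return m.refcount;
}
```

In this model, simulate_unmap(FLUSH_BEFORE_MARK) ends with a reference
count of 3, which corresponds to the failing case, while
simulate_unmap(MARK_BEFORE_FLUSH) leaves it at the expected 2.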