From nobody Fri Oct 3 19:11:49 2025
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, shuah@kernel.org
Cc: lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 rppt@kernel.org, surenb@google.com, mhocko@suse.com, npache@redhat.com,
 ryan.roberts@arm.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH 1/2] selftests/mm/uffd-stress: Make test operate on less
 hugetlb memory
Date: Tue, 26 Aug 2025 12:37:04 +0530
Message-Id: <20250826070705.53841-2-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250826070705.53841-1-dev.jain@arm.com>
References: <20250826070705.53841-1-dev.jain@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

We observed a uffd-stress selftest failure on arm64, and intermittent
failures on x86 too:

running ./uffd-stress hugetlb-private 128 32

bounces: 17, mode: rnd read, ERROR: UFFDIO_COPY error: -12 (errno=12, @uffd-common.c:617)
[FAIL]
not ok 18 uffd-stress hugetlb-private 128 32 # exit=1

For this particular case, the number of free hugepages from run_vmtests.sh
will be 128, and the test will allocate 64 hugepages in the source
location. The stress() function will start spawning threads which will
operate on the destination location, triggering uffd-operations like
UFFDIO_COPY from src to dst, which means that we will require 64 more
hugepages for the dst location.

Let us observe the locking_thread() function. It will lock the mutex kept
at dst, triggering uffd-copy. Suppose that 127 (64 for src and 63 for dst)
hugepages have been reserved. In case of BOUNCE_RANDOM, it may happen that
two threads try to lock the mutex at the same dst hugepage number. If one
thread succeeds in reserving the last hugepage, then the other thread may
fail in alloc_hugetlb_folio(), returning -ENOMEM.

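As a rough illustration of the accounting above, the following sketch
recomputes the numbers for the failing run. It is only a worked example
under stated assumptions: the 2048 KiB hugepage size is assumed (it is the
size for which 128 free hugepages yield the "128" MiB argument seen in the
log), and the helper names half_region_mb and pages_per_region are
hypothetical, not taken from run_vmtests.sh.

```bash
#!/bin/bash
# Worked example of the hugepage budget for the failing run.
# Assumption: 2048 KiB hugepages. Helper names are hypothetical.

# Size in MiB of each region (src and dst), mirroring the existing
# half_ufd_size_MB computation: (freepgs * hpgsize_KB) / 1024 / 2.
half_region_mb() {
	local freepgs=$1 hpgsize_kb=$2
	echo $(( (freepgs * hpgsize_kb) / 1024 / 2 ))
}

# Number of hugepages needed to back one region of size_mb MiB.
pages_per_region() {
	local size_mb=$1 hpgsize_kb=$2
	echo $(( (size_mb * 1024) / hpgsize_kb ))
}

freepgs=128
hpgsize_kb=2048

size_mb=$(half_region_mb "$freepgs" "$hpgsize_kb")
per_region=$(pages_per_region "$size_mb" "$hpgsize_kb")
# src and dst together consume 2 * per_region hugepages.
spare=$(( freepgs - 2 * per_region ))

echo "region size: ${size_mb} MiB, hugepages per region: ${per_region}, spare: ${spare}"
```

With these inputs src and dst together account for all 128 free hugepages,
so no spare hugepage is left to absorb a transient extra allocation from a
racing thread.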
I can confirm that this is indeed the case with this hacky patch:

    diff --git a/mm/hugetlb.c b/mm/hugetlb.c
    index 753f99b4c718..39eb21d8a91b 100644
    --- a/mm/hugetlb.c
    +++ b/mm/hugetlb.c
    @@ -6929,6 +6929,11 @@ int hugetlb_mfill_atomic_pte(pte_t *dst_pte,

     		folio = alloc_hugetlb_folio(dst_vma, dst_addr, false);
     		if (IS_ERR(folio)) {
    +			pte_t *actual_pte = hugetlb_walk(dst_vma, dst_addr, PMD_SIZE);
    +			if (actual_pte) {
    +				ret = -EEXIST;
    +				goto out;
    +			}
     			ret = -ENOMEM;
     			goto out;
     		}

This code path gets triggered, indicating that the PMD at which one thread
is trying to map a hugepage gets filled by a racing thread.

Therefore, instead of using freepgs to compute the amount of memory,
use freepgs - 10, so that the test still has some extra hugepages to use.
Note that, in case this value underflows, there is a check for the number
of free hugepages in the test itself, which will fail, so we are safe.

Signed-off-by: Dev Jain
Tested-by: Ryan Roberts
---
 tools/testing/selftests/mm/run_vmtests.sh | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/run_vmtests.sh b/tools/testing/selftests/mm/run_vmtests.sh
index 471e539d82b8..6a9f435be7a1 100755
--- a/tools/testing/selftests/mm/run_vmtests.sh
+++ b/tools/testing/selftests/mm/run_vmtests.sh
@@ -326,7 +326,7 @@ CATEGORY="userfaultfd" run_test ${uffd_stress_bin} anon 20 16
 # the size of the free pages we have, which is used for *each*.
 # uffd-stress expects a region expressed in MiB, so we adjust
 # half_ufd_size_MB accordingly.
-half_ufd_size_MB=$(((freepgs * hpgsize_KB) / 1024 / 2))
+half_ufd_size_MB=$((((freepgs - 10) * hpgsize_KB) / 1024 / 2))
 CATEGORY="userfaultfd" run_test ${uffd_stress_bin} hugetlb "$half_ufd_size_MB" 32
 CATEGORY="userfaultfd" run_test ${uffd_stress_bin} hugetlb-private "$half_ufd_size_MB" 32
 CATEGORY="userfaultfd" run_test ${uffd_stress_bin} shmem 20 16
-- 
2.30.2


From nobody Fri Oct 3 19:11:49 2025
From: Dev Jain
To: akpm@linux-foundation.org, david@redhat.com, shuah@kernel.org
Cc: lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
 rppt@kernel.org, surenb@google.com, mhocko@suse.com, npache@redhat.com,
 ryan.roberts@arm.com, linux-mm@kvack.org, linux-kselftest@vger.kernel.org,
 linux-kernel@vger.kernel.org, Dev Jain
Subject: [PATCH 2/2] selftests/mm/uffd-stress: Stricten constraint on free
 hugepages before the test
Date: Tue, 26 Aug 2025 12:37:05 +0530
Message-Id: <20250826070705.53841-3-dev.jain@arm.com>
X-Mailer: git-send-email 2.39.5 (Apple Git-154)
In-Reply-To: <20250826070705.53841-1-dev.jain@arm.com>
References: <20250826070705.53841-1-dev.jain@arm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

The test requires at least 2 * (bytes / page_size) hugepages, since the
src and dst locations each need the same number of hugepages. The current
check only compares against bytes / page_size. Fix this.

Along with the above, as explained in patch "selftests/mm/uffd-stress:
Make test operate on less hugetlb memory", the racy nature of the test
requires that we have some extra hugepages left beyond what is strictly
required. Therefore, tighten this constraint.

Fixes: 5a6aa60d1823 ("selftests/mm: skip uffd hugetlb tests with insufficient hugepages")
Signed-off-by: Dev Jain
---
 tools/testing/selftests/mm/uffd-stress.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/mm/uffd-stress.c b/tools/testing/selftests/mm/uffd-stress.c
index 40af7f67c407..eb0b37f08061 100644
--- a/tools/testing/selftests/mm/uffd-stress.c
+++ b/tools/testing/selftests/mm/uffd-stress.c
@@ -449,7 +449,7 @@ int main(int argc, char **argv)
 	bytes = atol(argv[2]) * 1024 * 1024;
 
 	if (test_type == TEST_HUGETLB &&
-	    get_free_hugepages() < bytes / page_size) {
+	    get_free_hugepages() < 2 * (bytes / page_size) + 10) {
 		printf("skip: Skipping userfaultfd... not enough hugepages\n");
 		return KSFT_SKIP;
 	}
-- 
2.30.2
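
To see how the two changes in this series fit together, here is a small
sketch that recomputes the region size with the 10-hugepage headroom and
then applies the tightened skip condition. It is a worked example rather
than code from the tree: the 2048 KiB hugepage size is an assumption, and
half_region_mb_with_headroom and should_skip are hypothetical helper names.
The formulas themselves are the ones in the two diffs.

```bash
#!/bin/bash
# Worked example combining both patches.
# Assumption: 2048 KiB hugepages. Helper names are hypothetical.
HEADROOM=10   # extra hugepages kept free, as in both patches

# Patch 1: ((freepgs - 10) * hpgsize_KB) / 1024 / 2, in MiB.
half_region_mb_with_headroom() {
	local freepgs=$1 hpgsize_kb=$2
	echo $(( ((freepgs - HEADROOM) * hpgsize_kb) / 1024 / 2 ))
}

# Patch 2: skip when free hugepages < 2 * (bytes / page_size) + 10.
# Returns 0 (success) when the test should be skipped.
should_skip() {
	local free_hugepages=$1 size_mb=$2 hpgsize_kb=$3
	local bytes=$(( size_mb * 1024 * 1024 ))
	local page_size=$(( hpgsize_kb * 1024 ))
	(( free_hugepages < 2 * (bytes / page_size) + HEADROOM ))
}

freepgs=128
hpgsize_kb=2048
size_mb=$(half_region_mb_with_headroom "$freepgs" "$hpgsize_kb")

if should_skip "$freepgs" "$size_mb" "$hpgsize_kb"; then
	echo "skip: not enough hugepages"
else
	echo "run with ${size_mb} MiB per region"
fi
```

Under these assumptions each region shrinks to 118 MiB (59 hugepages), so
src and dst need 118 hugepages and the check requires 128 free, which is
exactly what is available. One fewer free hugepage would make the test skip
instead of failing intermittently.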