From nobody Tue Apr 7 04:46:05 2026
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, Andrew Morton,
    David Hildenbrand, Lorenzo Stoakes, "Liam R. Howlett",
    Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko
Subject: [PATCH v6 1/6] mm/Kconfig: changes for migrate on fault for device pages
Date: Mon, 16 Mar 2026 08:24:02 +0200
Message-ID: <20260316062407.3354636-2-mpenttil@redhat.com>
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>

From: Mika Penttilä

With the unified HMM/migrate_device page table walk, migrate_device
needs HMM enabled, and HMM in turn needs MMU notifiers. Select them
explicitly to avoid breaking random configs.

Cc: Andrew Morton
Cc: David Hildenbrand
Cc: Lorenzo Stoakes
Cc: "Liam R. Howlett"
Cc: Vlastimil Babka
Cc: Mike Rapoport
Cc: Suren Baghdasaryan
Cc: Michal Hocko
Signed-off-by: Mika Penttilä
---
 mm/Kconfig | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/Kconfig b/mm/Kconfig
index ebd8ea353687..583d92bba2e8 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -647,6 +647,7 @@ config MIGRATION
 
 config DEVICE_MIGRATION
 	def_bool MIGRATION && ZONE_DEVICE
+	select HMM_MIRROR
 
 config ARCH_ENABLE_HUGEPAGE_MIGRATION
 	bool
@@ -1222,6 +1223,7 @@ config ZONE_DEVICE
 config HMM_MIRROR
 	bool
 	depends on MMU
+	select MMU_NOTIFIER
 
 config GET_FREE_REGION
 	bool
-- 
2.50.0

From nobody Tue Apr 7 04:46:06 2026
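As an illustration of what the two new `select`s imply (assuming standard Kconfig semantics, where `select` forces the selected symbol on; this fragment is mine, not part of the patch), the resulting dependency chain looks like:

```kconfig
# With CONFIG_MIGRATION=y and CONFIG_ZONE_DEVICE=y:
#
#   DEVICE_MIGRATION=y          (def_bool MIGRATION && ZONE_DEVICE)
#     select HMM_MIRROR     ->  CONFIG_HMM_MIRROR=y
#       select MMU_NOTIFIER ->  CONFIG_MMU_NOTIFIER=y
#
# so a randconfig can no longer end up with DEVICE_MIGRATION=y while
# HMM_MIRROR or MMU_NOTIFIER stays disabled.
```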
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, David Hildenbrand,
    Jason Gunthorpe, Leon Romanovsky, Alistair Popple, Balbir Singh,
    Zi Yan, Matthew Brost
Subject: [PATCH v6 2/6] mm: Add helper to convert HMM pfn to migrate pfn
Date: Mon, 16 Mar 2026 08:24:03 +0200
Message-ID: <20260316062407.3354636-3-mpenttil@redhat.com>
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>

From: Mika Penttilä

The unified HMM/migrate_device pagewalk does the "collecting" on the
HMM side, so a helper is needed to transfer the pfns into the
migrate_vma world.
Cc: David Hildenbrand
Cc: Jason Gunthorpe
Cc: Leon Romanovsky
Cc: Alistair Popple
Cc: Balbir Singh
Cc: Zi Yan
Cc: Matthew Brost
Suggested-by: Alistair Popple
Signed-off-by: Mika Penttilä
---
 include/linux/hmm.h     | 18 +++++++++--
 include/linux/migrate.h |  3 +-
 mm/hmm.c                |  6 ----
 mm/migrate_device.c     | 69 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 87 insertions(+), 9 deletions(-)

diff --git a/include/linux/hmm.h b/include/linux/hmm.h
index db75ffc949a7..b5418e318782 100644
--- a/include/linux/hmm.h
+++ b/include/linux/hmm.h
@@ -12,7 +12,7 @@
 #include
 
 struct mmu_interval_notifier;
-
+struct migrate_vma;
 /*
  * On output:
  * 0             - The page is faultable and a future call with
@@ -27,6 +27,7 @@ struct mmu_interval_notifier;
  * HMM_PFN_P2PDMA_BUS - Bus mapped P2P transfer
  * HMM_PFN_DMA_MAPPED - Flag preserved on input-to-output transformation
  *                      to mark that page is already DMA mapped
+ * HMM_PFN_MIGRATE - Migrate PTE installed
  *
  * On input:
  * 0                 - Return the current state of the page, do not fault it.
@@ -34,6 +35,7 @@ struct mmu_interval_notifier;
  *                     will fail
  * HMM_PFN_REQ_WRITE - The output must have HMM_PFN_WRITE or hmm_range_fault()
  *                     will fail. Must be combined with HMM_PFN_REQ_FAULT.
+ * HMM_PFN_REQ_MIGRATE - For default_flags, request to migrate to device
  */
 enum hmm_pfn_flags {
 	/* Output fields and flags */
@@ -48,15 +50,25 @@ enum hmm_pfn_flags {
 	HMM_PFN_P2PDMA      = 1UL << (BITS_PER_LONG - 5),
 	HMM_PFN_P2PDMA_BUS  = 1UL << (BITS_PER_LONG - 6),
 
-	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 11),
+	/* Migrate request */
+	HMM_PFN_MIGRATE     = 1UL << (BITS_PER_LONG - 7),
+	HMM_PFN_COMPOUND    = 1UL << (BITS_PER_LONG - 8),
+	HMM_PFN_ORDER_SHIFT = (BITS_PER_LONG - 13),
 
 	/* Input flags */
 	HMM_PFN_REQ_FAULT = HMM_PFN_VALID,
 	HMM_PFN_REQ_WRITE = HMM_PFN_WRITE,
+	HMM_PFN_REQ_MIGRATE = HMM_PFN_MIGRATE,
 
 	HMM_PFN_FLAGS = ~((1UL << HMM_PFN_ORDER_SHIFT) - 1),
 };
 
+enum {
+	/* These flags are carried from input-to-output */
+	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA |
+			      HMM_PFN_P2PDMA_BUS,
+};
+
 /*
  * hmm_pfn_to_page() - return struct page pointed to by a device entry
  *
@@ -107,6 +119,7 @@ static inline unsigned int hmm_pfn_to_map_order(unsigned long hmm_pfn)
  * @default_flags: default flags for the range (write, read, ... see hmm doc)
  * @pfn_flags_mask: allows to mask pfn flags so that only default_flags matter
  * @dev_private_owner: owner of device private pages
+ * @migrate: structure for migrating the associated vma
  */
 struct hmm_range {
 	struct mmu_interval_notifier *notifier;
@@ -117,6 +130,7 @@ struct hmm_range {
 	unsigned long default_flags;
 	unsigned long pfn_flags_mask;
 	void *dev_private_owner;
+	struct migrate_vma *migrate;
 };
 
 /*
diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index d5af2b7f577b..425ab5242da0 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -3,6 +3,7 @@
 #define _LINUX_MIGRATE_H
 
 #include
+#include
 #include
 #include
 #include
@@ -200,7 +201,7 @@ void migrate_device_pages(unsigned long *src_pfns, unsigned long *dst_pfns,
 			unsigned long npages);
 void migrate_device_finalize(unsigned long *src_pfns,
 			unsigned long *dst_pfns, unsigned long npages);
-
+void migrate_hmm_range_setup(struct hmm_range *range);
 #endif /* CONFIG_MIGRATION */
 
 #endif /* _LINUX_MIGRATE_H */
diff --git a/mm/hmm.c b/mm/hmm.c
index f6c4ddff4bd6..44699e28c551 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -41,12 +41,6 @@ enum {
 	HMM_NEED_ALL_BITS = HMM_NEED_FAULT | HMM_NEED_WRITE_FAULT,
 };
 
-enum {
-	/* These flags are carried from input-to-output */
-	HMM_PFN_INOUT_FLAGS = HMM_PFN_DMA_MAPPED | HMM_PFN_P2PDMA |
-			      HMM_PFN_P2PDMA_BUS,
-};
-
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
 			 struct hmm_range *range, unsigned long cpu_flags)
 {
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index 8079676c8f1f..b320ea3736b4 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -1489,3 +1489,72 @@ int migrate_device_coherent_folio(struct folio *folio)
 		return 0;
 	return -EBUSY;
 }
+
+/**
+ * migrate_hmm_range_setup() - prepare to migrate a range of memory
+ * @range: contains pointer to migrate_vma to be populated
+ *
+ * When collecting happens by hmm_range_fault(), this populates
+ * the migrate->src[] and migrate->dst[] using range->hmm_pfns[].
+ * Also, migrate->cpages and migrate->npages get initialized.
+ */
+void migrate_hmm_range_setup(struct hmm_range *range)
+{
+	struct migrate_vma *migrate = range->migrate;
+
+	if (!migrate)
+		return;
+
+	migrate->npages = (migrate->end - migrate->start) >> PAGE_SHIFT;
+	migrate->cpages = 0;
+
+	for (unsigned long i = 0; i < migrate->npages; i++) {
+		unsigned long pfn = range->hmm_pfns[i];
+
+		pfn &= ~HMM_PFN_INOUT_FLAGS;
+
+		/*
+		 * Don't do migration if valid and migrate flags are not
+		 * both set.
+		 */
+		if ((pfn & (HMM_PFN_VALID | HMM_PFN_MIGRATE)) !=
+		    (HMM_PFN_VALID | HMM_PFN_MIGRATE)) {
+			migrate->src[i] = 0;
+			migrate->dst[i] = 0;
+			continue;
+		}
+
+		migrate->cpages++;
+
+		/*
+		 * The zero page is encoded in a special way, valid and
+		 * migrate is set, and pfn part is zero. Encode specially
+		 * for migrate also.
+		 */
+		if (pfn == (HMM_PFN_VALID | HMM_PFN_MIGRATE)) {
+			migrate->src[i] = MIGRATE_PFN_MIGRATE;
+			migrate->dst[i] = 0;
+			continue;
+		}
+		if (pfn == (HMM_PFN_VALID | HMM_PFN_MIGRATE | HMM_PFN_COMPOUND)) {
+			migrate->src[i] = MIGRATE_PFN_MIGRATE | MIGRATE_PFN_COMPOUND;
+			migrate->dst[i] = 0;
+			continue;
+		}
+
+		migrate->src[i] = migrate_pfn(page_to_pfn(hmm_pfn_to_page(pfn)))
+				  | MIGRATE_PFN_MIGRATE;
+		migrate->src[i] |= (pfn & HMM_PFN_WRITE) ? MIGRATE_PFN_WRITE : 0;
+		migrate->src[i] |= (pfn & HMM_PFN_COMPOUND) ? MIGRATE_PFN_COMPOUND : 0;
+		migrate->dst[i] = 0;
+	}
+
+	if (migrate->cpages)
+		migrate_vma_unmap(migrate);
+}
+EXPORT_SYMBOL(migrate_hmm_range_setup);
-- 
2.50.0

From nobody Tue Apr 7 04:46:06 2026
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, David Hildenbrand,
    Jason Gunthorpe, Leon Romanovsky, Alistair Popple, Balbir Singh,
    Zi Yan, Matthew Brost
Subject: [PATCH v6 3/6] mm/hmm: do the plumbing for HMM to participate in migration
Date: Mon, 16 Mar 2026 08:24:04 +0200
Message-ID: <20260316062407.3354636-4-mpenttil@redhat.com>
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>

From: Mika Penttilä

Do the preparations in hmm_range_fault() and the pagewalk callbacks for
the "collecting" part of migration, needed for migrate on fault. These
steps include taking the pmd/pte lock when migrating, capturing the vma
for further migrate actions, and calling the still-dummy
hmm_vma_handle_migrate_prepare_pmd() and hmm_vma_handle_migrate_prepare()
functions from the pagewalk.
Cc: David Hildenbrand
Cc: Jason Gunthorpe
Cc: Leon Romanovsky
Cc: Alistair Popple
Cc: Balbir Singh
Cc: Zi Yan
Cc: Matthew Brost
Suggested-by: Alistair Popple
Signed-off-by: Mika Penttilä
---
 include/linux/migrate.h |  17 +-
 lib/test_hmm.c          |   2 +-
 mm/hmm.c                | 423 +++++++++++++++++++++++++++++++++++-----
 3 files changed, 387 insertions(+), 55 deletions(-)

diff --git a/include/linux/migrate.h b/include/linux/migrate.h
index 425ab5242da0..037e7430edb9 100644
--- a/include/linux/migrate.h
+++ b/include/linux/migrate.h
@@ -104,6 +104,15 @@ static inline void softleaf_entry_wait_on_locked(softleaf_t entry, spinlock_t *ptl)
 	WARN_ON_ONCE(1);
 
 	spin_unlock(ptl);
+
+enum migrate_vma_info {
+	MIGRATE_VMA_SELECT_NONE     = 0,
+	MIGRATE_VMA_SELECT_COMPOUND = MIGRATE_VMA_SELECT_NONE,
+};
+
+static inline enum migrate_vma_info hmm_select_migrate(struct hmm_range *range)
+{
+	return MIGRATE_VMA_SELECT_NONE;
 }
 
 #endif /* CONFIG_MIGRATION */
@@ -149,7 +158,7 @@ static inline unsigned long migrate_pfn(unsigned long pfn)
 	return (pfn << MIGRATE_PFN_SHIFT) | MIGRATE_PFN_VALID;
 }
 
-enum migrate_vma_direction {
+enum migrate_vma_info {
 	MIGRATE_VMA_SELECT_SYSTEM          = 1 << 0,
 	MIGRATE_VMA_SELECT_DEVICE_PRIVATE  = 1 << 1,
 	MIGRATE_VMA_SELECT_DEVICE_COHERENT = 1 << 2,
@@ -191,6 +200,12 @@ struct migrate_vma {
 	struct page *fault_page;
 };
 
+// TODO: enable migration
+static inline enum migrate_vma_info hmm_select_migrate(struct hmm_range *range)
+{
+	return 0;
+}
+
 int migrate_vma_setup(struct migrate_vma *args);
 void migrate_vma_pages(struct migrate_vma *migrate);
 void migrate_vma_finalize(struct migrate_vma *migrate);
diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 0964d53365e6..01aa0b60df2f 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -145,7 +145,7 @@ static bool dmirror_is_private_zone(struct dmirror_device *mdevice)
 		HMM_DMIRROR_MEMORY_DEVICE_PRIVATE);
 }
 
-static enum migrate_vma_direction
+static enum migrate_vma_info
 dmirror_select_device(struct dmirror *dmirror)
 {
 	return (dmirror->mdevice->zone_device_type ==
diff --git a/mm/hmm.c b/mm/hmm.c
index 44699e28c551..c302de5b67d9 100644
--- a/mm/hmm.c
+++ b/mm/hmm.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -27,14 +28,44 @@
 #include
 #include
 #include
+#include
 
 #include "internal.h"
 
 struct hmm_vma_walk {
-	struct hmm_range	*range;
-	unsigned long		last;
+	struct mmu_notifier_range	mmu_range;
+	struct vm_area_struct		*vma;
+	struct hmm_range		*range;
+	unsigned long			start;
+	unsigned long			end;
+	unsigned long			last;
+	/*
+	 * For migration we need the pte/pmd locked for the handle_*
+	 * and prepare_* regions. While faulting we have to drop the
+	 * locks and start again. ptelocked and pmdlocked hold the
+	 * state and tell whether locks need to be dropped before
+	 * faulting. ptl is the lock held for the pte or pmd.
+	 */
+	bool				ptelocked;
+	bool				pmdlocked;
+	spinlock_t			*ptl;
 };
 
+#define HMM_ASSERT_PTE_LOCKED(hmm_vma_walk, locked) \
+	WARN_ON_ONCE(hmm_vma_walk->ptelocked != locked)
+
+#define HMM_ASSERT_PMD_LOCKED(hmm_vma_walk, locked) \
+	WARN_ON_ONCE(hmm_vma_walk->pmdlocked != locked)
+
+#define HMM_ASSERT_UNLOCKED(hmm_vma_walk) \
+	WARN_ON_ONCE(hmm_vma_walk->ptelocked || \
+		     hmm_vma_walk->pmdlocked)
+
 enum {
 	HMM_NEED_FAULT = 1 << 0,
 	HMM_NEED_WRITE_FAULT = 1 << 1,
@@ -42,14 +73,37 @@ enum {
 };
 
 static int hmm_pfns_fill(unsigned long addr, unsigned long end,
-			 struct hmm_range *range, unsigned long cpu_flags)
+			 struct hmm_vma_walk *hmm_vma_walk, unsigned long cpu_flags)
 {
+	struct hmm_range *range = hmm_vma_walk->range;
 	unsigned long i = (addr - range->start) >> PAGE_SHIFT;
+	enum migrate_vma_info minfo;
+	bool migrate = false;
+
+	minfo = hmm_select_migrate(range);
+	if (cpu_flags != HMM_PFN_ERROR) {
+		if (minfo && (vma_is_anonymous(hmm_vma_walk->vma))) {
+			cpu_flags |= (HMM_PFN_VALID | HMM_PFN_MIGRATE);
+			migrate = true;
+		}
+	}
+
+	if (migrate && thp_migration_supported() &&
+	    (minfo & MIGRATE_VMA_SELECT_COMPOUND) &&
+	    IS_ALIGNED(addr, HPAGE_PMD_SIZE) &&
+	    IS_ALIGNED(end, HPAGE_PMD_SIZE)) {
+		range->hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
+		range->hmm_pfns[i] |= cpu_flags | HMM_PFN_COMPOUND;
+		addr += PAGE_SIZE;
+		i++;
+		cpu_flags = 0;
+	}
 
 	for (; addr < end; addr += PAGE_SIZE, i++) {
 		range->hmm_pfns[i] &= HMM_PFN_INOUT_FLAGS;
 		range->hmm_pfns[i] |= cpu_flags;
 	}
+
 	return 0;
 }
 
@@ -72,6 +126,7 @@ static int hmm_vma_fault(unsigned long addr, unsigned long end,
 	unsigned int fault_flags = FAULT_FLAG_REMOTE;
 
 	WARN_ON_ONCE(!required_fault);
+	HMM_ASSERT_UNLOCKED(hmm_vma_walk);
 	hmm_vma_walk->last = addr;
 
 	if (required_fault & HMM_NEED_WRITE_FAULT) {
@@ -165,11 +220,11 @@ static int hmm_vma_walk_hole(unsigned long addr, unsigned long end,
 	if (!walk->vma) {
 		if (required_fault)
 			return -EFAULT;
-		return hmm_pfns_fill(addr, end, range, HMM_PFN_ERROR);
+		return hmm_pfns_fill(addr, end, hmm_vma_walk, HMM_PFN_ERROR);
 	}
 	if (required_fault)
 		return hmm_vma_fault(addr, end, required_fault, walk);
-	return hmm_pfns_fill(addr, end, range, 0);
+	return hmm_pfns_fill(addr, end, hmm_vma_walk, 0);
 }
 
 static inline unsigned long hmm_pfn_flags_order(unsigned long order)
@@ -202,8 +257,13 @@ static int hmm_vma_handle_pmd(struct mm_walk *walk, unsigned long addr,
 	cpu_flags = pmd_to_hmm_pfn_flags(range, pmd);
 	required_fault =
 		hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, cpu_flags);
-	if (required_fault)
+	if (required_fault) {
+		if (hmm_vma_walk->pmdlocked) {
+			spin_unlock(hmm_vma_walk->ptl);
+			hmm_vma_walk->pmdlocked = false;
+		}
 		return hmm_vma_fault(addr, end, required_fault, walk);
+	}
 
 	pfn = pmd_pfn(pmd) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
 	for (i = 0; addr < end; addr += PAGE_SIZE, i++, pfn++) {
@@ -283,14 +343,23 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 			goto fault;
 
 		if (softleaf_is_migration(entry)) {
-			pte_unmap(ptep);
-			hmm_vma_walk->last = addr;
-			migration_entry_wait(walk->mm, pmdp, addr);
-			return -EBUSY;
+			if (!hmm_select_migrate(range)) {
+				HMM_ASSERT_UNLOCKED(hmm_vma_walk);
+				hmm_vma_walk->last = addr;
+				migration_entry_wait(walk->mm, pmdp, addr);
+				return -EBUSY;
+			} else
+				goto out;
 		}
 
 		/* Report error for everything else */
-		pte_unmap(ptep);
+		if (hmm_vma_walk->ptelocked) {
+			pte_unmap_unlock(ptep, hmm_vma_walk->ptl);
+			hmm_vma_walk->ptelocked = false;
+		} else
+			pte_unmap(ptep);
+
 		return -EFAULT;
 	}
 
@@ -307,7 +376,12 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	if (!vm_normal_page(walk->vma, addr, pte) &&
 	    !is_zero_pfn(pte_pfn(pte))) {
 		if (hmm_pte_need_fault(hmm_vma_walk, pfn_req_flags, 0)) {
-			pte_unmap(ptep);
+			if (hmm_vma_walk->ptelocked) {
+				pte_unmap_unlock(ptep, hmm_vma_walk->ptl);
+				hmm_vma_walk->ptelocked = false;
+			} else
+				pte_unmap(ptep);
+
 			return -EFAULT;
 		}
 		new_pfn_flags = HMM_PFN_ERROR;
@@ -320,7 +394,11 @@ static int hmm_vma_handle_pte(struct mm_walk *walk, unsigned long addr,
 	return 0;
 
 fault:
-	pte_unmap(ptep);
+	if (hmm_vma_walk->ptelocked) {
+		pte_unmap_unlock(ptep, hmm_vma_walk->ptl);
+		hmm_vma_walk->ptelocked = false;
+	} else
+		pte_unmap(ptep);
 	/* Fault any virtual address we were asked to fault */
 	return hmm_vma_fault(addr, end, required_fault, walk);
 }
@@ -364,13 +442,18 @@ static int hmm_vma_handle_absent_pmd(struct mm_walk *walk, unsigned long start,
 	required_fault =
 		hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0);
 	if (required_fault) {
-		if (softleaf_is_device_private(entry))
+		if (softleaf_is_device_private(entry)) {
+			if (hmm_vma_walk->pmdlocked) {
+				spin_unlock(hmm_vma_walk->ptl);
+				hmm_vma_walk->pmdlocked = false;
+			}
 			return hmm_vma_fault(addr, end, required_fault, walk);
+		}
 		else
 			return -EFAULT;
 	}
 
-	return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
+	return hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR);
 }
 #else
 static int hmm_vma_handle_absent_pmd(struct mm_walk *walk, unsigned long
s= tart, @@ -378,15 +461,100 @@ static int hmm_vma_handle_absent_pmd(struct mm_walk = *walk, unsigned long start, pmd_t pmd) { struct hmm_vma_walk *hmm_vma_walk =3D walk->private; - struct hmm_range *range =3D hmm_vma_walk->range; unsigned long npages =3D (end - start) >> PAGE_SHIFT; =20 if (hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0)) return -EFAULT; - return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR); + return hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR); } #endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ =20 +#ifdef CONFIG_DEVICE_MIGRATION +static int hmm_vma_handle_migrate_prepare_pmd(const struct mm_walk *walk, + pmd_t *pmdp, + unsigned long start, + unsigned long end, + unsigned long *hmm_pfn) +{ + // TODO: implement migration entry insertion + return 0; +} + +static int hmm_vma_handle_migrate_prepare(const struct mm_walk *walk, + pmd_t *pmdp, + pte_t *pte, + unsigned long addr, + unsigned long *hmm_pfn) +{ + // TODO: implement migration entry insertion + return 0; +} + +static int hmm_vma_walk_split(pmd_t *pmdp, + unsigned long addr, + struct mm_walk *walk) +{ + // TODO : implement split + return 0; +} + +#else +static int hmm_vma_handle_migrate_prepare_pmd(const struct mm_walk *walk, + pmd_t *pmdp, + unsigned long start, + unsigned long end, + unsigned long *hmm_pfn) +{ + return 0; +} + +static int hmm_vma_handle_migrate_prepare(const struct mm_walk *walk, + pmd_t *pmdp, + pte_t *pte, + unsigned long addr, + unsigned long *hmm_pfn) +{ + return 0; +} + +static int hmm_vma_walk_split(pmd_t *pmdp, + unsigned long addr, + struct mm_walk *walk) +{ + return 0; +} +#endif + +static int hmm_vma_capture_migrate_range(unsigned long start, + unsigned long end, + struct mm_walk *walk) +{ + struct hmm_vma_walk *hmm_vma_walk =3D walk->private; + struct hmm_range *range =3D hmm_vma_walk->range; + + if (!hmm_select_migrate(range)) + return 0; + + if (hmm_vma_walk->vma && (hmm_vma_walk->vma !=3D walk->vma)) + return -ERANGE; + + hmm_vma_walk->vma =3D 
walk->vma; + hmm_vma_walk->start =3D start; + hmm_vma_walk->end =3D end; + + if (end - start > range->end - range->start) + return -ERANGE; + + if (!hmm_vma_walk->mmu_range.owner) { + mmu_notifier_range_init_owner(&hmm_vma_walk->mmu_range, MMU_NOTIFY_MIGRA= TE, 0, + walk->vma->vm_mm, start, end, + range->dev_private_owner); + mmu_notifier_invalidate_range_start(&hmm_vma_walk->mmu_range); + } + + return 0; +} + static int hmm_vma_walk_pmd(pmd_t *pmdp, unsigned long start, unsigned long end, @@ -394,46 +562,130 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, { struct hmm_vma_walk *hmm_vma_walk =3D walk->private; struct hmm_range *range =3D hmm_vma_walk->range; - unsigned long *hmm_pfns =3D - &range->hmm_pfns[(start - range->start) >> PAGE_SHIFT]; unsigned long npages =3D (end - start) >> PAGE_SHIFT; + struct mm_struct *mm =3D walk->vma->vm_mm; + enum migrate_vma_info minfo; unsigned long addr =3D start; + unsigned long *hmm_pfns; + unsigned long i; pte_t *ptep; pmd_t pmd; + int r =3D 0; + + minfo =3D hmm_select_migrate(range); =20 again: - pmd =3D pmdp_get_lockless(pmdp); - if (pmd_none(pmd)) - return hmm_vma_walk_hole(start, end, -1, walk); + hmm_pfns =3D &range->hmm_pfns[(addr - range->start) >> PAGE_SHIFT]; + hmm_vma_walk->ptelocked =3D false; + hmm_vma_walk->pmdlocked =3D false; + + if (minfo) { + hmm_vma_walk->ptl =3D pmd_lock(mm, pmdp); + hmm_vma_walk->pmdlocked =3D true; + pmd =3D pmdp_get(pmdp); + } else + pmd =3D pmdp_get_lockless(pmdp); + + if (pmd_none(pmd)) { + r =3D hmm_vma_walk_hole(start, end, -1, walk); + + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; + } + return r; + } =20 if (thp_migration_supported() && pmd_is_migration_entry(pmd)) { - if (hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0)) { + if (!minfo) { + if (hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0)) { + hmm_vma_walk->last =3D addr; + pmd_migration_entry_wait(walk->mm, pmdp); + return -EBUSY; + } + } + for (i =3D 0; addr 
< end; addr +=3D PAGE_SIZE, i++) + hmm_pfns[i] &=3D HMM_PFN_INOUT_FLAGS; + + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; + } + + return 0; + } + + if (pmd_trans_huge(pmd) || !pmd_present(pmd)) { + + if (!pmd_present(pmd)) { + r =3D hmm_vma_handle_absent_pmd(walk, start, end, hmm_pfns, + pmd); + // If not migrating we are done + if (r || !minfo) { + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; + } + return r; + } + } + + if (pmd_trans_huge(pmd)) { + + /* + * No need to take pmd_lock here if not migrating, + * even if some other thread is splitting the huge + * pmd we will get that event through mmu_notifier callback. + * + * So just read pmd value and check again it's a transparent + * huge or device mapping one and compute corresponding pfn + * values. + */ + + if (!minfo) { + pmd =3D pmdp_get_lockless(pmdp); + if (!pmd_trans_huge(pmd)) + goto again; + } + + r =3D hmm_vma_handle_pmd(walk, addr, end, hmm_pfns, pmd); + + // If not migrating we are done + if (r || !minfo) { + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; + } + return r; + } + } + + r =3D hmm_vma_handle_migrate_prepare_pmd(walk, pmdp, start, end, hmm_pfn= s); + + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; + } + + if (r =3D=3D -ENOENT) { + r =3D hmm_vma_walk_split(pmdp, addr, walk); + if (r) { + /* Split not successful, skip */ + return hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR); + } + + /* Split successful, reloop */ hmm_vma_walk->last =3D addr; - pmd_migration_entry_wait(walk->mm, pmdp); return -EBUSY; } - return hmm_pfns_fill(start, end, range, 0); - } =20 - if (!pmd_present(pmd)) - return hmm_vma_handle_absent_pmd(walk, start, end, hmm_pfns, - pmd); + return r; =20 - if (pmd_trans_huge(pmd)) { - /* - * No need to take pmd_lock here, even if some other thread - * 
is splitting the huge pmd we will get that event through - * mmu_notifier callback. - * - * So just read pmd value and check again it's a transparent - * huge or device mapping one and compute corresponding pfn - * values. - */ - pmd =3D pmdp_get_lockless(pmdp); - if (!pmd_trans_huge(pmd)) - goto again; + } =20 - return hmm_vma_handle_pmd(walk, addr, end, hmm_pfns, pmd); + if (hmm_vma_walk->pmdlocked) { + spin_unlock(hmm_vma_walk->ptl); + hmm_vma_walk->pmdlocked =3D false; } =20 /* @@ -445,22 +697,43 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp, if (pmd_bad(pmd)) { if (hmm_range_need_fault(hmm_vma_walk, hmm_pfns, npages, 0)) return -EFAULT; - return hmm_pfns_fill(start, end, range, HMM_PFN_ERROR); + return hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR); } =20 - ptep =3D pte_offset_map(pmdp, addr); + if (minfo) { + ptep =3D pte_offset_map_lock(mm, pmdp, addr, &hmm_vma_walk->ptl); + if (ptep) + hmm_vma_walk->ptelocked =3D true; + } else + ptep =3D pte_offset_map(pmdp, addr); if (!ptep) goto again; + for (; addr < end; addr +=3D PAGE_SIZE, ptep++, hmm_pfns++) { - int r; =20 r =3D hmm_vma_handle_pte(walk, addr, end, pmdp, ptep, hmm_pfns); if (r) { - /* hmm_vma_handle_pte() did pte_unmap() */ + /* hmm_vma_handle_pte() did pte_unmap() / pte_unmap_unlock */ return r; } + + r =3D hmm_vma_handle_migrate_prepare(walk, pmdp, ptep, addr, hmm_pfns); + if (r =3D=3D -EAGAIN) { + HMM_ASSERT_UNLOCKED(hmm_vma_walk); + goto again; + } + if (r) { + hmm_pfns_fill(addr, end, hmm_vma_walk, HMM_PFN_ERROR); + break; + } } - pte_unmap(ptep - 1); + + if (hmm_vma_walk->ptelocked) { + pte_unmap_unlock(ptep - 1, hmm_vma_walk->ptl); + hmm_vma_walk->ptelocked =3D false; + } else + pte_unmap(ptep - 1); + return 0; } =20 @@ -594,6 +867,11 @@ static int hmm_vma_walk_test(unsigned long start, unsi= gned long end, struct hmm_vma_walk *hmm_vma_walk =3D walk->private; struct hmm_range *range =3D hmm_vma_walk->range; struct vm_area_struct *vma =3D walk->vma; + int r; + + r =3D 
hmm_vma_capture_migrate_range(start, end, walk);
+	if (r)
+		return r;
 
 	if (!(vma->vm_flags & (VM_IO | VM_PFNMAP)) &&
 	    vma->vm_flags & VM_READ)
@@ -616,7 +894,7 @@ static int hmm_vma_walk_test(unsigned long start, unsigned long end,
 				(end - start) >> PAGE_SHIFT, 0))
 		return -EFAULT;
 
-	hmm_pfns_fill(start, end, range, HMM_PFN_ERROR);
+	hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR);
 
 	/* Skip this vma and continue processing the next vma. */
 	return 1;
@@ -646,9 +924,17 @@ static const struct mm_walk_ops hmm_walk_ops = {
  *		the invalidation to finish.
  * -EFAULT:	A page was requested to be valid and could not be made valid
  *		ie it has no backing VMA or it is illegal to access
+ * -ERANGE:	The range crosses multiple VMAs, or the hmm_pfns array is
+ *		too small for the range.
  *
  * This is similar to get_user_pages(), except that it can read the page tables
  * without mutating them (ie causing faults).
+ *
+ * To migrate after faulting, call hmm_range_fault() with
+ * HMM_PFN_REQ_MIGRATE and initialize the range.migrate field.
+ * After hmm_range_fault(), call migrate_hmm_range_setup() instead of
+ * migrate_vma_setup(), and from there follow the normal migrate call path.
+ *
  */
 int hmm_range_fault(struct hmm_range *range)
 {
@@ -656,16 +942,34 @@ int hmm_range_fault(struct hmm_range *range)
 		.range = range,
 		.last = range->start,
 	};
-	struct mm_struct *mm = range->notifier->mm;
+	struct mm_struct *mm;
+	bool is_fault_path;
 	int ret;
 
+	/*
+	 * Could be serving a device fault or come from the migrate
+	 * entry point. For the former the vma has not been resolved
+	 * yet; for the latter there is no notifier (but there is a vma).
+	 */
+#ifdef CONFIG_DEVICE_MIGRATION
+	is_fault_path = !!range->notifier;
+	mm = is_fault_path ? range->notifier->mm : range->migrate->vma->vm_mm;
+#else
+	is_fault_path = true;
+	mm = range->notifier->mm;
+#endif
 	mmap_assert_locked(mm);
 
 	do {
 		/* If range is no longer valid force retry. */
-		if (mmu_interval_check_retry(range->notifier,
-					     range->notifier_seq))
-			return -EBUSY;
+		if (is_fault_path && mmu_interval_check_retry(range->notifier,
+					     range->notifier_seq)) {
+			ret = -EBUSY;
+			break;
+		}
+
 		ret = walk_page_range(mm, hmm_vma_walk.last, range->end,
 				      &hmm_walk_ops, &hmm_vma_walk);
 		/*
@@ -675,6 +979,19 @@ int hmm_range_fault(struct hmm_range *range)
 		 * output, and all >= are still at their input values.
 		 */
 	} while (ret == -EBUSY);
+
+#ifdef CONFIG_DEVICE_MIGRATION
+	if (hmm_select_migrate(range) && range->migrate &&
+	    hmm_vma_walk.mmu_range.owner) {
+		// The migrate_vma path has the following initialized
+		if (is_fault_path) {
+			range->migrate->vma = hmm_vma_walk.vma;
+			range->migrate->start = range->start;
+			range->migrate->end = hmm_vma_walk.end;
+		}
+		mmu_notifier_invalidate_range_end(&hmm_vma_walk.mmu_range);
+	}
+#endif
	return ret;
 }
 EXPORT_SYMBOL(hmm_range_fault);
-- 
2.50.0
From nobody Tue Apr 7 04:46:06 2026
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, David Hildenbrand,
 Jason Gunthorpe, Leon Romanovsky, Alistair Popple, Balbir Singh,
 Zi Yan, Matthew Brost
Subject: [PATCH v6 4/6] mm: setup device page migration in HMM pagewalk
Date: Mon, 16 Mar 2026 08:24:05 +0200
Message-ID: <20260316062407.3354636-5-mpenttil@redhat.com>
X-Mailer: git-send-email 2.50.0
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

From: Mika Penttilä

Implement the needed hmm_vma_handle_migrate_prepare_pmd() and
hmm_vma_handle_migrate_prepare() functions, which are mostly carried
over from migrate_device.c, as well as the needed split functions.
Make migrate_device use the HMM pagewalk for the collect part of
migration.
Cc: David Hildenbrand Cc: Jason Gunthorpe Cc: Leon Romanovsky Cc: Alistair Popple Cc: Balbir Singh Cc: Zi Yan Cc: Matthew Brost Suggested-by: Alistair Popple Signed-off-by: Mika Penttil=C3=A4 --- include/linux/migrate.h | 9 +- mm/hmm.c | 420 ++++++++++++++++++++++++++++++++++++++-- mm/migrate_device.c | 26 ++- 3 files changed, 438 insertions(+), 17 deletions(-) diff --git a/include/linux/migrate.h b/include/linux/migrate.h index 037e7430edb9..9e1081847d1f 100644 --- a/include/linux/migrate.h +++ b/include/linux/migrate.h @@ -163,6 +163,7 @@ enum migrate_vma_info { MIGRATE_VMA_SELECT_DEVICE_PRIVATE =3D 1 << 1, MIGRATE_VMA_SELECT_DEVICE_COHERENT =3D 1 << 2, MIGRATE_VMA_SELECT_COMPOUND =3D 1 << 3, + MIGRATE_VMA_FAULT =3D 1 << 4, }; =20 struct migrate_vma { @@ -200,10 +201,14 @@ struct migrate_vma { struct page *fault_page; }; =20 -// TODO: enable migration static inline enum migrate_vma_info hmm_select_migrate(struct hmm_range *r= ange) { - return 0; + enum migrate_vma_info minfo; + + minfo =3D (range->default_flags & HMM_PFN_REQ_MIGRATE) ? + range->migrate->flags : 0; + + return minfo; } =20 int migrate_vma_setup(struct migrate_vma *args); diff --git a/mm/hmm.c b/mm/hmm.c index c302de5b67d9..69d88fe16882 100644 --- a/mm/hmm.c +++ b/mm/hmm.c @@ -470,34 +470,424 @@ static int hmm_vma_handle_absent_pmd(struct mm_walk = *walk, unsigned long start, #endif /* CONFIG_ARCH_ENABLE_THP_MIGRATION */ =20 #ifdef CONFIG_DEVICE_MIGRATION +/** + * migrate_vma_split_folio() - Helper function to split a THP folio + * @folio: the folio to split + * @fault_page: struct page associated with the fault if any + * + * Returns 0 on success + */ +static int migrate_vma_split_folio(struct folio *folio, + struct page *fault_page) +{ + int ret; + struct folio *fault_folio =3D fault_page ? 
page_folio(fault_page) : NULL; + struct folio *new_fault_folio =3D NULL; + + if (folio !=3D fault_folio) { + folio_get(folio); + folio_lock(folio); + } + + ret =3D split_folio(folio); + if (ret) { + if (folio !=3D fault_folio) { + folio_unlock(folio); + folio_put(folio); + } + return ret; + } + + new_fault_folio =3D fault_page ? page_folio(fault_page) : NULL; + + /* + * Ensure the lock is held on the correct + * folio after the split + */ + if (!new_fault_folio) { + folio_unlock(folio); + folio_put(folio); + } else if (folio !=3D new_fault_folio) { + if (new_fault_folio !=3D fault_folio) { + folio_get(new_fault_folio); + folio_lock(new_fault_folio); + } + folio_unlock(folio); + folio_put(folio); + } + + return 0; +} + static int hmm_vma_handle_migrate_prepare_pmd(const struct mm_walk *walk, pmd_t *pmdp, unsigned long start, unsigned long end, unsigned long *hmm_pfn) { - // TODO: implement migration entry insertion - return 0; + struct hmm_vma_walk *hmm_vma_walk =3D walk->private; + struct hmm_range *range =3D hmm_vma_walk->range; + struct migrate_vma *migrate =3D range->migrate; + struct folio *fault_folio =3D NULL; + struct folio *folio; + enum migrate_vma_info minfo; + unsigned long i; + int r =3D 0; + + minfo =3D hmm_select_migrate(range); + if (!minfo) + return r; + + WARN_ON_ONCE(!migrate); + HMM_ASSERT_PMD_LOCKED(hmm_vma_walk, true); + + fault_folio =3D migrate->fault_page ? 
+ page_folio(migrate->fault_page) : NULL; + + if (pmd_none(*pmdp)) + return hmm_pfns_fill(start, end, hmm_vma_walk, 0); + + if (!(hmm_pfn[0] & HMM_PFN_VALID)) + goto out; + + if (pmd_trans_huge(*pmdp)) { + if (!(minfo & MIGRATE_VMA_SELECT_SYSTEM)) + goto out; + + folio =3D pmd_folio(*pmdp); + if (is_huge_zero_folio(folio)) + return hmm_pfns_fill(start, end, hmm_vma_walk, 0); + + } else if (!pmd_present(*pmdp)) { + const softleaf_t entry =3D softleaf_from_pmd(*pmdp); + + folio =3D softleaf_to_folio(entry); + + if (!softleaf_is_device_private(entry)) + goto out; + + if (!(minfo & MIGRATE_VMA_SELECT_DEVICE_PRIVATE)) + goto out; + + if (folio->pgmap->owner !=3D migrate->pgmap_owner) + goto out; + + } else { + hmm_vma_walk->last =3D start; + return -EBUSY; + } + + folio_get(folio); + + if (folio !=3D fault_folio && unlikely(!folio_trylock(folio))) { + folio_put(folio); + hmm_pfns_fill(start, end, hmm_vma_walk, HMM_PFN_ERROR); + return 0; + } + + if (thp_migration_supported() && + (migrate->flags & MIGRATE_VMA_SELECT_COMPOUND) && + (IS_ALIGNED(start, HPAGE_PMD_SIZE) && + IS_ALIGNED(end, HPAGE_PMD_SIZE))) { + + struct page_vma_mapped_walk pvmw =3D { + .ptl =3D hmm_vma_walk->ptl, + .address =3D start, + .pmd =3D pmdp, + .vma =3D walk->vma, + }; + + hmm_pfn[0] |=3D HMM_PFN_MIGRATE | HMM_PFN_COMPOUND; + + r =3D set_pmd_migration_entry(&pvmw, folio_page(folio, 0)); + if (r) { + hmm_pfn[0] &=3D ~(HMM_PFN_MIGRATE | HMM_PFN_COMPOUND); + r =3D -ENOENT; // fallback + goto unlock_out; + } + for (i =3D 1, start +=3D PAGE_SIZE; start < end; start +=3D PAGE_SIZE, i= ++) + hmm_pfn[i] &=3D HMM_PFN_INOUT_FLAGS; + + } else { + r =3D -ENOENT; // fallback + goto unlock_out; + } + + +out: + return r; + +unlock_out: + if (folio !=3D fault_folio) + folio_unlock(folio); + folio_put(folio); + goto out; } =20 +/* + * Install migration entries if migration requested, either from fault + * or migrate paths. 
+ * + */ static int hmm_vma_handle_migrate_prepare(const struct mm_walk *walk, pmd_t *pmdp, - pte_t *pte, + pte_t *ptep, unsigned long addr, - unsigned long *hmm_pfn) + unsigned long *hmm_pfn, + bool *unmapped) { - // TODO: implement migration entry insertion + struct hmm_vma_walk *hmm_vma_walk =3D walk->private; + struct hmm_range *range =3D hmm_vma_walk->range; + struct migrate_vma *migrate =3D range->migrate; + struct mm_struct *mm =3D walk->vma->vm_mm; + struct folio *fault_folio =3D NULL; + enum migrate_vma_info minfo; + struct dev_pagemap *pgmap; + bool anon_exclusive; + struct folio *folio; + unsigned long pfn; + struct page *page; + softleaf_t entry; + pte_t pte, swp_pte; + bool writable =3D false; + + // Do we want to migrate at all? + minfo =3D hmm_select_migrate(range); + if (!minfo) + return 0; + + WARN_ON_ONCE(!migrate); + HMM_ASSERT_PTE_LOCKED(hmm_vma_walk, true); + + fault_folio =3D migrate->fault_page ? + page_folio(migrate->fault_page) : NULL; + + pte =3D ptep_get(ptep); + + if (pte_none(pte)) { + // migrate without faulting case + if (vma_is_anonymous(walk->vma)) { + *hmm_pfn &=3D HMM_PFN_INOUT_FLAGS; + *hmm_pfn |=3D HMM_PFN_MIGRATE | HMM_PFN_VALID; + goto out; + } + } + + if (!(hmm_pfn[0] & HMM_PFN_VALID)) + goto out; + + if (!pte_present(pte)) { + /* + * Only care about unaddressable device page special + * page table entry. Other special swap entries are not + * migratable, and we ignore regular swapped page. 
+ */ + entry =3D softleaf_from_pte(pte); + if (!softleaf_is_device_private(entry)) + goto out; + + if (!(minfo & MIGRATE_VMA_SELECT_DEVICE_PRIVATE)) + goto out; + + page =3D softleaf_to_page(entry); + folio =3D page_folio(page); + if (folio->pgmap->owner !=3D migrate->pgmap_owner) + goto out; + + if (folio_test_large(folio)) { + int ret; + + pte_unmap_unlock(ptep, hmm_vma_walk->ptl); + hmm_vma_walk->ptelocked =3D false; + ret =3D migrate_vma_split_folio(folio, + migrate->fault_page); + if (ret) + goto out_error; + return -EAGAIN; + } + + pfn =3D page_to_pfn(page); + if (softleaf_is_device_private_write(entry)) + writable =3D true; + } else { + pfn =3D pte_pfn(pte); + if (is_zero_pfn(pfn) && + (minfo & MIGRATE_VMA_SELECT_SYSTEM)) { + *hmm_pfn =3D HMM_PFN_MIGRATE|HMM_PFN_VALID; + goto out; + } + page =3D vm_normal_page(walk->vma, addr, pte); + if (page && !is_zone_device_page(page) && + !(minfo & MIGRATE_VMA_SELECT_SYSTEM)) { + goto out; + } else if (page && is_device_coherent_page(page)) { + pgmap =3D page_pgmap(page); + + if (!(minfo & + MIGRATE_VMA_SELECT_DEVICE_COHERENT) || + pgmap->owner !=3D migrate->pgmap_owner) + goto out; + } + + folio =3D page ? page_folio(page) : NULL; + if (folio && folio_test_large(folio)) { + int ret; + + pte_unmap_unlock(ptep, hmm_vma_walk->ptl); + hmm_vma_walk->ptelocked =3D false; + + ret =3D migrate_vma_split_folio(folio, + migrate->fault_page); + if (ret) + goto out_error; + return -EAGAIN; + } + + writable =3D pte_write(pte); + } + + if (!page || !page->mapping) + goto out; + + /* + * By getting a reference on the folio we pin it and that blocks + * any kind of migration. Side effect is that it "freezes" the + * pte. + * + * We drop this reference after isolating the folio from the lru + * for non device folio (device folio are not on the lru and thus + * can't be dropped from it). 
+ */ + folio =3D page_folio(page); + folio_get(folio); + + /* + * We rely on folio_trylock() to avoid deadlock between + * concurrent migrations where each is waiting on the others + * folio lock. If we can't immediately lock the folio we fail this + * migration as it is only best effort anyway. + * + * If we can lock the folio it's safe to set up a migration entry + * now. In the common case where the folio is mapped once in a + * single process setting up the migration entry now is an + * optimisation to avoid walking the rmap later with + * try_to_migrate(). + */ + + if (fault_folio =3D=3D folio || folio_trylock(folio)) { + anon_exclusive =3D folio_test_anon(folio) && + PageAnonExclusive(page); + + flush_cache_page(walk->vma, addr, pfn); + + if (anon_exclusive) { + pte =3D ptep_clear_flush(walk->vma, addr, ptep); + + if (folio_try_share_anon_rmap_pte(folio, page)) { + set_pte_at(mm, addr, ptep, pte); + folio_unlock(folio); + folio_put(folio); + goto out; + } + } else { + pte =3D ptep_get_and_clear(mm, addr, ptep); + } + + if (pte_dirty(pte)) + folio_mark_dirty(folio); + + /* Setup special migration page table entry */ + if (writable) + entry =3D make_writable_migration_entry(pfn); + else if (anon_exclusive) + entry =3D make_readable_exclusive_migration_entry(pfn); + else + entry =3D make_readable_migration_entry(pfn); + + if (pte_present(pte)) { + if (pte_young(pte)) + entry =3D make_migration_entry_young(entry); + if (pte_dirty(pte)) + entry =3D make_migration_entry_dirty(entry); + } + + swp_pte =3D swp_entry_to_pte(entry); + if (pte_present(pte)) { + if (pte_soft_dirty(pte)) + swp_pte =3D pte_swp_mksoft_dirty(swp_pte); + if (pte_uffd_wp(pte)) + swp_pte =3D pte_swp_mkuffd_wp(swp_pte); + } else { + if (pte_swp_soft_dirty(pte)) + swp_pte =3D pte_swp_mksoft_dirty(swp_pte); + if (pte_swp_uffd_wp(pte)) + swp_pte =3D pte_swp_mkuffd_wp(swp_pte); + } + + set_pte_at(mm, addr, ptep, swp_pte); + folio_remove_rmap_pte(folio, page, walk->vma); + folio_put(folio); + *hmm_pfn 
 			|= HMM_PFN_MIGRATE;
+
+		if (pte_present(pte))
+			*unmapped = true;
+	} else
+		folio_put(folio);
+out:
 	return 0;
+out_error:
+	return -EFAULT;
 }
 
 static int hmm_vma_walk_split(pmd_t *pmdp,
 			      unsigned long addr,
 			      struct mm_walk *walk)
 {
-	// TODO : implement split
-	return 0;
-}
+	struct hmm_vma_walk *hmm_vma_walk = walk->private;
+	struct hmm_range *range = hmm_vma_walk->range;
+	struct migrate_vma *migrate = range->migrate;
+	struct folio *folio, *fault_folio;
+	spinlock_t *ptl;
+	int ret = 0;
 
+	HMM_ASSERT_UNLOCKED(hmm_vma_walk);
+
+	fault_folio = (migrate && migrate->fault_page) ?
+		page_folio(migrate->fault_page) : NULL;
+
+	ptl = pmd_lock(walk->mm, pmdp);
+	if (unlikely(!pmd_trans_huge(*pmdp))) {
+		spin_unlock(ptl);
+		goto out;
+	}
+
+	folio = pmd_folio(*pmdp);
+	if (is_huge_zero_folio(folio)) {
+		spin_unlock(ptl);
+		split_huge_pmd(walk->vma, pmdp, addr);
+	} else {
+		folio_get(folio);
+		spin_unlock(ptl);
+
+		if (folio != fault_folio) {
+			if (unlikely(!folio_trylock(folio))) {
+				folio_put(folio);
+				ret = -EBUSY;
+				goto out;
+			}
+		} else
+			folio_put(folio);
+
+		ret = split_folio(folio);
+		if (fault_folio != folio) {
+			folio_unlock(folio);
+			folio_put(folio);
+		}
+
+	}
+out:
+	return ret;
+}
 #else
 static int hmm_vma_handle_migrate_prepare_pmd(const struct mm_walk *walk,
 					      pmd_t *pmdp,
@@ -512,7 +902,8 @@ static int hmm_vma_handle_migrate_prepare(const struct mm_walk *walk,
 					  pmd_t *pmdp,
 					  pte_t *pte,
 					  unsigned long addr,
-					  unsigned long *hmm_pfn)
+					  unsigned long *hmm_pfn,
+					  bool *unmapped)
 {
 	return 0;
 }
@@ -567,6 +958,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 	enum migrate_vma_info minfo;
 	unsigned long addr = start;
 	unsigned long *hmm_pfns;
+	bool unmapped = false;
 	unsigned long i;
 	pte_t *ptep;
 	pmd_t pmd;
@@ -648,7 +1040,7 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		goto again;
 	}
 
-	r = hmm_vma_handle_pmd(walk, addr, end, hmm_pfns, pmd);
+	r = hmm_vma_handle_pmd(walk, start, end, hmm_pfns, pmd);
 
 	// If not migrating we are done
 	if (r || !minfo) {
@@ -717,9 +1109,13 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 		return r;
 	}
 
-	r = hmm_vma_handle_migrate_prepare(walk, pmdp, ptep, addr, hmm_pfns);
+	r = hmm_vma_handle_migrate_prepare(walk, pmdp, ptep, addr, hmm_pfns, &unmapped);
 	if (r == -EAGAIN) {
 		HMM_ASSERT_UNLOCKED(hmm_vma_walk);
+		if (unmapped) {
+			flush_tlb_range(walk->vma, start, addr);
+			unmapped = false;
+		}
 		goto again;
 	}
 	if (r) {
@@ -727,6 +1123,8 @@ static int hmm_vma_walk_pmd(pmd_t *pmdp,
 			break;
 		}
 	}
+	if (unmapped)
+		flush_tlb_range(walk->vma, start, addr);
 
 	if (hmm_vma_walk->ptelocked) {
 		pte_unmap_unlock(ptep - 1, hmm_vma_walk->ptl);
diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index b320ea3736b4..cef9b644d31f 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -734,7 +734,17 @@ static void migrate_vma_unmap(struct migrate_vma *migrate)
  */
 int migrate_vma_setup(struct migrate_vma *args)
 {
+	int ret;
 	long nr_pages = (args->end - args->start) >> PAGE_SHIFT;
+	struct hmm_range range = {
+		.notifier = NULL,
+		.start = args->start,
+		.end = args->end,
+		.hmm_pfns = args->src,
+		.dev_private_owner = args->pgmap_owner,
+		.migrate = args,
+		.default_flags = HMM_PFN_REQ_MIGRATE
+	};
 
 	args->start &= PAGE_MASK;
 	args->end &= PAGE_MASK;
@@ -759,17 +769,25 @@ int migrate_vma_setup(struct migrate_vma *args)
 	args->cpages = 0;
 	args->npages = 0;
 
-	migrate_vma_collect(args);
+	if (args->flags & MIGRATE_VMA_FAULT)
+		range.default_flags |= HMM_PFN_REQ_FAULT;
+
+	ret = hmm_range_fault(&range);
 
-	if (args->cpages)
-		migrate_vma_unmap(args);
+	migrate_hmm_range_setup(&range);
+
+	/* Remove migration PTEs */
+	if (ret) {
+		migrate_vma_pages(args);
+		migrate_vma_finalize(args);
+	}
 
 	/*
 	 * At this point pages are locked and unmapped, and thus they have
 	 * stable content and can safely be copied to destination memory that
 	 * is allocated by the drivers.
 	 */
-	return 0;
+	return ret;
 
 }
 EXPORT_SYMBOL(migrate_vma_setup);
-- 
2.50.0

From nobody Tue Apr 7 04:46:06 2026
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, David Hildenbrand,
	Jason Gunthorpe, Leon Romanovsky, Alistair Popple, Balbir Singh,
	Zi Yan, Matthew Brost, Marco Pagani
Subject: [PATCH v6 5/6] mm: add new testcase for the migrate on fault case
Date: Mon, 16 Mar 2026 08:24:06 +0200
Message-ID: <20260316062407.3354636-6-mpenttil@redhat.com>
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>

From: Mika Penttilä
Cc: David Hildenbrand
Cc: Jason Gunthorpe
Cc: Leon Romanovsky
Cc: Alistair Popple
Cc: Balbir Singh
Cc: Zi Yan
Cc: Matthew Brost
Signed-off-by: Marco Pagani
Signed-off-by: Mika Penttilä
---
 lib/test_hmm.c                         | 99 ++++++++++++++++++++++++++
 lib/test_hmm_uapi.h                    | 19 ++---
 tools/testing/selftests/mm/hmm-tests.c | 54 ++++++++++++++
 3 files changed, 163 insertions(+), 9 deletions(-)

diff --git a/lib/test_hmm.c b/lib/test_hmm.c
index 01aa0b60df2f..5ddcec056dfb 100644
--- a/lib/test_hmm.c
+++ b/lib/test_hmm.c
@@ -36,6 +36,7 @@
 #define DMIRROR_RANGE_FAULT_TIMEOUT	1000
 #define DEVMEM_CHUNK_SIZE		(256 * 1024 * 1024U)
 #define DEVMEM_CHUNKS_RESERVE		16
+#define PFNS_ARRAY_SIZE			64
 
 /*
  * For device_private pages, dpage is just a dummy struct page
@@ -1196,6 +1197,100 @@ static int dmirror_migrate_to_device(struct dmirror *dmirror,
 	return ret;
 }
 
+static int do_fault_and_migrate(struct dmirror *dmirror, struct hmm_range *range)
+{
+	struct migrate_vma *migrate = range->migrate;
+	int ret;
+
+	mmap_read_lock(dmirror->notifier.mm);
+
+	/* Fault-in pages for migration and update device page table */
+	ret = dmirror_range_fault(dmirror, range);
+
+	pr_debug("Migrating from sys mem to device mem\n");
+	migrate_hmm_range_setup(range);
+
+	dmirror_migrate_alloc_and_copy(migrate, dmirror);
+	migrate_vma_pages(migrate);
+	dmirror_migrate_finalize_and_map(migrate, dmirror);
+	migrate_vma_finalize(migrate);
+
+	mmap_read_unlock(dmirror->notifier.mm);
+	return ret;
+}
+
+static int dmirror_fault_and_migrate_to_device(struct dmirror *dmirror,
+					       struct hmm_dmirror_cmd *cmd)
+{
+	unsigned long start, size, end, next;
+	unsigned long src_pfns[PFNS_ARRAY_SIZE] = { 0 };
+	unsigned long dst_pfns[PFNS_ARRAY_SIZE] = { 0 };
+	struct migrate_vma migrate = { 0 };
+	struct hmm_range range = { 0 };
+	struct dmirror_bounce bounce;
+	int ret = 0;
+
+	/* Whole range */
+	start = cmd->addr;
+	size = cmd->npages << PAGE_SHIFT;
+	end = start + size;
+
+	if (!mmget_not_zero(dmirror->notifier.mm)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	migrate.pgmap_owner = dmirror->mdevice;
+	migrate.src = src_pfns;
+	migrate.dst = dst_pfns;
+	migrate.flags = MIGRATE_VMA_SELECT_SYSTEM;
+
+	range.migrate = &migrate;
+	range.hmm_pfns = src_pfns;
+	range.pfn_flags_mask = 0;
+	range.default_flags = HMM_PFN_REQ_FAULT | HMM_PFN_REQ_MIGRATE;
+	range.dev_private_owner = dmirror->mdevice;
+	range.notifier = &dmirror->notifier;
+
+	for (next = start; next < end; next = range.end) {
+		range.start = next;
+		range.end = min(end, next + (PFNS_ARRAY_SIZE << PAGE_SHIFT));
+
+		pr_debug("Fault and migrate range start:%#lx end:%#lx\n",
+			 range.start, range.end);
+
+		ret = do_fault_and_migrate(dmirror, &range);
+		if (ret)
+			goto out_mmput;
+	}
+
+	/*
+	 * Return the migrated data for verification.
+	 * Only for pages in device zone
+	 */
+	ret = dmirror_bounce_init(&bounce, start, size);
+	if (ret)
+		goto out_mmput;
+
+	mutex_lock(&dmirror->mutex);
+	ret = dmirror_do_read(dmirror, start, end, &bounce);
+	mutex_unlock(&dmirror->mutex);
+	if (ret == 0) {
+		ret = copy_to_user(u64_to_user_ptr(cmd->ptr), bounce.ptr, bounce.size);
+		if (ret)
+			ret = -EFAULT;
+	}
+
+	cmd->cpages = bounce.cpages;
+	dmirror_bounce_fini(&bounce);
+
+
+out_mmput:
+	mmput(dmirror->notifier.mm);
+out:
+	return ret;
+}
+
 static void dmirror_mkentry(struct dmirror *dmirror, struct hmm_range *range,
 			    unsigned char *perm, unsigned long entry)
 {
@@ -1512,6 +1607,10 @@ static long dmirror_fops_unlocked_ioctl(struct file *filp,
 		ret = dmirror_migrate_to_device(dmirror, &cmd);
 		break;
 
+	case HMM_DMIRROR_MIGRATE_ON_FAULT_TO_DEV:
+		ret = dmirror_fault_and_migrate_to_device(dmirror, &cmd);
+		break;
+
 	case HMM_DMIRROR_MIGRATE_TO_SYS:
 		ret = dmirror_migrate_to_system(dmirror, &cmd);
 		break;
diff --git a/lib/test_hmm_uapi.h b/lib/test_hmm_uapi.h
index f94c6d457338..0b6e7a419e36 100644
--- a/lib/test_hmm_uapi.h
+++ b/lib/test_hmm_uapi.h
@@ -29,15 +29,16 @@ struct hmm_dmirror_cmd {
 };
 
 /* Expose the address space of the calling process through hmm device file */
-#define HMM_DMIRROR_READ			_IOWR('H', 0x00, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_WRITE			_IOWR('H', 0x01, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_MIGRATE_TO_DEV		_IOWR('H', 0x02, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_MIGRATE_TO_SYS		_IOWR('H', 0x03, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_SNAPSHOT			_IOWR('H', 0x04, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_EXCLUSIVE			_IOWR('H', 0x05, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_CHECK_EXCLUSIVE		_IOWR('H', 0x06, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_RELEASE			_IOWR('H', 0x07, struct hmm_dmirror_cmd)
-#define HMM_DMIRROR_FLAGS			_IOWR('H', 0x08, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_READ			_IOWR('H', 0x00, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_WRITE			_IOWR('H', 0x01, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_DEV		_IOWR('H', 0x02, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_ON_FAULT_TO_DEV	_IOWR('H', 0x03, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_MIGRATE_TO_SYS		_IOWR('H', 0x04, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_SNAPSHOT			_IOWR('H', 0x05, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_EXCLUSIVE			_IOWR('H', 0x06, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_CHECK_EXCLUSIVE		_IOWR('H', 0x07, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_RELEASE			_IOWR('H', 0x08, struct hmm_dmirror_cmd)
+#define HMM_DMIRROR_FLAGS			_IOWR('H', 0x09, struct hmm_dmirror_cmd)
 
 #define HMM_DMIRROR_FLAG_FAIL_ALLOC	(1ULL << 0)
 
diff --git a/tools/testing/selftests/mm/hmm-tests.c b/tools/testing/selftests/mm/hmm-tests.c
index e8328c89d855..c75616875c9e 100644
--- a/tools/testing/selftests/mm/hmm-tests.c
+++ b/tools/testing/selftests/mm/hmm-tests.c
@@ -277,6 +277,13 @@ static int hmm_migrate_sys_to_dev(int fd,
 	return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_TO_DEV, buffer, npages);
 }
 
+static int hmm_migrate_on_fault_sys_to_dev(int fd,
+					   struct hmm_buffer *buffer,
+					   unsigned long npages)
+{
+	return hmm_dmirror_cmd(fd, HMM_DMIRROR_MIGRATE_ON_FAULT_TO_DEV, buffer, npages);
+}
+
 static int hmm_migrate_dev_to_sys(int fd,
 				  struct hmm_buffer *buffer,
 				  unsigned long npages)
@@ -1034,6 +1041,53 @@ TEST_F(hmm, migrate)
 	hmm_buffer_free(buffer);
 }
 
+
+/*
+ * Fault and migrate anonymous memory to device private memory.
+ */
+TEST_F(hmm, migrate_on_fault)
+{
+	struct hmm_buffer *buffer;
+	unsigned long npages;
+	unsigned long size;
+	unsigned long i;
+	int *ptr;
+	int ret;
+
+	npages = ALIGN(HMM_BUFFER_SIZE, self->page_size) >> self->page_shift;
+	ASSERT_NE(npages, 0);
+	size = npages << self->page_shift;
+
+	buffer = malloc(sizeof(*buffer));
+	ASSERT_NE(buffer, NULL);
+
+	buffer->fd = -1;
+	buffer->size = size;
+	buffer->mirror = malloc(size);
+	ASSERT_NE(buffer->mirror, NULL);
+
+	buffer->ptr = mmap(NULL, size,
+			   PROT_READ | PROT_WRITE,
+			   MAP_PRIVATE | MAP_ANONYMOUS,
+			   buffer->fd, 0);
+	ASSERT_NE(buffer->ptr, MAP_FAILED);
+
+	/* Initialize buffer in system memory. */
+	for (i = 0, ptr = buffer->ptr; i < size / sizeof(*ptr); ++i)
+		ptr[i] = i;
+
+	/* Fault and migrate memory to device. */
+	ret = hmm_migrate_on_fault_sys_to_dev(self->fd, buffer, npages);
+	ASSERT_EQ(ret, 0);
+	ASSERT_EQ(buffer->cpages, npages);
+
+	/* Check what the device read. */
+	for (i = 0, ptr = buffer->mirror; i < size / sizeof(*ptr); ++i)
+		ASSERT_EQ(ptr[i], i);
+
+	hmm_buffer_free(buffer);
+}
+
 /*
  * Migrate anonymous memory to device private memory and fault some of it back
  * to system memory, then try migrating the resulting mix of system and device
-- 
2.50.0

From nobody Tue Apr 7 04:46:06 2026
From: mpenttil@redhat.com
To: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org, Mika Penttilä, David Hildenbrand,
	Jason Gunthorpe, Leon Romanovsky, Alistair Popple, Balbir Singh,
	Zi Yan, Matthew Brost
Subject: [PATCH v6 6/6] mm/migrate_device.c: remove migrate_vma_collect_*()
Date: Mon, 16 Mar 2026 08:24:07 +0200
Message-ID: <20260316062407.3354636-7-mpenttil@redhat.com>
In-Reply-To: <20260316062407.3354636-1-mpenttil@redhat.com>
References: <20260316062407.3354636-1-mpenttil@redhat.com>

From: Mika Penttilä

With the unified fault handling and migrate path, the
migrate_vma_collect_*() functions are unused, let's remove them.

Cc: David Hildenbrand
Cc: Jason Gunthorpe
Cc: Leon Romanovsky
Cc: Alistair Popple
Cc: Balbir Singh
Cc: Zi Yan
Cc: Matthew Brost
Signed-off-by: Mika Penttilä
---
 mm/migrate_device.c | 508 --------------------------------------------
 1 file changed, 508 deletions(-)

diff --git a/mm/migrate_device.c b/mm/migrate_device.c
index cef9b644d31f..c9f23e6c519b 100644
--- a/mm/migrate_device.c
+++ b/mm/migrate_device.c
@@ -18,514 +18,6 @@
 #include
 #include "internal.h"
 
-static int migrate_vma_collect_skip(unsigned long start,
-				    unsigned long end,
-				    struct mm_walk *walk)
-{
-	struct migrate_vma *migrate = walk->private;
-	unsigned long addr;
-
-	for (addr = start; addr < end; addr += PAGE_SIZE) {
-		migrate->dst[migrate->npages] = 0;
-		migrate->src[migrate->npages++] = 0;
-	}
-
-	return 0;
-}
-
-static int migrate_vma_collect_hole(unsigned long start,
-				    unsigned long end,
-				    __always_unused int depth,
-				    struct mm_walk *walk)
-{
-	struct migrate_vma *migrate = walk->private;
-	unsigned long addr;
-
-	/* Only allow populating anonymous memory. */
-	if (!vma_is_anonymous(walk->vma))
-		return migrate_vma_collect_skip(start, end, walk);
-
-	if (thp_migration_supported() &&
-	    (migrate->flags & MIGRATE_VMA_SELECT_COMPOUND) &&
-	    (IS_ALIGNED(start, HPAGE_PMD_SIZE) &&
-	     IS_ALIGNED(end, HPAGE_PMD_SIZE))) {
-		migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE |
-						MIGRATE_PFN_COMPOUND;
-		migrate->dst[migrate->npages] = 0;
-		migrate->npages++;
-		migrate->cpages++;
-
-		/*
-		 * Collect the remaining entries as holes, in case we
-		 * need to split later
-		 */
-		return migrate_vma_collect_skip(start + PAGE_SIZE, end, walk);
-	}
-
-	for (addr = start; addr < end; addr += PAGE_SIZE) {
-		migrate->src[migrate->npages] = MIGRATE_PFN_MIGRATE;
-		migrate->dst[migrate->npages] = 0;
-		migrate->npages++;
-		migrate->cpages++;
-	}
-
-	return 0;
-}
-
-/**
- * migrate_vma_split_folio() - Helper function to split a THP folio
- * @folio: the folio to split
- * @fault_page: struct page associated with the fault if any
- *
- * Returns 0 on success
- */
-static int migrate_vma_split_folio(struct folio *folio,
-				   struct page *fault_page)
-{
-	int ret;
-	struct folio *fault_folio = fault_page ? page_folio(fault_page) : NULL;
-	struct folio *new_fault_folio = NULL;
-
-	if (folio != fault_folio) {
-		folio_get(folio);
-		folio_lock(folio);
-	}
-
-	ret = split_folio(folio);
-	if (ret) {
-		if (folio != fault_folio) {
-			folio_unlock(folio);
-			folio_put(folio);
-		}
-		return ret;
-	}
-
-	new_fault_folio = fault_page ? page_folio(fault_page) : NULL;
-
-	/*
-	 * Ensure the lock is held on the correct
-	 * folio after the split
-	 */
-	if (!new_fault_folio) {
-		folio_unlock(folio);
-		folio_put(folio);
-	} else if (folio != new_fault_folio) {
-		if (new_fault_folio != fault_folio) {
-			folio_get(new_fault_folio);
-			folio_lock(new_fault_folio);
-		}
-		folio_unlock(folio);
-		folio_put(folio);
-	}
-
-	return 0;
-}
-
-/** migrate_vma_collect_huge_pmd - collect THP pages without splitting the
- * folio for device private pages.
- * @pmdp: pointer to pmd entry
- * @start: start address of the range for migration
- * @end: end address of the range for migration
- * @walk: mm_walk callback structure
- * @fault_folio: folio associated with the fault if any
- *
- * Collect the huge pmd entry at @pmdp for migration and set the
- * MIGRATE_PFN_COMPOUND flag in the migrate src entry to indicate that
- * migration will occur at HPAGE_PMD granularity
- */
-static int migrate_vma_collect_huge_pmd(pmd_t *pmdp, unsigned long start,
-					unsigned long end, struct mm_walk *walk,
-					struct folio *fault_folio)
-{
-	struct mm_struct *mm = walk->mm;
-	struct folio *folio;
-	struct migrate_vma *migrate = walk->private;
-	spinlock_t *ptl;
-	int ret;
-	unsigned long write = 0;
-
-	ptl = pmd_lock(mm, pmdp);
-	if (pmd_none(*pmdp)) {
-		spin_unlock(ptl);
-		return migrate_vma_collect_hole(start, end, -1, walk);
-	}
-
-	if (pmd_trans_huge(*pmdp)) {
-		if (!(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) {
-			spin_unlock(ptl);
-			return migrate_vma_collect_skip(start, end, walk);
-		}
-
-		folio = pmd_folio(*pmdp);
-		if (is_huge_zero_folio(folio)) {
-			spin_unlock(ptl);
-			return migrate_vma_collect_hole(start, end, -1, walk);
-		}
-		if (pmd_write(*pmdp))
-			write = MIGRATE_PFN_WRITE;
-	} else if (!pmd_present(*pmdp)) {
-		const softleaf_t entry = softleaf_from_pmd(*pmdp);
-
-		folio = softleaf_to_folio(entry);
-
-		if (!softleaf_is_device_private(entry) ||
-		    !(migrate->flags & MIGRATE_VMA_SELECT_DEVICE_PRIVATE) ||
-		    (folio->pgmap->owner != migrate->pgmap_owner)) {
-			spin_unlock(ptl);
-			return migrate_vma_collect_skip(start, end, walk);
-		}
-
-		if (softleaf_is_migration(entry)) {
-			softleaf_entry_wait_on_locked(entry, ptl);
-			spin_unlock(ptl);
-			return -EAGAIN;
-		}
-
-		if (softleaf_is_device_private_write(entry))
-			write = MIGRATE_PFN_WRITE;
-	} else {
-		spin_unlock(ptl);
-		return -EAGAIN;
-	}
-
-	folio_get(folio);
-	if (folio != fault_folio && unlikely(!folio_trylock(folio))) {
-		spin_unlock(ptl);
-		folio_put(folio);
-		return migrate_vma_collect_skip(start, end, walk);
-	}
-
-	if (thp_migration_supported() &&
-	    (migrate->flags & MIGRATE_VMA_SELECT_COMPOUND) &&
-	    (IS_ALIGNED(start, HPAGE_PMD_SIZE) &&
-	     IS_ALIGNED(end, HPAGE_PMD_SIZE))) {
-
-		struct page_vma_mapped_walk pvmw = {
-			.ptl = ptl,
-			.address = start,
-			.pmd = pmdp,
-			.vma = walk->vma,
-		};
-
-		unsigned long pfn = page_to_pfn(folio_page(folio, 0));
-
-		migrate->src[migrate->npages] = migrate_pfn(pfn) | write
-						| MIGRATE_PFN_MIGRATE
-						| MIGRATE_PFN_COMPOUND;
-		migrate->dst[migrate->npages++] = 0;
-		migrate->cpages++;
-		ret = set_pmd_migration_entry(&pvmw, folio_page(folio, 0));
-		if (ret) {
-			migrate->npages--;
-			migrate->cpages--;
-			migrate->src[migrate->npages] = 0;
-			migrate->dst[migrate->npages] = 0;
-			goto fallback;
-		}
-		migrate_vma_collect_skip(start + PAGE_SIZE, end, walk);
-		spin_unlock(ptl);
-		return 0;
-	}
-
-fallback:
-	spin_unlock(ptl);
-	if (!folio_test_large(folio))
-		goto done;
-	ret = split_folio(folio);
-	if (fault_folio != folio)
-		folio_unlock(folio);
-	folio_put(folio);
-	if (ret)
-		return migrate_vma_collect_skip(start, end, walk);
-	if (pmd_none(pmdp_get_lockless(pmdp)))
-		return migrate_vma_collect_hole(start, end, -1, walk);
-
-done:
-	return -ENOENT;
-}
-
-static int migrate_vma_collect_pmd(pmd_t *pmdp,
-				   unsigned long start,
-				   unsigned long end,
-				   struct mm_walk *walk)
-{
-	struct migrate_vma *migrate = walk->private;
-	struct vm_area_struct *vma = walk->vma;
-	struct mm_struct *mm = vma->vm_mm;
-	unsigned long addr = start, unmapped = 0;
-	spinlock_t *ptl;
-	struct folio *fault_folio = migrate->fault_page ?
-		page_folio(migrate->fault_page) : NULL;
-	pte_t *ptep;
-
-again:
-	if (pmd_trans_huge(*pmdp) || !pmd_present(*pmdp)) {
-		int ret = migrate_vma_collect_huge_pmd(pmdp, start, end, walk, fault_folio);
-
-		if (ret == -EAGAIN)
-			goto again;
-		if (ret == 0)
-			return 0;
-	}
-
-	ptep = pte_offset_map_lock(mm, pmdp, start, &ptl);
-	if (!ptep)
-		goto again;
-	lazy_mmu_mode_enable();
-	ptep += (addr - start) / PAGE_SIZE;
-
-	for (; addr < end; addr += PAGE_SIZE, ptep++) {
-		struct dev_pagemap *pgmap;
-		unsigned long mpfn = 0, pfn;
-		struct folio *folio;
-		struct page *page;
-		softleaf_t entry;
-		pte_t pte;
-
-		pte = ptep_get(ptep);
-
-		if (pte_none(pte)) {
-			if (vma_is_anonymous(vma)) {
-				mpfn = MIGRATE_PFN_MIGRATE;
-				migrate->cpages++;
-			}
-			goto next;
-		}
-
-		if (!pte_present(pte)) {
-			/*
-			 * Only care about unaddressable device page special
-			 * page table entry. Other special swap entries are not
-			 * migratable, and we ignore regular swapped page.
-			 */
-			entry = softleaf_from_pte(pte);
-			if (!softleaf_is_device_private(entry))
-				goto next;
-
-			page = softleaf_to_page(entry);
-			pgmap = page_pgmap(page);
-			if (!(migrate->flags &
-			      MIGRATE_VMA_SELECT_DEVICE_PRIVATE) ||
-			    pgmap->owner != migrate->pgmap_owner)
-				goto next;
-
-			folio = page_folio(page);
-			if (folio_test_large(folio)) {
-				int ret;
-
-				lazy_mmu_mode_disable();
-				pte_unmap_unlock(ptep, ptl);
-				ret = migrate_vma_split_folio(folio,
-							      migrate->fault_page);
-
-				if (ret) {
-					if (unmapped)
-						flush_tlb_range(walk->vma, start, end);
-
-					return migrate_vma_collect_skip(addr, end, walk);
-				}
-
-				goto again;
-			}
-
-			mpfn = migrate_pfn(page_to_pfn(page)) |
-			       MIGRATE_PFN_MIGRATE;
-			if (softleaf_is_device_private_write(entry))
-				mpfn |= MIGRATE_PFN_WRITE;
-		} else {
-			pfn = pte_pfn(pte);
-			if (is_zero_pfn(pfn) &&
-			    (migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) {
-				mpfn = MIGRATE_PFN_MIGRATE;
-				migrate->cpages++;
-				goto next;
-			}
-			page = vm_normal_page(migrate->vma, addr, pte);
-			if (page && !is_zone_device_page(page) &&
-			    !(migrate->flags & MIGRATE_VMA_SELECT_SYSTEM)) {
-				goto next;
-			} else if (page && is_device_coherent_page(page)) {
-				pgmap = page_pgmap(page);
-
-				if (!(migrate->flags &
-				      MIGRATE_VMA_SELECT_DEVICE_COHERENT) ||
-				    pgmap->owner != migrate->pgmap_owner)
-					goto next;
-			}
-			folio = page ? page_folio(page) : NULL;
-			if (folio && folio_test_large(folio)) {
-				int ret;
-
-				lazy_mmu_mode_disable();
-				pte_unmap_unlock(ptep, ptl);
-				ret = migrate_vma_split_folio(folio,
-							      migrate->fault_page);
-
-				if (ret) {
-					if (unmapped)
-						flush_tlb_range(walk->vma, start, end);
-
-					return migrate_vma_collect_skip(addr, end, walk);
-				}
-
-				goto again;
-			}
-			mpfn = migrate_pfn(pfn) | MIGRATE_PFN_MIGRATE;
-			mpfn |= pte_write(pte) ? MIGRATE_PFN_WRITE : 0;
-		}
-
-		if (!page || !page->mapping) {
-			mpfn = 0;
-			goto next;
-		}
-
-		/*
-		 * By getting a reference on the folio we pin it and that blocks
-		 * any kind of migration. Side effect is that it "freezes" the
-		 * pte.
-		 *
-		 * We drop this reference after isolating the folio from the lru
-		 * for non device folio (device folio are not on the lru and thus
-		 * can't be dropped from it).
-		 */
-		folio = page_folio(page);
-		folio_get(folio);
-
-		/*
-		 * We rely on folio_trylock() to avoid deadlock between
-		 * concurrent migrations where each is waiting on the others
-		 * folio lock. If we can't immediately lock the folio we fail this
-		 * migration as it is only best effort anyway.
-		 *
-		 * If we can lock the folio it's safe to set up a migration entry
-		 * now. In the common case where the folio is mapped once in a
-		 * single process setting up the migration entry now is an
-		 * optimisation to avoid walking the rmap later with
-		 * try_to_migrate().
-		 */
-		if (fault_folio == folio || folio_trylock(folio)) {
-			bool anon_exclusive;
-			pte_t swp_pte;
-
-			flush_cache_page(vma, addr, pte_pfn(pte));
-			anon_exclusive = folio_test_anon(folio) &&
-					 PageAnonExclusive(page);
-			if (anon_exclusive) {
-				pte = ptep_clear_flush(vma, addr, ptep);
-
-				if (folio_try_share_anon_rmap_pte(folio, page)) {
-					set_pte_at(mm, addr, ptep, pte);
-					if (fault_folio != folio)
-						folio_unlock(folio);
-					folio_put(folio);
-					mpfn = 0;
-					goto next;
-				}
-			} else {
-				pte = ptep_get_and_clear(mm, addr, ptep);
-			}
-
-			migrate->cpages++;
-
-			/* Set the dirty flag on the folio now the pte is gone. */
-			if (pte_dirty(pte))
-				folio_mark_dirty(folio);
-
-			/* Setup special migration page table entry */
-			if (mpfn & MIGRATE_PFN_WRITE)
-				entry = make_writable_migration_entry(
-							page_to_pfn(page));
-			else if (anon_exclusive)
-				entry = make_readable_exclusive_migration_entry(
-							page_to_pfn(page));
-			else
-				entry = make_readable_migration_entry(
-							page_to_pfn(page));
-			if (pte_present(pte)) {
-				if (pte_young(pte))
-					entry = make_migration_entry_young(entry);
-				if (pte_dirty(pte))
-					entry = make_migration_entry_dirty(entry);
-			}
-			swp_pte = swp_entry_to_pte(entry);
-			if (pte_present(pte)) {
-				if (pte_soft_dirty(pte))
-					swp_pte = pte_swp_mksoft_dirty(swp_pte);
-				if (pte_uffd_wp(pte))
-					swp_pte = pte_swp_mkuffd_wp(swp_pte);
-			} else {
-				if (pte_swp_soft_dirty(pte))
-					swp_pte = pte_swp_mksoft_dirty(swp_pte);
-				if (pte_swp_uffd_wp(pte))
-					swp_pte = pte_swp_mkuffd_wp(swp_pte);
-			}
-			set_pte_at(mm, addr, ptep, swp_pte);
-
-			/*
-			 * This is like regular unmap: we remove the rmap and
-			 * drop the folio refcount. The folio won't be freed, as
-			 * we took a reference just above.
-			 */
-			folio_remove_rmap_pte(folio, page, vma);
-			folio_put(folio);
-
-			if (pte_present(pte))
-				unmapped++;
-		} else {
-			folio_put(folio);
-			mpfn = 0;
-		}
-
-next:
-		migrate->dst[migrate->npages] = 0;
-		migrate->src[migrate->npages++] = mpfn;
-	}
-
-	/* Only flush the TLB if we actually modified any entries */
-	if (unmapped)
-		flush_tlb_range(walk->vma, start, end);
-
-	lazy_mmu_mode_disable();
-	pte_unmap_unlock(ptep - 1, ptl);
-
-	return 0;
-}
-
-static const struct mm_walk_ops migrate_vma_walk_ops = {
-	.pmd_entry = migrate_vma_collect_pmd,
-	.pte_hole = migrate_vma_collect_hole,
-	.walk_lock = PGWALK_RDLOCK,
-};
-
-/*
- * migrate_vma_collect() - collect pages over a range of virtual addresses
- * @migrate: migrate struct containing all migration information
- *
- * This will walk the CPU page table. For each virtual address backed by a
- * valid page, it updates the src array and takes a reference on the page, in
- * order to pin the page until we lock it and unmap it.
- */
-static void migrate_vma_collect(struct migrate_vma *migrate)
-{
-	struct mmu_notifier_range range;
-
-	/*
-	 * Note that the pgmap_owner is passed to the mmu notifier callback so
-	 * that the registered device driver can skip invalidating device
-	 * private page mappings that won't be migrated.
-	 */
-	mmu_notifier_range_init_owner(&range, MMU_NOTIFY_MIGRATE, 0,
-		migrate->vma->vm_mm, migrate->start, migrate->end,
-		migrate->pgmap_owner);
-	mmu_notifier_invalidate_range_start(&range);
-
-	walk_page_range(migrate->vma->vm_mm, migrate->start, migrate->end,
-			&migrate_vma_walk_ops, migrate);
-
-	mmu_notifier_invalidate_range_end(&range);
-	migrate->end = migrate->start + (migrate->npages << PAGE_SHIFT);
-}
-
 /*
  * migrate_vma_check_page() - check if page is pinned or not
  * @page: struct page to check
-- 
2.50.0
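
[Editor's note: for readers following the series, the net driver-visible effect of patches 4-6 is that migrate_vma_setup() now performs collection and unmapping through hmm_range_fault(), and the MIGRATE_VMA_FAULT flag asks it to also fault in absent pages first. The following is a non-compilable pseudocode sketch of the resulting call pattern, modeled on the dmirror test device above; the demo_* name, the caller-provided pfn arrays, and the elided copy step are illustrative assumptions, not part of the patches.]

```c
/* Sketch only: fault-and-migrate a VMA range into device memory.
 * Assumes the caller holds mmap_read_lock and owns src[]/dst[]
 * sized for the range; error handling is simplified. */
static int demo_fault_and_migrate(struct vm_area_struct *vma,
				  unsigned long start, unsigned long end,
				  void *pgmap_owner,
				  unsigned long *src, unsigned long *dst)
{
	struct migrate_vma migrate = {
		.vma		= vma,
		.start		= start,
		.end		= end,
		.src		= src,
		.dst		= dst,
		.pgmap_owner	= pgmap_owner,
		/* MIGRATE_VMA_FAULT adds HMM_PFN_REQ_FAULT to the
		 * hmm_range built inside migrate_vma_setup(). */
		.flags		= MIGRATE_VMA_SELECT_SYSTEM | MIGRATE_VMA_FAULT,
	};
	int ret;

	/* Collects and unmaps pages via hmm_range_fault(); on error it
	 * already removed any migration PTEs it installed. */
	ret = migrate_vma_setup(&migrate);
	if (ret)
		return ret;

	/* ... driver allocates device pages, fills dst[], copies data ... */

	migrate_vma_pages(&migrate);	/* migrate collected pages */
	migrate_vma_finalize(&migrate);	/* restore CPU page tables */
	return 0;
}
```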