From nobody Thu Dec 18 07:11:24 2025
From: Lance Yang
To: akpm@linux-foundation.org
Cc: willy@infradead.org, sj@kernel.org, baolin.wang@linux.alibaba.com,
 maskray@google.com, ziy@nvidia.com, ryan.roberts@arm.com, david@redhat.com,
 21cnbao@gmail.com, mhocko@suse.com, fengwei.yin@intel.com, zokeefe@google.com,
 shy828301@gmail.com, xiehuan09@gmail.com, libang.li@antgroup.com,
 wangkefeng.wang@huawei.com, songmuchun@bytedance.com, peterx@redhat.com,
 minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Lance Yang, Barry Song
Subject: [PATCH v8 1/3] mm/rmap: remove duplicated exit code in pagewalk loop
Date: Fri, 14 Jun 2024 09:51:36 +0800
Message-Id: <20240614015138.31461-2-ioworker0@gmail.com>
In-Reply-To: <20240614015138.31461-1-ioworker0@gmail.com>
References: <20240614015138.31461-1-ioworker0@gmail.com>

Introduce the labels walk_done and walk_abort as exit points to
eliminate duplicated exit code in the pagewalk loop.

Reviewed-by: Zi Yan
Reviewed-by: Baolin Wang
Reviewed-by: David Hildenbrand
Reviewed-by: Barry Song
Signed-off-by: Lance Yang
---
 mm/rmap.c | 40 +++++++++++++++------------------------
 1 file changed, 15 insertions(+), 25 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index ae250b2b4d55..2d778725e4f5 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1681,9 +1681,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			/* Restore the mlock which got missed */
 			if (!folio_test_large(folio))
 				mlock_vma_folio(folio, vma);
-			page_vma_mapped_walk_done(&pvmw);
-			ret = false;
-			break;
+			goto walk_abort;
 		}
 
 		pfn = pte_pfn(ptep_get(pvmw.pte));
@@ -1721,11 +1719,8 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			 */
 			if (!anon) {
 				VM_BUG_ON(!(flags & TTU_RMAP_LOCKED));
-				if (!hugetlb_vma_trylock_write(vma)) {
-					page_vma_mapped_walk_done(&pvmw);
-					ret = false;
-					break;
-				}
+				if (!hugetlb_vma_trylock_write(vma))
+					goto walk_abort;
 				if (huge_pmd_unshare(mm, vma, address, pvmw.pte)) {
 					hugetlb_vma_unlock_write(vma);
 					flush_tlb_range(vma,
@@ -1740,8 +1735,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 					 * actual page and drop map count
 					 * to zero.
 					 */
-					page_vma_mapped_walk_done(&pvmw);
-					break;
+					goto walk_done;
 				}
 				hugetlb_vma_unlock_write(vma);
 			}
@@ -1813,9 +1807,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			if (unlikely(folio_test_swapbacked(folio) !=
 					folio_test_swapcache(folio))) {
 				WARN_ON_ONCE(1);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				goto walk_abort;
 			}
 
 			/* MADV_FREE page check */
@@ -1854,23 +1846,17 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 				 */
 				set_pte_at(mm, address, pvmw.pte, pteval);
 				folio_set_swapbacked(folio);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				goto walk_abort;
 			}
 
 			if (swap_duplicate(entry) < 0) {
 				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				goto walk_abort;
 			}
 			if (arch_unmap_one(mm, vma, address, pteval) < 0) {
 				swap_free(entry);
 				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				goto walk_abort;
 			}
 
 			/* See folio_try_share_anon_rmap(): clear PTE first. */
@@ -1878,9 +1864,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			    folio_try_share_anon_rmap_pte(folio, subpage)) {
 				swap_free(entry);
 				set_pte_at(mm, address, pvmw.pte, pteval);
-				ret = false;
-				page_vma_mapped_walk_done(&pvmw);
-				break;
+				goto walk_abort;
 			}
 			if (list_empty(&mm->mmlist)) {
 				spin_lock(&mmlist_lock);
@@ -1920,6 +1904,12 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 		if (vma->vm_flags & VM_LOCKED)
 			mlock_drain_local();
 		folio_put(folio);
+		continue;
+walk_abort:
+		ret = false;
+walk_done:
+		page_vma_mapped_walk_done(&pvmw);
+		break;
 	}
 
 	mmu_notifier_invalidate_range_end(&range);
-- 
2.33.1
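
For readers unfamiliar with the idiom, below is a minimal standalone
sketch of the exit-label pattern the patch above applies: failure paths
jump to walk_abort (which sets ret = false and falls through to the
shared cleanup), while early-success paths jump straight to walk_done.
The walk_one() and cleanup() helpers are hypothetical stand-ins for
page_vma_mapped_walk() and page_vma_mapped_walk_done(), not kernel APIs.

#include <stdbool.h>

static bool walk_one(int i)
{
	return i % 3 != 0;	/* placeholder for per-entry work */
}

static void cleanup(void)
{
	/* placeholder for releasing the PTL, etc. */
}

static bool walk_all(int n)
{
	bool ret = true;
	int i;

	for (i = 0; i < n; i++) {
		if (!walk_one(i))
			goto walk_abort;	/* failure: shared abort path */
		if (i == n - 1)
			goto walk_done;		/* early success: stop walking */
		continue;			/* normal path skips the labels */
walk_abort:
		ret = false;
		/* fall through to the common cleanup */
walk_done:
		cleanup();
		break;
	}

	return ret;
}

The point of the pattern is that every exit from the loop funnels
through a single page_vma_mapped_walk_done() call, so later patches can
add new exit conditions without re-duplicating the cleanup.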
From nobody Thu Dec 18 07:11:24 2025
From: Lance Yang
To: akpm@linux-foundation.org
Cc: willy@infradead.org, sj@kernel.org, baolin.wang@linux.alibaba.com,
 maskray@google.com, ziy@nvidia.com, ryan.roberts@arm.com, david@redhat.com,
 21cnbao@gmail.com, mhocko@suse.com, fengwei.yin@intel.com, zokeefe@google.com,
 shy828301@gmail.com, xiehuan09@gmail.com, libang.li@antgroup.com,
 wangkefeng.wang@huawei.com, songmuchun@bytedance.com, peterx@redhat.com,
 minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Lance Yang
Subject: [PATCH v8 2/3] mm/rmap: integrate PMD-mapped folio splitting into
 pagewalk loop
Date: Fri, 14 Jun 2024 09:51:37 +0800
Message-Id: <20240614015138.31461-3-ioworker0@gmail.com>
In-Reply-To: <20240614015138.31461-1-ioworker0@gmail.com>
References: <20240614015138.31461-1-ioworker0@gmail.com>

In preparation for supporting try_to_unmap_one() to unmap PMD-mapped
folios, start the pagewalk first, then call split_huge_pmd_address()
to split the folio.
Suggested-by: David Hildenbrand
Suggested-by: Baolin Wang
Signed-off-by: Lance Yang
Acked-by: David Hildenbrand
Acked-by: Zi Yan
---
 include/linux/huge_mm.h |  6 ++++++
 include/linux/rmap.h    | 24 +++++++++++++++++++++++
 mm/huge_memory.c        | 42 +++++++++++++++++++++--------------------
 mm/rmap.c               | 21 +++++++++++++++------
 4 files changed, 67 insertions(+), 26 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 7ad41de5eaea..9f720b0731c4 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -428,6 +428,9 @@ static inline bool thp_migration_supported(void)
 	return IS_ENABLED(CONFIG_ARCH_ENABLE_THP_MIGRATION);
 }
 
+void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
+			   pmd_t *pmd, bool freeze, struct folio *folio);
+
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 
 static inline bool folio_test_pmd_mappable(struct folio *folio)
@@ -490,6 +493,9 @@ static inline void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio) {}
 static inline void split_huge_pmd_address(struct vm_area_struct *vma,
 		unsigned long address, bool freeze, struct folio *folio) {}
+static inline void split_huge_pmd_locked(struct vm_area_struct *vma,
+					 unsigned long address, pmd_t *pmd,
+					 bool freeze, struct folio *folio) {}
 
 #define split_huge_pud(__vma, __pmd, __address)	\
 	do { } while (0)
diff --git a/include/linux/rmap.h b/include/linux/rmap.h
index 0fd9bebce54c..d1c5e2d694b2 100644
--- a/include/linux/rmap.h
+++ b/include/linux/rmap.h
@@ -703,6 +703,30 @@ static inline void page_vma_mapped_walk_done(struct page_vma_mapped_walk *pvmw)
 		spin_unlock(pvmw->ptl);
 }
 
+/**
+ * page_vma_mapped_walk_restart - Restart the page table walk.
+ * @pvmw: Pointer to struct page_vma_mapped_walk.
+ *
+ * It restarts the page table walk when changes occur in the page
+ * table, such as splitting a PMD. Ensures that the PTL held during
+ * the previous walk is released and resets the state to allow for
+ * a new walk starting at the current address stored in pvmw->address.
+ */
+static inline void
+page_vma_mapped_walk_restart(struct page_vma_mapped_walk *pvmw)
+{
+	WARN_ON_ONCE(!pvmw->pmd && !pvmw->pte);
+
+	if (likely(pvmw->ptl))
+		spin_unlock(pvmw->ptl);
+	else
+		WARN_ON_ONCE(1);
+
+	pvmw->ptl = NULL;
+	pvmw->pmd = NULL;
+	pvmw->pte = NULL;
+}
+
 bool page_vma_mapped_walk(struct page_vma_mapped_walk *pvmw);
 
 /*
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 70d20fefc6db..e766d3f3a302 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2582,6 +2582,27 @@ static void __split_huge_pmd_locked(struct vm_area_struct *vma, pmd_t *pmd,
 	pmd_populate(mm, pmd, pgtable);
 }
 
+void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
+			   pmd_t *pmd, bool freeze, struct folio *folio)
+{
+	VM_WARN_ON_ONCE(folio && !folio_test_pmd_mappable(folio));
+	VM_WARN_ON_ONCE(!IS_ALIGNED(address, HPAGE_PMD_SIZE));
+	VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
+	VM_BUG_ON(freeze && !folio);
+
+	/*
+	 * When the caller requests to set up a migration entry, we
+	 * require a folio to check the PMD against. Otherwise, there
+	 * is a risk of replacing the wrong folio.
+	 */
+	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
+	    is_pmd_migration_entry(*pmd)) {
+		if (folio && folio != pmd_folio(*pmd))
+			return;
+		__split_huge_pmd_locked(vma, pmd, address, freeze);
+	}
+}
+
 void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 		unsigned long address, bool freeze, struct folio *folio)
 {
@@ -2593,26 +2614,7 @@ void __split_huge_pmd(struct vm_area_struct *vma, pmd_t *pmd,
 				(address & HPAGE_PMD_MASK) + HPAGE_PMD_SIZE);
 	mmu_notifier_invalidate_range_start(&range);
 	ptl = pmd_lock(vma->vm_mm, pmd);
-
-	/*
-	 * If caller asks to setup a migration entry, we need a folio to check
-	 * pmd against. Otherwise we can end up replacing wrong folio.
-	 */
-	VM_BUG_ON(freeze && !folio);
-	VM_WARN_ON_ONCE(folio && !folio_test_locked(folio));
-
-	if (pmd_trans_huge(*pmd) || pmd_devmap(*pmd) ||
-	    is_pmd_migration_entry(*pmd)) {
-		/*
-		 * It's safe to call pmd_page when folio is set because it's
-		 * guaranteed that pmd is present.
-		 */
-		if (folio && folio != pmd_folio(*pmd))
-			goto out;
-		__split_huge_pmd_locked(vma, pmd, range.start, freeze);
-	}
-
-out:
+	split_huge_pmd_locked(vma, range.start, pmd, freeze, folio);
 	spin_unlock(ptl);
 	mmu_notifier_invalidate_range_end(&range);
 }
diff --git a/mm/rmap.c b/mm/rmap.c
index 2d778725e4f5..dacf24bc82f0 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1642,9 +1642,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	if (flags & TTU_SYNC)
 		pvmw.flags = PVMW_SYNC;
 
-	if (flags & TTU_SPLIT_HUGE_PMD)
-		split_huge_pmd_address(vma, address, false, folio);
-
 	/*
 	 * For THP, we have to assume the worse case ie pmd for invalidation.
 	 * For hugetlb, it could be much worse if we need to do pud
@@ -1670,9 +1667,6 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 	mmu_notifier_invalidate_range_start(&range);
 
 	while (page_vma_mapped_walk(&pvmw)) {
-		/* Unexpected PMD-mapped THP? */
-		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
-
 		/*
 		 * If the folio is in an mlock()d vma, we must not swap it out.
 		 */
@@ -1684,6 +1678,21 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			goto walk_abort;
 		}
 
+		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
+			/*
+			 * We temporarily have to drop the PTL and start once
+			 * again from that now-PTE-mapped page table.
+			 */
+			split_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
+					      false, folio);
+			flags &= ~TTU_SPLIT_HUGE_PMD;
+			page_vma_mapped_walk_restart(&pvmw);
+			continue;
+		}
+
+		/* Unexpected PMD-mapped THP? */
+		VM_BUG_ON_FOLIO(!pvmw.pte, folio);
+
 		pfn = pte_pfn(ptep_get(pvmw.pte));
 		subpage = folio_page(folio, pfn - folio_pfn(folio));
 		address = pvmw.address;
-- 
2.33.1
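
As a usage summary, the caller-side shape of the restart protocol this
patch introduces looks roughly like the skeleton below. It is a
simplified view of the loop in try_to_unmap_one() under the assumption
that pvmw, flags, vma and folio are already set up; the per-PTE unmap
work and error paths are elided, so it is not a complete function.

	while (page_vma_mapped_walk(&pvmw)) {
		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
			/* PMD-mapped: split under the PTL we already hold. */
			split_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
					      false, folio);
			flags &= ~TTU_SPLIT_HUGE_PMD;
			/*
			 * Drop the PTL and rewalk from pvmw.address, which
			 * is now mapped by PTEs.
			 */
			page_vma_mapped_walk_restart(&pvmw);
			continue;
		}

		/* pvmw.pte is valid here: handle one PTE mapping. */
	}

Moving the split inside the walk means the PMD is only split once the
walk has actually found a PMD-level mapping, instead of unconditionally
before the walk starts.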
From nobody Thu Dec 18 07:11:24 2025
From: Lance Yang
To: akpm@linux-foundation.org
Cc: willy@infradead.org, sj@kernel.org, baolin.wang@linux.alibaba.com,
 maskray@google.com, ziy@nvidia.com, ryan.roberts@arm.com, david@redhat.com,
 21cnbao@gmail.com, mhocko@suse.com, fengwei.yin@intel.com, zokeefe@google.com,
 shy828301@gmail.com, xiehuan09@gmail.com, libang.li@antgroup.com,
 wangkefeng.wang@huawei.com, songmuchun@bytedance.com, peterx@redhat.com,
 minchan@kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
 Lance Yang
Subject: [PATCH v8 3/3] mm/vmscan: avoid split lazyfree THP during
 shrink_folio_list()
Date: Fri, 14 Jun 2024 09:51:38 +0800
Message-Id: <20240614015138.31461-4-ioworker0@gmail.com>
In-Reply-To: <20240614015138.31461-1-ioworker0@gmail.com>
References: <20240614015138.31461-1-ioworker0@gmail.com>

When the user no longer requires the pages, they would use
madvise(MADV_FREE) to mark the pages as lazy free. Subsequently, they
typically would not re-write to that memory again.

During memory reclaim, if we detect that the large folio and its PMD
are both still marked as clean and there are no unexpected references
(such as GUP), we can just discard the memory lazily, improving the
efficiency of memory reclamation in this case.
On an Intel i5 CPU, reclaiming 1GiB of lazyfree THPs using
mem_cgroup_force_empty() results in the following runtimes in seconds
(shorter is better):

----------------------------------
|   Old    |   New    |  Change  |
----------------------------------
| 0.683426 | 0.049197 | -92.80%  |
----------------------------------

Suggested-by: Zi Yan
Suggested-by: David Hildenbrand
Signed-off-by: Lance Yang
---
 include/linux/huge_mm.h |  9 +++++
 mm/huge_memory.c        | 76 +++++++++++++++++++++++++++++++++++++++++
 mm/rmap.c               | 27 +++++++++------
 3 files changed, 102 insertions(+), 10 deletions(-)

diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
index 9f720b0731c4..212cca384d7e 100644
--- a/include/linux/huge_mm.h
+++ b/include/linux/huge_mm.h
@@ -430,6 +430,8 @@ static inline bool thp_migration_supported(void)
 
 void split_huge_pmd_locked(struct vm_area_struct *vma, unsigned long address,
 			   pmd_t *pmd, bool freeze, struct folio *folio);
+bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
+			   pmd_t *pmdp, struct folio *folio);
 
 #else /* CONFIG_TRANSPARENT_HUGEPAGE */
 
@@ -497,6 +499,13 @@ static inline void split_huge_pmd_locked(struct vm_area_struct *vma,
 					 unsigned long address, pmd_t *pmd,
 					 bool freeze, struct folio *folio) {}
 
+static inline bool unmap_huge_pmd_locked(struct vm_area_struct *vma,
+					 unsigned long addr, pmd_t *pmdp,
+					 struct folio *folio)
+{
+	return false;
+}
+
 #define split_huge_pud(__vma, __pmd, __address)	\
 	do { } while (0)
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index e766d3f3a302..425374ae06ed 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2688,6 +2688,82 @@ static void unmap_folio(struct folio *folio)
 	try_to_unmap_flush();
 }
 
+static bool __discard_anon_folio_pmd_locked(struct vm_area_struct *vma,
+					    unsigned long addr, pmd_t *pmdp,
+					    struct folio *folio)
+{
+	VM_WARN_ON_FOLIO(folio_test_swapbacked(folio), folio);
+	VM_WARN_ON_FOLIO(!folio_test_anon(folio), folio);
+
+	struct mm_struct *mm = vma->vm_mm;
+	int ref_count, map_count;
+	pmd_t orig_pmd = *pmdp;
+	struct page *page;
+
+	if (unlikely(!pmd_present(orig_pmd) || !pmd_trans_huge(orig_pmd)))
+		return false;
+
+	page = pmd_page(orig_pmd);
+	if (unlikely(page_folio(page) != folio))
+		return false;
+
+	if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
+		return false;
+
+	orig_pmd = pmdp_huge_clear_flush(vma, addr, pmdp);
+
+	/*
+	 * Syncing against concurrent GUP-fast:
+	 * - clear PMD; barrier; read refcount
+	 * - inc refcount; barrier; read PMD
+	 */
+	smp_mb();
+
+	ref_count = folio_ref_count(folio);
+	map_count = folio_mapcount(folio);
+
+	/*
+	 * Order reads for folio refcount and dirty flag
+	 * (see comments in __remove_mapping()).
+	 */
+	smp_rmb();
+
+	/*
+	 * If the folio or its PMD is redirtied at this point, or if there
+	 * are unexpected references, we give up on discarding this folio
+	 * and remap it.
+	 *
+	 * The only folio refs must be one from isolation plus the rmap(s).
+	 */
+	if (folio_test_dirty(folio) || pmd_dirty(orig_pmd) ||
+	    ref_count != map_count + 1) {
+		set_pmd_at(mm, addr, pmdp, orig_pmd);
+		return false;
+	}
+
+	folio_remove_rmap_pmd(folio, page, vma);
+	zap_deposited_table(mm, pmdp);
+	add_mm_counter(mm, MM_ANONPAGES, -HPAGE_PMD_NR);
+	if (vma->vm_flags & VM_LOCKED)
+		mlock_drain_local();
+	folio_put(folio);
+
+	return true;
+}
+
+bool unmap_huge_pmd_locked(struct vm_area_struct *vma, unsigned long addr,
+			   pmd_t *pmdp, struct folio *folio)
+{
+	VM_WARN_ON_FOLIO(!folio_test_pmd_mappable(folio), folio);
+	VM_WARN_ON_FOLIO(!folio_test_locked(folio), folio);
+	VM_WARN_ON_ONCE(!IS_ALIGNED(addr, HPAGE_PMD_SIZE));
+
+	if (folio_test_anon(folio) && !folio_test_swapbacked(folio))
+		return __discard_anon_folio_pmd_locked(vma, addr, pmdp, folio);
+
+	return false;
+}
+
 static void remap_page(struct folio *folio, unsigned long nr)
 {
 	int i = 0;
diff --git a/mm/rmap.c b/mm/rmap.c
index dacf24bc82f0..7d97806f74cd 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -1678,16 +1678,23 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			goto walk_abort;
 		}
 
-		if (!pvmw.pte && (flags & TTU_SPLIT_HUGE_PMD)) {
-			/*
-			 * We temporarily have to drop the PTL and start once
-			 * again from that now-PTE-mapped page table.
-			 */
-			split_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
-					      false, folio);
-			flags &= ~TTU_SPLIT_HUGE_PMD;
-			page_vma_mapped_walk_restart(&pvmw);
-			continue;
+		if (!pvmw.pte) {
+			if (unmap_huge_pmd_locked(vma, pvmw.address, pvmw.pmd,
+						  folio))
+				goto walk_done;
+
+			if (flags & TTU_SPLIT_HUGE_PMD) {
+				/*
+				 * We temporarily have to drop the PTL and
+				 * start once again from that now-PTE-mapped
+				 * page table.
+				 */
+				split_huge_pmd_locked(vma, pvmw.address,
+						      pvmw.pmd, false, folio);
+				flags &= ~TTU_SPLIT_HUGE_PMD;
+				page_vma_mapped_walk_restart(&pvmw);
+				continue;
+			}
 		}
 
 		/* Unexpected PMD-mapped THP? */
-- 
2.33.1
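
The ordering argument in __discard_anon_folio_pmd_locked() can be
condensed as follows. This is an illustrative sketch of the decision
only, under the assumption that the PMD has already been cleared and
flushed and that the caller holds exactly one extra reference from
isolation; can_discard() is a hypothetical name, not a function the
patch adds, and it omits the remap/rmap bookkeeping shown above.

static bool can_discard(struct folio *folio, pmd_t orig_pmd)
{
	int ref_count, map_count;

	/*
	 * Pairs with GUP-fast: the PMD was cleared before this barrier,
	 * so a concurrent GUP-fast either sees the cleared PMD and backs
	 * off, or has already raised a reference that we observe below.
	 */
	smp_mb();
	ref_count = folio_ref_count(folio);
	map_count = folio_mapcount(folio);

	/* Order the refcount read before the dirty re-check. */
	smp_rmb();

	/* Redirtied after the first check: discarding would lose data. */
	if (folio_test_dirty(folio) || pmd_dirty(orig_pmd))
		return false;

	/* Only the isolation ref plus the rmap reference(s) may remain. */
	return ref_count == map_count + 1;
}

If either check fails, the original PMD is restored with set_pmd_at()
and reclaim falls back to the split path, so the folio is never freed
while someone can still reach its contents.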