From nobody Mon Dec 2 13:34:00 2024 Received: from mail-pf1-f179.google.com (mail-pf1-f179.google.com [209.85.210.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B8EC9155333 for ; Thu, 31 Oct 2024 08:14:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362455; cv=none; b=Mj+HwqDe8QmR6DUPf2RMIUPTvJMoXhZRRyHO5KdUhraxlX4gI2VD27CiRwXA/MWSzTqEa5MZed+cT7orUsE6aYyaoT33xgNrP9sc2i6Q1fzIHcBGNiYgGTxjkrJNXUFgKfH1SX4GtcPd6Tih69GT2U1YjYOS1LVd8Swu8ROU4/I= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362455; c=relaxed/simple; bh=zjp+gQydTtb+snYiIz7gJ7pjUJX6WXPqJzZSVREH5I4=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=r+KlNw8qVfYjb3MOIOa2GWLOV8n7Ut5E/BgadZT/q7YQF7eb1xsbLNQ7VcM/vj2PQ2eHSCae15PN4zIKsBsy55GoEqw5NFNPS4e+kjVyLrLkMl9xs/Et+5QLh3adRN2bhtlHEryiznzWZvLfwjVXJYHSyHnCqhN/zzFKyEAaZMI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ON9OUtRz; arc=none smtp.client-ip=209.85.210.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ON9OUtRz" Received: by mail-pf1-f179.google.com with SMTP id d2e1a72fcca58-71e52582cf8so494581b3a.2 for ; Thu, 31 Oct 2024 01:14:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362453; x=1730967253; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ipuWYZPCHxpoUpiQDdIApzlSejCgHBDIxG7j4EZzxr4=; b=ON9OUtRz8eZAjgR/FDsgZgn8votKnT7IlO/IAiYcynSSyE7l3QtOtGUlNry42B7f9h lKAs+vFHmi4cGD3WLom+e0vviDi+nk2I87eKRidzengD2H8F9o/IAnDyl9P2wbQdI0Kz eCMTugrFqFoOdoq+juMZaG1IBLLBvmW18zWmyG7AdZbOqjOko9RIc9v6m5xtp/zjyyFQ oEI777QXlL1rVCReiVkxBQ7JmIZ5gfLDmnF6sXOXujYNswulhDLasrDhCFLxmP2+NPcF SZQw7s5kd/OBJvR19o8vlZQJD7SW3x4/Dn8ydq5FJsn8tYXZaT98g/Xd+NgiZKvrbOpP kTLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362453; x=1730967253; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ipuWYZPCHxpoUpiQDdIApzlSejCgHBDIxG7j4EZzxr4=; b=fsy0lagM7Ia9/4gk5ofHFsfaRLGmLKae/WSaHCPcPO/1VJugf3DE+jPGZKZT027Axv P4567QKpBFVikqrM/x5lk4IH5T1Zjg9bTgu+XP9EY389r18arB/NlrbcRNgC+3A5/1XA vQIdxaXEPQMcT6TrN4hbn0ea4JkPNftm686T8tUUooCadUy+lIWlJb3B4+x0HHZaoetP xTE5r4mVHEwMjp9KqDePumBXVVx68csEdzbAZjN0sUweffXf65yRulskM39qAx+/tKVz mF/zzUioPhzS1466iFIki7wQzTD+nWEcNKu4vSzDLCVUcmFPmSXL66q+6h7SeL3DM56p buuw== X-Forwarded-Encrypted: i=1; AJvYcCWInVuxBnEDn5XpjMl9ONNnFXScnq30FOr2u6sCIX1MXGb9mgk/JccjWwr1DW7rJ+fz9JMdbkroKn7RVc8=@vger.kernel.org X-Gm-Message-State: AOJu0YxNoYEjhMs0YrZs1US+HXe9XguQPOab/mMC7ti4go/H/t3boj+w R6nw0/goK8pb6YIh/fxIeTssCYaUWlYaa1LMckXZ/bYfHnS/YutjXxw3Jw4W5uA= X-Google-Smtp-Source: AGHT+IGhQTI8wahkjaT4dlfz/gtxWygzYcya76y9Il/Jpy4crLp7RHv4SMqkN227j2dZrXUEfJpRWQ== X-Received: by 2002:a05:6a21:3947:b0:1d9:3456:b71e with SMTP id adf61e73a8af0-1d9a83c1926mr23625663637.12.1730362453050; Thu, 31 Oct 2024 01:14:13 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.07 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 
01:14:12 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 1/7] mm: khugepaged: retract_page_tables() use pte_offset_map_rw_nolock() Date: Thu, 31 Oct 2024 16:13:17 +0800 Message-Id: <4c3f4aa29f38c013c4529a43bce846a3edd31523.1730360798.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In retract_page_tables(), we may modify the pmd entry after acquiring the pml and ptl, so we should also check whether the pmd entry is stable. Use pte_offset_map_rw_nolock() + pmd_same() to do so, which also lets us remove the call to pte_lockptr(). 
Signed-off-by: Qi Zheng --- mm/khugepaged.c | 17 ++++++++++++++++- 1 file changed, 16 insertions(+), 1 deletion(-) diff --git a/mm/khugepaged.c b/mm/khugepaged.c index 6f8d46d107b4b..6d76dde64f5fb 100644 --- a/mm/khugepaged.c +++ b/mm/khugepaged.c @@ -1721,6 +1721,7 @@ static void retract_page_tables(struct address_space = *mapping, pgoff_t pgoff) spinlock_t *pml; spinlock_t *ptl; bool skipped_uffd =3D false; + pte_t *pte; =20 /* * Check vma->anon_vma to exclude MAP_PRIVATE mappings that @@ -1756,11 +1757,25 @@ static void retract_page_tables(struct address_spac= e *mapping, pgoff_t pgoff) addr, addr + HPAGE_PMD_SIZE); mmu_notifier_invalidate_range_start(&range); =20 + pte =3D pte_offset_map_rw_nolock(mm, pmd, addr, &pgt_pmd, &ptl); + if (!pte) { + mmu_notifier_invalidate_range_end(&range); + continue; + } + pml =3D pmd_lock(mm, pmd); - ptl =3D pte_lockptr(mm, pmd); if (ptl !=3D pml) spin_lock_nested(ptl, SINGLE_DEPTH_NESTING); =20 + if (unlikely(!pmd_same(pgt_pmd, pmdp_get_lockless(pmd)))) { + pte_unmap_unlock(pte, ptl); + if (ptl !=3D pml) + spin_unlock(pml); + mmu_notifier_invalidate_range_end(&range); + continue; + } + pte_unmap(pte); + /* * Huge page lock is still held, so normally the page table * must remain empty; and we have already skipped anon_vma --=20 2.20.1 From nobody Mon Dec 2 13:34:00 2024 Received: from mail-pl1-f171.google.com (mail-pl1-f171.google.com [209.85.214.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E99A7154445 for ; Thu, 31 Oct 2024 08:14:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362463; cv=none; b=GMOiWiQSCzP7C7o1pcaRC07y0ewIcflZrAqVhjLepBIgFePRo1G5njuR/fo2TyhmEndVKDkEnjU6ycDW0XV4WivhQH8hcvm7rQRunhxNi6T6AtG+r1uSUEBCEw4BMQ+v4XtUwcTL0RG8zlDxl6jpPNcOTNB/zxq44RzLUgxEshk= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362463; c=relaxed/simple; bh=31kONglB0KigiLfabCVYk2k8OYeGF43W/Zv8pVM3t+w=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=fAKcUYgzJYchW+1p6sG6wYu3RscWUfwZClUbpiwgA7nTpTU9goOgZSiX13rkTenTWVTvHbjLrjKoT/l6+ifrEvlZeo168wtPd43q6pvqKUfyC/GUGD3GHA+1jopJYx2NqTWvLpDvMNnWSnwUZRRyDTxo8JauttuKTNxIl1CydgM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=ereH/toc; arc=none smtp.client-ip=209.85.214.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="ereH/toc" Received: by mail-pl1-f171.google.com with SMTP id d9443c01a7336-20ce65c8e13so7533785ad.1 for ; Thu, 31 Oct 2024 01:14:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362459; x=1730967259; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=0uS9zjnEttUyVTWzhhVIHAdfZYrGflmgRk9QlOwbpbs=; b=ereH/tocTjuPaRa1PpXM6qI4zC3IDSxeXKVqrInLS9IjoLAfaFbGY6SL4Gy8V5tD3w 0uN+zsnj9GjdMK7JhJroM7A1Ln8KroVrmrbEPhUQrDImsOrxY0zvUtuXsbISmOvN5RQx 7yoFENx+xRtoCjth3dSOrEjXcx1lEt7tHzdU4vtde25wCBWIOprYzxuKErCe6xf99dJP wQQS+/Rv2YRDYzEPHVUPAVF43inuSLA+1N8nFSZKaba5mtwBDV/7rwnOuIGTw9narcst ZoFc62n0B/CI2BrZ7w/dedOSSJG5Kg7QwZ+jj8LxS0wLogTUOKMOeQqMEY8YDnV/Xpfs cf5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362459; x=1730967259; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=0uS9zjnEttUyVTWzhhVIHAdfZYrGflmgRk9QlOwbpbs=; b=vcOfsmerVEK9VPShR9FWxUEUxIEsBH8P3ndrP3uMswcdZNBiyFQ4GNA3+oh7apxoLp IjB0EwEG/VFvGaLSdW8LOBB8l7wedINXSxXFWX4y51sh7kIevnetlWi2TvyaE5yek4lQ sFEiQqVWJ29tm7IS3ngRGE1Yu9sqtwZz/4afvHTby99LA1tAdr31o9rQaXg1gBMJjdzU Mol29TAe3wHV8VitgNW1hUAIpckZGxrjI091ys5HMe6DWrK4iXy7WmTXPWSlk2W/h2A0 PyMWA2q3+mY4JzBgoS6E9J0m8f1Y2+WTe4DsizEKteY8bSEFLo7WM7+5fwGyIwhaJgxf 2rag== X-Forwarded-Encrypted: i=1; AJvYcCVle4+3WN6xND6eDv0OchgCrgBEJNx6/fE6Z9EGzZx35BKKZA+QoXS1RiNo3rG50qt5z1kdFKViShlRSvk=@vger.kernel.org X-Gm-Message-State: AOJu0Yzf51ohMn0TPbx/2slZDMPaJkqy8EHZ20oHpPrknQ0bB9AwTGxE h8FTl84xobjZeFfCIeXIKYluGrA5SL8y92wo5TrU+1PHyypis0vFpUA+1GZt57o= X-Google-Smtp-Source: AGHT+IHliHONlLjM71G0WcLE2jZg2du2F6KU7SIJsE29Lnbx/3IrRdww7bJdFluVjjQ6fAFe8KWB2A== X-Received: by 2002:a17:90b:53c3:b0:2e2:d859:1603 with SMTP id 98e67ed59e1d1-2e93c1d39b0mr2658162a91.25.1730362459134; Thu, 31 Oct 2024 01:14:19 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 01:14:18 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 2/7] mm: introduce zap_nonpresent_ptes() Date: Thu, 31 Oct 2024 16:13:18 +0800 Message-Id: <34ef26a763af7b2d8dee707b6f60ff85edda345f.1730360798.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: Precedence: bulk X-Mailing-List: 
linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Similar to zap_present_ptes(), let's introduce zap_nonpresent_ptes() to handle non-present ptes, which can improve code readability. No functional change. Signed-off-by: Qi Zheng Acked-by: David Hildenbrand Reviewed-by: Jann Horn --- mm/memory.c | 136 ++++++++++++++++++++++++++++------------------------ 1 file changed, 73 insertions(+), 63 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index 209885a4134f7..bd9ebe0f4471f 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1587,6 +1587,76 @@ static inline int zap_present_ptes(struct mmu_gather= *tlb, return 1; } =20 +static inline int zap_nonpresent_ptes(struct mmu_gather *tlb, + struct vm_area_struct *vma, pte_t *pte, pte_t ptent, + unsigned int max_nr, unsigned long addr, + struct zap_details *details, int *rss) +{ + swp_entry_t entry; + int nr =3D 1; + + entry =3D pte_to_swp_entry(ptent); + if (is_device_private_entry(entry) || + is_device_exclusive_entry(entry)) { + struct page *page =3D pfn_swap_entry_to_page(entry); + struct folio *folio =3D page_folio(page); + + if (unlikely(!should_zap_folio(details, folio))) + return 1; + /* + * Both device private/exclusive mappings should only + * work with anonymous page so far, so we don't need to + * consider uffd-wp bit when zap. For more information, + * see zap_install_uffd_wp_if_needed(). 
+ */ + WARN_ON_ONCE(!vma_is_anonymous(vma)); + rss[mm_counter(folio)]--; + if (is_device_private_entry(entry)) + folio_remove_rmap_pte(folio, page, vma); + folio_put(folio); + } else if (!non_swap_entry(entry)) { + /* Genuine swap entries, hence a private anon pages */ + if (!should_zap_cows(details)) + return 1; + + nr =3D swap_pte_batch(pte, max_nr, ptent); + rss[MM_SWAPENTS] -=3D nr; + free_swap_and_cache_nr(entry, nr); + } else if (is_migration_entry(entry)) { + struct folio *folio =3D pfn_swap_entry_folio(entry); + + if (!should_zap_folio(details, folio)) + return 1; + rss[mm_counter(folio)]--; + } else if (pte_marker_entry_uffd_wp(entry)) { + /* + * For anon: always drop the marker; for file: only + * drop the marker if explicitly requested. + */ + if (!vma_is_anonymous(vma) && !zap_drop_markers(details)) + return 1; + } else if (is_guard_swp_entry(entry)) { + /* + * Ordinary zapping should not remove guard PTE + * markers. Only do so if we should remove PTE markers + * in general. 
+ */ + if (!zap_drop_markers(details)) + return 1; + } else if (is_hwpoison_entry(entry) || is_poisoned_swp_entry(entry)) { + if (!should_zap_cows(details)) + return 1; + } else { + /* We should have covered all the swap entry types */ + pr_alert("unrecognized swap entry 0x%lx\n", entry.val); + WARN_ON_ONCE(1); + } + clear_not_present_full_ptes(vma->vm_mm, addr, pte, nr, tlb->fullmm); + zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent); + + return nr; +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1598,7 +1668,6 @@ static unsigned long zap_pte_range(struct mmu_gather = *tlb, spinlock_t *ptl; pte_t *start_pte; pte_t *pte; - swp_entry_t entry; int nr; =20 tlb_change_page_size(tlb, PAGE_SIZE); @@ -1611,8 +1680,6 @@ static unsigned long zap_pte_range(struct mmu_gather = *tlb, arch_enter_lazy_mmu_mode(); do { pte_t ptent =3D ptep_get(pte); - struct folio *folio; - struct page *page; int max_nr; =20 nr =3D 1; @@ -1622,8 +1689,8 @@ static unsigned long zap_pte_range(struct mmu_gather = *tlb, if (need_resched()) break; =20 + max_nr =3D (end - addr) / PAGE_SIZE; if (pte_present(ptent)) { - max_nr =3D (end - addr) / PAGE_SIZE; nr =3D zap_present_ptes(tlb, vma, pte, ptent, max_nr, addr, details, rss, &force_flush, &force_break); @@ -1631,67 +1698,10 @@ static unsigned long zap_pte_range(struct mmu_gathe= r *tlb, addr +=3D nr * PAGE_SIZE; break; } - continue; - } - - entry =3D pte_to_swp_entry(ptent); - if (is_device_private_entry(entry) || - is_device_exclusive_entry(entry)) { - page =3D pfn_swap_entry_to_page(entry); - folio =3D page_folio(page); - if (unlikely(!should_zap_folio(details, folio))) - continue; - /* - * Both device private/exclusive mappings should only - * work with anonymous page so far, so we don't need to - * consider uffd-wp bit when zap. For more information, - * see zap_install_uffd_wp_if_needed(). 
- */ - WARN_ON_ONCE(!vma_is_anonymous(vma)); - rss[mm_counter(folio)]--; - if (is_device_private_entry(entry)) - folio_remove_rmap_pte(folio, page, vma); - folio_put(folio); - } else if (!non_swap_entry(entry)) { - max_nr =3D (end - addr) / PAGE_SIZE; - nr =3D swap_pte_batch(pte, max_nr, ptent); - /* Genuine swap entries, hence a private anon pages */ - if (!should_zap_cows(details)) - continue; - rss[MM_SWAPENTS] -=3D nr; - free_swap_and_cache_nr(entry, nr); - } else if (is_migration_entry(entry)) { - folio =3D pfn_swap_entry_folio(entry); - if (!should_zap_folio(details, folio)) - continue; - rss[mm_counter(folio)]--; - } else if (pte_marker_entry_uffd_wp(entry)) { - /* - * For anon: always drop the marker; for file: only - * drop the marker if explicitly requested. - */ - if (!vma_is_anonymous(vma) && - !zap_drop_markers(details)) - continue; - } else if (is_guard_swp_entry(entry)) { - /* - * Ordinary zapping should not remove guard PTE - * markers. Only do so if we should remove PTE markers - * in general. 
- */ - if (!zap_drop_markers(details)) - continue; - } else if (is_hwpoison_entry(entry) || - is_poisoned_swp_entry(entry)) { - if (!should_zap_cows(details)) - continue; } else { - /* We should have covered all the swap entry types */ - pr_alert("unrecognized swap entry 0x%lx\n", entry.val); - WARN_ON_ONCE(1); + nr =3D zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, + addr, details, rss); } - clear_not_present_full_ptes(mm, addr, pte, nr, tlb->fullmm); - zap_install_uffd_wp_if_needed(vma, addr, pte, nr, details, ptent); } while (pte +=3D nr, addr +=3D PAGE_SIZE * nr, addr !=3D end); =20 add_mm_rss_vec(mm, rss); --=20 2.20.1 From nobody Mon Dec 2 13:34:00 2024 Received: from mail-pj1-f52.google.com (mail-pj1-f52.google.com [209.85.216.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5D5B5187323 for ; Thu, 31 Oct 2024 08:14:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362468; cv=none; b=R/+XYFdLBo3nkOY/mnlGFzAKxwhf3nrs1x7J+Lh9lFJ6LtNuMSfYB6ARh+IZhQ5biKiUd4uR7lHRUIFNG5sy6aRoxVpBYcZd1LY93KhLBalI6DGFgYFRYY6hxk+33zEU69dbyOOe+Leh4cFLjreoC8kwvJbzJvQqAKkH6YlFPGU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362468; c=relaxed/simple; bh=4i8PRUnaVqGvMO4RXt6asLg2xN9z/UY9Zi1Qe/JX+bs=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=JcL+K8Zp9r+nIwbfG58tv63vgfhRWUxzDqrxnUHKG4n99j4HhoCzLuVh0Vq8WFZhQhFl08lc2QEoxywm1OcMiYWFxQrLS+laF96AvtYHv1YVHxJEey8YzIRVmCLvliEmUuyA+EoKpOI0O3pdv2/HR6bp1t+eYzVlT/lZ7D4nqag= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=Vadr0rah; arc=none 
smtp.client-ip=209.85.216.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="Vadr0rah" Received: by mail-pj1-f52.google.com with SMTP id 98e67ed59e1d1-2e2ed2230d8so512503a91.0 for ; Thu, 31 Oct 2024 01:14:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362466; x=1730967266; darn=vger.kernel.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=zhaSq+JiwY2eXMq9iHjNOMwdUtMvBCWvXjuj5WTX7eI=; b=Vadr0rahN3LtIc/4HFmONWQ0dawrkUofE6EaJ4NdXKnzJxbV+7CxSloIGBSVNACEZc ApDc8rsiyWgummM9XS4y72DPIEguyIwr8ycTILJ98oPZk6/xihhryuVEoZQ+/s/x597y VtORjERIs6rrghtzgaypMIZYJHgvTdMn918ger6GQ7FnD5VUsH9p1VuGckfT4OixDoEa we7FxaHCr7DQCn6H6xaWu6rNwABt41sBecOGaw3Wq1ZTzSGk8ljD5Emq1v/QkjPXjrCG +filO8FgmsSTPLbofejQKss47TBRqEC6APpJuLgq8w059XewB9X54g9MvxfOJbTcS61r AEPA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362466; x=1730967266; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=zhaSq+JiwY2eXMq9iHjNOMwdUtMvBCWvXjuj5WTX7eI=; b=g+nrrN3jzYkG5jrdtjFTO10skJLYm/cfev+B92WKPTgeWfQKjEcDu8r4p9cEngkM0Z D/xKWTgc9ib9e95SzoMNvLBDs3QSOCsKCv6SOs1dbrCh/qsZYpAjaeVfBABbBAAtOe9x /KV24YRwwbyh+anIHsbWCYQuMBI8uiBJful7896caZq1ATBhfozRArAN4UMmBG1CKRQ5 AMhMoF55meSR/PNWmaTtNlpbSU9KGTjCzTiVoNFp8PZiKt5LTu+bCecRJazboEDuhQFI v6AxU5tpY1SfRYgRM+yH85p3zjt/1HbutLMkfZV5JZDHVGMDXaReC8zWlfmHntlU2KvX B5xw== X-Forwarded-Encrypted: i=1; AJvYcCWuBNMbglO2W94UH1Cr10dgRRggaCegHWEIZOowJ9HFLm71TH4fzOL5xW1V3GdIgPu/gGll1as3DfsEFvE=@vger.kernel.org 
X-Gm-Message-State: AOJu0YwsBdC4IxVyB3XaBZc/pqT5QKJbaXYFK9Nu2IaG59zgXRdLOwwb zvFCAdlb3YzPoIGxqdIF4bfCpFxpFBGKWm6fVS+X3EYNdcBSRdBQ6nha6IbMW7U= X-Google-Smtp-Source: AGHT+IHK1eQB1HIOVcDML9/U+qgad2lrGxmYCIjO8Kr0cktQfTqjB9fcEN6v3D+sRskXXQsEGQyaZA== X-Received: by 2002:a17:90a:bc83:b0:2e2:d17e:1ef7 with SMTP id 98e67ed59e1d1-2e8f1054a67mr19337853a91.3.1730362465831; Thu, 31 Oct 2024 01:14:25 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 01:14:24 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 3/7] mm: introduce do_zap_pte_range() Date: Thu, 31 Oct 2024 16:13:19 +0800 Message-Id: <1639ac32194f2b2590852f410fd3ce3595eb730b.1730360798.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" This commit introduces do_zap_pte_range() to actually zap the PTEs, which will help improve code readability and facilitate secondary checking of the processed PTEs in the future. No functional change. 
Signed-off-by: Qi Zheng Acked-by: David Hildenbrand Reviewed-by: Jann Horn --- mm/memory.c | 45 ++++++++++++++++++++++++++------------------- 1 file changed, 26 insertions(+), 19 deletions(-) diff --git a/mm/memory.c b/mm/memory.c index bd9ebe0f4471f..c1150e62dd073 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1657,6 +1657,27 @@ static inline int zap_nonpresent_ptes(struct mmu_gat= her *tlb, return nr; } =20 +static inline int do_zap_pte_range(struct mmu_gather *tlb, + struct vm_area_struct *vma, pte_t *pte, + unsigned long addr, unsigned long end, + struct zap_details *details, int *rss, + bool *force_flush, bool *force_break) +{ + pte_t ptent =3D ptep_get(pte); + int max_nr =3D (end - addr) / PAGE_SIZE; + + if (pte_none(ptent)) + return 1; + + if (pte_present(ptent)) + return zap_present_ptes(tlb, vma, pte, ptent, max_nr, + addr, details, rss, force_flush, + force_break); + + return zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, addr, + details, rss); +} + static unsigned long zap_pte_range(struct mmu_gather *tlb, struct vm_area_struct *vma, pmd_t *pmd, unsigned long addr, unsigned long end, @@ -1679,28 +1700,14 @@ static unsigned long zap_pte_range(struct mmu_gathe= r *tlb, flush_tlb_batched_pending(mm); arch_enter_lazy_mmu_mode(); do { - pte_t ptent =3D ptep_get(pte); - int max_nr; - - nr =3D 1; - if (pte_none(ptent)) - continue; - if (need_resched()) break; =20 - max_nr =3D (end - addr) / PAGE_SIZE; - if (pte_present(ptent)) { - nr =3D zap_present_ptes(tlb, vma, pte, ptent, max_nr, - addr, details, rss, &force_flush, - &force_break); - if (unlikely(force_break)) { - addr +=3D nr * PAGE_SIZE; - break; - } - } else { - nr =3D zap_nonpresent_ptes(tlb, vma, pte, ptent, max_nr, - addr, details, rss); + nr =3D do_zap_pte_range(tlb, vma, pte, addr, end, details, rss, + &force_flush, &force_break); + if (unlikely(force_break)) { + addr +=3D nr * PAGE_SIZE; + break; } } while (pte +=3D nr, addr +=3D PAGE_SIZE * nr, addr !=3D end); =20 --=20 2.20.1 From nobody Mon 
Dec 2 13:34:00 2024 Received: from mail-pj1-f48.google.com (mail-pj1-f48.google.com [209.85.216.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 094F6187323 for ; Thu, 31 Oct 2024 08:14:32 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362474; cv=none; b=MZXrkkWQ4eY0MA0IP42Pr/PSC98rqTGjl5YR7BIEfgKZOGfcnE/AujNgw/XiNa7M0Wz6hAMme/1VwJdSW8ngefBFTlEzeKuRHIzXWSCjbn1miJsOs3hLBghm1xfo3tLsvm3K9ksGKL+uzmi6C6njadhGGkJ138/EozWdwWIMFU0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362474; c=relaxed/simple; bh=8PV9Chaa8T7WeXJAYQ/SP4YBbi1NrEJGvTUgFb1bqDk=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=cZSrXl7mbe2hQ0Jdh54OyvAeYCXXQ/PFO6CDUuBI6WoPoff/S6nB5hvN0CPrUGFjX/0HkpLxP4kLCouhoXwCIrolprQxCgCjAhgOku1ktziY9iB0+xOAyva/auL59to87o4I1xvOVIefBClRXoHnnINiKVORRRQhwT+cN9eu4D4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b=fsTeXRU0; arc=none smtp.client-ip=209.85.216.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=bytedance.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=bytedance.com header.i=@bytedance.com header.b="fsTeXRU0" Received: by mail-pj1-f48.google.com with SMTP id 98e67ed59e1d1-2e2e050b1c3so1354736a91.0 for ; Thu, 31 Oct 2024 01:14:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bytedance.com; s=google; t=1730362472; x=1730967272; darn=vger.kernel.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=PXIV/VGyywoEEnyyPnCjvRn11K4iZ53lY17h1S4nRMU=; b=fsTeXRU0cMHgMtYycW0HWy0X3ztgIyIyYEF2T1G+19xVLL6WslEewpYkwNWEt0kAoT T/BMg7N1uDifbMHG77UIlKmfFnC1ChDSZ6wgEM6q4ReE+VTinr355Kv356APspUwbjpK 2650id3cOCewMgSxGzr3GgYEylQc3RTvvcNNDXDxStV5Ba6Mi/z1RawM4cr6AlWmPRfF eOsvzTh+Vibi/dKAs8Ulo5OiPDSa+8i4n8xMAKF8JpDOH0vbw8ZNbQCQyOVUk/W7ni0W dqLFcXpJS2WetvDAyXNvgN5piEDrxhDUghuLG4ADy4CjM3nibrmDL2ofXZj/j0yL2/Nv Jp1Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730362472; x=1730967272; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=PXIV/VGyywoEEnyyPnCjvRn11K4iZ53lY17h1S4nRMU=; b=J03siAjM86Gap5HqDbcn6t8ibLFPhlnrlg5+n6lH2oxrRPn8uKZL6WgUxH4PKTCpBZ AO4ul66Ed2gNheOuOgKC/gY1acGlTkLbx2A5qa+ATEVYJerIVt8q5QuR80VnryuDxPX4 B4yuNqyvDZVTXorxX5e0V1L3+pDTOgbd0RC7uKuZfDROGK5Iwn4xtuLz2DDECKHRS3l7 I9qkgGHae27Lq8SUjQOzkq1yyXKxCJ0b2ndNuPA+G5XkGNfJ+XP5ahjhzXTroZpScTur 4IFD88nf6feFLOox7MAAyrbian2NiSm+GjwcaG2iOTdP1qtYXOkG5vz19/FxxIGtCbKc iXPA== X-Forwarded-Encrypted: i=1; AJvYcCXNjlWZbFpN9r+rxQ6sYm9nEFVq2VICI0Rylk1P3cMUvRVWSk97enP8UbXuVugqX7hnTYdUdMXTW2vN8WQ=@vger.kernel.org X-Gm-Message-State: AOJu0Yy5eiaaN62p2w1h69Vc1ZurnpQo+tRqtuix4VflxmJHVODyI17C BWg00aNyHkXNNoeX9Wsu4e1OSp2iLzcuQ2cOmp41HEQLeobLEeFqrjzalHRFqkY= X-Google-Smtp-Source: AGHT+IHX23XnGunJsTXB9lUCwZL9x2plTKY2+aB5lG85OOgVbjA/C6wsUFYSS/R0cYtYXLJhUsQsBA== X-Received: by 2002:a17:90b:1b4a:b0:2e9:20d8:414c with SMTP id 98e67ed59e1d1-2e93e0139a2mr2423394a91.5.1730362472334; Thu, 31 Oct 2024 01:14:32 -0700 (PDT) Received: from C02DW0BEMD6R.bytedance.net ([203.208.167.149]) by smtp.gmail.com with ESMTPSA id 98e67ed59e1d1-2e9201b2586sm5163996a91.0.2024.10.31.01.14.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 31 Oct 2024 
01:14:31 -0700 (PDT) From: Qi Zheng To: david@redhat.com, jannh@google.com, hughd@google.com, willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev, vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com, rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org, Qi Zheng Subject: [PATCH v2 4/7] mm: make zap_pte_range() handle full within-PMD range Date: Thu, 31 Oct 2024 16:13:20 +0800 Message-Id: <63c92f1e2a2fa6267490ab9fad5b090316b39b98.1730360798.git.zhengqi.arch@bytedance.com> X-Mailer: git-send-email 2.24.3 (Apple Git-128) In-Reply-To: References: Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" In preparation for reclaiming empty PTE pages, this commit first makes zap_pte_range() handle the full within-PMD range, so that we can more easily detect and free PTE pages in this function in subsequent commits. 
Signed-off-by: Qi Zheng Reviewed-by: Jann Horn --- mm/memory.c | 8 ++++++++ 1 file changed, 8 insertions(+) diff --git a/mm/memory.c b/mm/memory.c index c1150e62dd073..002aa4f454fa0 100644 --- a/mm/memory.c +++ b/mm/memory.c @@ -1691,6 +1691,7 @@ static unsigned long zap_pte_range(struct mmu_gather = *tlb, pte_t *pte; int nr; =20 +retry: tlb_change_page_size(tlb, PAGE_SIZE); init_rss_vec(rss); start_pte =3D pte =3D pte_offset_map_lock(mm, pmd, addr, &ptl); @@ -1730,6 +1731,13 @@ static unsigned long zap_pte_range(struct mmu_gather= *tlb, if (force_flush) tlb_flush_mmu(tlb); =20 + if (addr !=3D end) { + cond_resched(); + force_flush =3D false; + force_break =3D false; + goto retry; + } + return addr; } =20 --=20 2.20.1 From nobody Mon Dec 2 13:34:00 2024 Received: from mail-pj1-f43.google.com (mail-pj1-f43.google.com [209.85.216.43]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 542AB156236 for ; Thu, 31 Oct 2024 08:14:39 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.43 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362481; cv=none; b=KU6aMKMxHkXjDmlR2pvhIt8PmKeArsr0Qx7OWsQrTS3B0nmlZ0sZCoVCAOvIvJx6falCl/l4WV+qlBrkvoU0O78GZPqMgCa5O9V8NsgXY6diaNeCRPA82XXFC7WNArDu6kmPAK4Crcx7ex3/3lTB6t/huBG1yYiGcT1hbaOuHbs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1730362481; c=relaxed/simple; bh=rk0n6Li3l9bZTxRdjRis1jk0+ha5cdrGYnE5BpDgaeU=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=DSjooZICdn121ejeDkkTbq0vh1gx1oxoWVRlQEaueck/cIX6ZDpPmAeoJxiuPkVxcN/x/0v4/zQmdEK0fyskJDVRbOU5sEcLP89cqukqSbEVW9DNLwzUp3Kz9gNBueXuLLv40c1ezLBJIjA4SdCqkIc66K/R4mdEmXyB1pdYXkI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=bytedance.com; spf=pass smtp.mailfrom=bytedance.com; dkim=pass 
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
	willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev,
	vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com,
	rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
	Qi Zheng
Subject: [PATCH v2 5/7] mm: pgtable: try to reclaim empty PTE page in madvise(MADV_DONTNEED)
Date: Thu, 31 Oct 2024 16:13:21 +0800

In pursuit of high performance, applications commonly use
high-performance user-mode memory allocators such as jemalloc or
tcmalloc. These allocators use madvise(MADV_DONTNEED or MADV_FREE) to
release physical memory, but neither MADV_DONTNEED nor MADV_FREE releases
page table memory, which can lead to huge page table memory usage.
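
The allocator pattern described above can be approximated from userspace.
The following is only a sketch under stated assumptions, not part of the
patch: the function names are hypothetical, the 2 MiB chunk size assumes
x86-64 with 4 KiB pages (one PTE page maps 2 MiB), and MAP_NORESERVE is
used so that a large mapping does not need to be backed up front. It
touches each chunk, releases it with madvise(MADV_DONTNEED), and samples
VmPTE from /proc/self/status.

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

/* Assumption: 4 KiB pages and 512 PTEs per PTE page, so 2 MiB per PTE page. */
#define CHUNK_SIZE (2UL << 20)

/* Return VmPTE (in kB) from /proc/self/status, or -1 if it cannot be read. */
long read_vmpte_kb(void)
{
	FILE *f = fopen("/proc/self/status", "r");
	char line[256];
	long kb = -1;

	if (!f)
		return -1;
	while (fgets(line, sizeof(line), f)) {
		if (sscanf(line, "VmPTE: %ld kB", &kb) == 1)
			break;
	}
	fclose(f);
	return kb;
}

/*
 * Map `size` bytes of anonymous memory, then touch and immediately release
 * each 2 MiB chunk. If `vmpte_kb` is non-NULL it receives VmPTE sampled
 * while the mapping still exists. Returns 0 on success, -1 on failure.
 */
int touch_and_dontneed(size_t size, long *vmpte_kb)
{
	char *base = mmap(NULL, size, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
	size_t off;
	int ret = 0;

	if (base == MAP_FAILED)
		return -1;

	for (off = 0; off + CHUNK_SIZE <= size; off += CHUNK_SIZE) {
		/* Fault in the chunk, which populates its page tables. */
		memset(base + off, 1, CHUNK_SIZE);
		/* Drop the physical pages; the virtual range stays mapped. */
		if (madvise(base + off, CHUNK_SIZE, MADV_DONTNEED)) {
			ret = -1;
			break;
		}
	}

	if (vmpte_kb)
		*vmpte_kb = read_vmpte_kb();
	munmap(base, size);
	return ret;
}
```

Calling touch_and_dontneed() with a large size and comparing the sampled
VmPTE across kernels is one way to observe whether PTE pages are being
retained. Transparent huge pages may change the picture, since a
THP-backed chunk does not need a PTE page at all.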

The following is a memory usage snapshot of one process, taken from one
of our servers:

        VIRT:  55t
        RES:   590g
        VmPTE: 110g

In this case, most of the page table entries are empty. A PTE page in
which all entries are empty can be freed back to the system for others
to use.

As a first step, this commit synchronously frees empty PTE pages in the
madvise(MADV_DONTNEED) case. Empty PTE pages are detected and freed in
zap_pte_range(), and zap_details.reclaim_pt is added to exclude cases
other than madvise(MADV_DONTNEED).

Once an empty PTE page is detected, we first try to take the pmd lock
while still holding the pte lock. If that succeeds, we clear the pmd
entry directly (fast path). Otherwise, we wait until the pte lock is
released, then re-take the pmd and pte locks and loop PTRS_PER_PTE times,
checking pte_none() to re-detect whether the PTE page is empty, and free
it if so (slow path).

For other cases such as madvise(MADV_FREE), consider scanning and freeing
empty PTE pages asynchronously in the future.
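
The fast path and slow path described above can be modelled in userspace.
What follows is a toy, single-table model and not the kernel code: every
model_* name is hypothetical, the locks are C11 atomic_flag spinlocks
standing in for the pte and pmd spinlocks, the whole table is zapped at
once, and the deferred freeing of the page itself is left out. It only
illustrates the lock ordering: trylock the pmd lock under the pte lock,
and otherwise fall back to taking pmd then pte and rescanning.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_PTRS_PER_PTE 512	/* assumption: x86-64 with 4 KiB pages */

/* Toy stand-ins; these are not kernel types. */
struct model_pte_page {
	atomic_flag ptl;				/* models the pte lock */
	unsigned long entries[MODEL_PTRS_PER_PTE];	/* 0 means "none" */
};

struct model_pmd {
	atomic_flag pml;			/* models the pmd lock */
	struct model_pte_page *table;		/* NULL once the entry is cleared */
};

static bool model_trylock(atomic_flag *l)
{
	return !atomic_flag_test_and_set_explicit(l, memory_order_acquire);
}

static void model_lock(atomic_flag *l)
{
	while (!model_trylock(l))
		;
}

static void model_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

void model_init(struct model_pmd *pmd, struct model_pte_page *pt)
{
	atomic_flag_clear(&pmd->pml);
	atomic_flag_clear(&pt->ptl);
	for (size_t i = 0; i < MODEL_PTRS_PER_PTE; i++)
		pt->entries[i] = 0;
	pmd->table = pt;
}

static bool model_page_is_empty(const struct model_pte_page *pt)
{
	for (size_t i = 0; i < MODEL_PTRS_PER_PTE; i++) {
		if (pt->entries[i] != 0)
			return false;
	}
	return true;
}

/* Fast path: caller holds pt->ptl; only *try* to take the pmd lock. */
bool model_try_clear_pmd(struct model_pmd *pmd)
{
	if (!model_trylock(&pmd->pml))
		return false;
	pmd->table = NULL;
	model_unlock(&pmd->pml);
	return true;
}

/* Slow path: no locks held; take pmd then pte lock and rescan the table. */
bool model_try_to_free(struct model_pmd *pmd)
{
	struct model_pte_page *pt;
	bool cleared = false;

	model_lock(&pmd->pml);
	pt = pmd->table;
	if (pt) {
		model_lock(&pt->ptl);
		if (model_page_is_empty(pt)) {
			pmd->table = NULL;
			cleared = true;
		}
		model_unlock(&pt->ptl);
	}
	model_unlock(&pmd->pml);
	return cleared;
}

/* Zap every entry, then try the fast path and fall back to the slow path. */
bool model_zap_and_reclaim(struct model_pmd *pmd)
{
	struct model_pte_page *pt = pmd->table;
	bool direct;

	if (!pt)
		return false;
	model_lock(&pt->ptl);
	for (size_t i = 0; i < MODEL_PTRS_PER_PTE; i++)
		pt->entries[i] = 0;
	direct = model_try_clear_pmd(pmd);
	model_unlock(&pt->ptl);
	return direct || model_try_to_free(pmd);
}
```

The trylock matters because the usual order is pmd lock first, then pte
lock; blocking on the pmd lock while holding the pte lock would invert
that order. The rescan in the slow path is needed because entries may
have been repopulated after the pte lock was dropped.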

The following code snippet shows the effect of the optimization:

        mmap 50G
        while (1) {
                for (; i < 1024 * 25; i++) {
                        touch 2M memory
                        madvise MADV_DONTNEED 2M
                }
        }

As we can see, the memory usage of VmPTE is reduced:

                        before                          after
VIRT                   50.0 GB                        50.0 GB
RES                     3.1 MB                         3.1 MB
VmPTE                102640 KB                         240 KB

Signed-off-by: Qi Zheng
---
 include/linux/mm.h |  1 +
 mm/Kconfig         | 15 ++++++++++
 mm/Makefile        |  1 +
 mm/internal.h      | 23 ++++++++++++++++
 mm/madvise.c       |  4 ++-
 mm/memory.c        | 45 +++++++++++++++++++++++++++++-
 mm/pt_reclaim.c    | 68 ++++++++++++++++++++++++++++++++++++++++++++++
 7 files changed, 155 insertions(+), 2 deletions(-)
 create mode 100644 mm/pt_reclaim.c

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 3e4bb43035953..ce3936590fe72 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2319,6 +2319,7 @@ extern void pagefault_out_of_memory(void);
 struct zap_details {
 	struct folio *single_folio;	/* Locked folio to be unmapped */
 	bool even_cows;			/* Zap COWed private pages too? */
+	bool reclaim_pt;		/* Need reclaim page tables? */
 	zap_flags_t zap_flags;		/* Extra flags for zapping */
 };
 
diff --git a/mm/Kconfig b/mm/Kconfig
index 84000b0168086..681909e0a9fa3 100644
--- a/mm/Kconfig
+++ b/mm/Kconfig
@@ -1301,6 +1301,21 @@ config ARCH_HAS_USER_SHADOW_STACK
 	  The architecture has hardware support for userspace shadow call
 	  stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
 
+config ARCH_SUPPORTS_PT_RECLAIM
+	def_bool n
+
+config PT_RECLAIM
+	bool "reclaim empty user page table pages"
+	default y
+	depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
+	select MMU_GATHER_RCU_TABLE_FREE
+	help
+	  Try to reclaim empty user page table pages in paths other than munmap
+	  and exit_mmap path.
+
+	  Note: now only empty user PTE page table pages will be reclaimed.
+
+
 source "mm/damon/Kconfig"
 
 endmenu
diff --git a/mm/Makefile b/mm/Makefile
index d5639b0361663..9d816323d247a 100644
--- a/mm/Makefile
+++ b/mm/Makefile
@@ -145,3 +145,4 @@ obj-$(CONFIG_GENERIC_IOREMAP) += ioremap.o
 obj-$(CONFIG_SHRINKER_DEBUG) += shrinker_debug.o
 obj-$(CONFIG_EXECMEM) += execmem.o
 obj-$(CONFIG_TMPFS_QUOTA) += shmem_quota.o
+obj-$(CONFIG_PT_RECLAIM) += pt_reclaim.o
diff --git a/mm/internal.h b/mm/internal.h
index d5b93c5b63648..7aba395a9940f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -1508,4 +1508,27 @@ int walk_page_range_mm(struct mm_struct *mm, unsigned long start,
 		unsigned long end, const struct mm_walk_ops *ops,
 		void *private);
 
+#ifdef CONFIG_PT_RECLAIM
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval);
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval);
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb);
+#else
+static inline bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd,
+					 pmd_t *pmdval)
+{
+	return false;
+}
+static inline void free_pte(struct mm_struct *mm, unsigned long addr,
+			    struct mmu_gather *tlb, pmd_t pmdval)
+{
+}
+static inline void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd,
+				   unsigned long addr, struct mmu_gather *tlb)
+{
+}
+#endif /* CONFIG_PT_RECLAIM */
+
+
 #endif /* __MM_INTERNAL_H */
diff --git a/mm/madvise.c b/mm/madvise.c
index 0ceae57da7dad..ee88652761d45 100644
--- a/mm/madvise.c
+++ b/mm/madvise.c
@@ -851,7 +851,9 @@ static int madvise_free_single_vma(struct vm_area_struct *vma,
 static long madvise_dontneed_single_vma(struct vm_area_struct *vma,
 					unsigned long start, unsigned long end)
 {
-	zap_page_range_single(vma, start, end - start, NULL);
+	struct zap_details details = {.reclaim_pt = true,};
+
+	zap_page_range_single(vma, start, end - start, &details);
 	return 0;
 }
 
diff --git a/mm/memory.c b/mm/memory.c
index 002aa4f454fa0..c4a8c18fbcfd7 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1436,7 +1436,7 @@ copy_page_range(struct vm_area_struct *dst_vma, struct vm_area_struct *src_vma)
 static inline bool should_zap_cows(struct zap_details *details)
 {
 	/* By default, zap all pages */
-	if (!details)
+	if (!details || details->reclaim_pt)
 		return true;
 
 	/* Or, we zap COWed pages only if the caller wants to */
@@ -1678,6 +1678,30 @@ static inline int do_zap_pte_range(struct mmu_gather *tlb,
 					   details, rss);
 }
 
+static inline int count_pte_none(pte_t *pte, int nr)
+{
+	int none_nr = 0;
+
+	/*
+	 * If PTE_MARKER_UFFD_WP is enabled, the uffd-wp PTEs may be
+	 * re-installed, so we need to check pte_none() one by one.
+	 * Otherwise, checking a single PTE in a batch is sufficient.
+	 */
+#ifdef CONFIG_PTE_MARKER_UFFD_WP
+	for (;;) {
+		if (pte_none(ptep_get(pte)))
+			none_nr++;
+		if (--nr == 0)
+			break;
+		pte++;
+	}
+#else
+	if (pte_none(ptep_get(pte)))
+		none_nr = nr;
+#endif
+	return none_nr;
+}
+
 static unsigned long zap_pte_range(struct mmu_gather *tlb,
 				struct vm_area_struct *vma, pmd_t *pmd,
 				unsigned long addr, unsigned long end,
@@ -1689,8 +1713,16 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 	spinlock_t *ptl;
 	pte_t *start_pte;
 	pte_t *pte;
+	pmd_t pmdval;
+	bool can_reclaim_pt = false;
+	bool direct_reclaim = false;
+	unsigned long start = addr;
+	int none_nr = 0;
 	int nr;
 
+	if (details && details->reclaim_pt && (end - start >= PMD_SIZE))
+		can_reclaim_pt = true;
+
 retry:
 	tlb_change_page_size(tlb, PAGE_SIZE);
 	init_rss_vec(rss);
@@ -1706,12 +1738,16 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 
 		nr = do_zap_pte_range(tlb, vma, pte, addr, end, details, rss,
 				      &force_flush, &force_break);
+		none_nr += count_pte_none(pte, nr);
 		if (unlikely(force_break)) {
 			addr += nr * PAGE_SIZE;
 			break;
 		}
 	} while (pte += nr, addr += PAGE_SIZE * nr, addr != end);
 
+	if (addr == end && can_reclaim_pt && (none_nr == PTRS_PER_PTE))
+		direct_reclaim = try_get_and_clear_pmd(mm, pmd, &pmdval);
+
 	add_mm_rss_vec(mm, rss);
 	arch_leave_lazy_mmu_mode();
 
@@ -1738,6 +1774,13 @@ static unsigned long zap_pte_range(struct mmu_gather *tlb,
 		goto retry;
 	}
 
+	if (can_reclaim_pt) {
+		if (direct_reclaim)
+			free_pte(mm, start, tlb, pmdval);
+		else
+			try_to_free_pte(mm, pmd, start, tlb);
+	}
+
 	return addr;
 }
 
diff --git a/mm/pt_reclaim.c b/mm/pt_reclaim.c
new file mode 100644
index 0000000000000..fc055da40b615
--- /dev/null
+++ b/mm/pt_reclaim.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0
+#include
+#include
+#include
+
+#include "internal.h"
+
+bool try_get_and_clear_pmd(struct mm_struct *mm, pmd_t *pmd, pmd_t *pmdval)
+{
+	spinlock_t *pml = pmd_lockptr(mm, pmd);
+
+	if (!spin_trylock(pml))
+		return false;
+
+	*pmdval = pmdp_get_lockless(pmd);
+	pmd_clear(pmd);
+	spin_unlock(pml);
+
+	return true;
+}
+
+void free_pte(struct mm_struct *mm, unsigned long addr, struct mmu_gather *tlb,
+	      pmd_t pmdval)
+{
+	pte_free_tlb(tlb, pmd_pgtable(pmdval), addr);
+	mm_dec_nr_ptes(mm);
+}
+
+void try_to_free_pte(struct mm_struct *mm, pmd_t *pmd, unsigned long addr,
+		     struct mmu_gather *tlb)
+{
+	pmd_t pmdval;
+	spinlock_t *pml, *ptl;
+	pte_t *start_pte, *pte;
+	int i;
+
+	start_pte = pte_offset_map_rw_nolock(mm, pmd, addr, &pmdval, &ptl);
+	if (!start_pte)
+		return;
+
+	pml = pmd_lock(mm, pmd);
+	if (ptl != pml)
+		spin_lock_nested(ptl, SINGLE_DEPTH_NESTING);
+
+	if (unlikely(!pmd_same(pmdval, pmdp_get_lockless(pmd))))
+		goto out_ptl;
+
+	/* Check if it is empty PTE page */
+	for (i = 0, pte = start_pte; i < PTRS_PER_PTE; i++, pte++) {
+		if (!pte_none(ptep_get(pte)))
+			goto out_ptl;
+	}
+	pte_unmap(start_pte);
+
+	pmd_clear(pmd);
+
+	if (ptl != pml)
+		spin_unlock(ptl);
+	spin_unlock(pml);
+
+	free_pte(mm, addr, tlb, pmdval);
+
+	return;
+out_ptl:
+	pte_unmap_unlock(start_pte, ptl);
+	if (pml != ptl)
+		spin_unlock(pml);
+}
-- 
2.20.1

From nobody Mon Dec 2 13:34:00 2024
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
	willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev,
	vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com,
	rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
	Qi Zheng
Subject: [PATCH v2 6/7] x86: mm: free page table pages by RCU instead of semi RCU
Date: Thu, 31 Oct 2024 16:13:22 +0800

Currently, if CONFIG_MMU_GATHER_RCU_TABLE_FREE is selected, page table
pages are freed by semi RCU, that is:

 - batch table freeing: asynchronous free by RCU
 - single table freeing: IPI + synchronous free

In this way, the page table can be traversed locklessly by disabling IRQs
in paths such as fast GUP. But this is not enough to free empty PTE page
table pages in paths other than the munmap and exit_mmap paths, because
an IPI cannot be synchronized with rcu_read_lock() in
pte_offset_map{_lock}().

In preparation for supporting reclamation of empty PTE page table pages,
let single tables also be freed by RCU, like batch table freeing. Then we
can also use pte_offset_map() etc. to prevent a PTE page from being
freed.

Like pte_free_defer(), we can also safely use ptdesc->pt_rcu_head to free
the page table pages:

 - The pt_rcu_head is unioned with pt_list and pmd_huge_pte.

 - For pt_list, it is used to manage the PGD page in x86. Fortunately
   tlb_remove_table() will not be used for freeing PGD pages, so it is
   safe to use pt_rcu_head.

 - For pmd_huge_pte, we will do zap_deposited_table() before freeing the
   PMD page, so it is also safe.
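
The idea of freeing a table through a callback head embedded in the
object itself can be sketched in userspace. This is an illustrative model
only: the model_* names are hypothetical, model_call_rcu() merely queues
a callback and model_grace_period_end() runs the queue, which stands in
for (but does not implement) a real RCU grace period. The point it shows
is that the callback recovers the enclosing object from the embedded head
with a container_of-style macro, and that the object stays valid until
the deferred callback runs.

```c
#include <stddef.h>
#include <stdlib.h>

/* Minimal stand-in for a callback head; not the kernel's struct rcu_head. */
struct model_rcu_head {
	struct model_rcu_head *next;
	void (*func)(struct model_rcu_head *head);
};

/* Recover the enclosing object from a pointer to one of its members. */
#define model_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* A toy "page table page" with the callback head embedded in it. */
struct model_table {
	unsigned long entries[512];
	struct model_rcu_head rcu;
};

static struct model_rcu_head *model_pending;	/* queued callbacks */
int model_freed_count;				/* tables freed so far */

/* Queue a callback to run later; stands in for call_rcu(). */
void model_call_rcu(struct model_rcu_head *head,
		    void (*func)(struct model_rcu_head *head))
{
	head->func = func;
	head->next = model_pending;
	model_pending = head;
}

/* Run all queued callbacks; stands in for the end of a grace period. */
void model_grace_period_end(void)
{
	struct model_rcu_head *head = model_pending;

	model_pending = NULL;
	while (head) {
		/* Read ->next first: the callback frees the object. */
		struct model_rcu_head *next = head->next;

		head->func(head);
		head = next;
	}
}

/* Deferred callback: map the head back to its table and free it. */
void model_free_table_cb(struct model_rcu_head *head)
{
	struct model_table *t = model_container_of(head, struct model_table, rcu);

	free(t);
	model_freed_count++;
}

/* Schedule a single table for deferred freeing instead of freeing it now. */
void model_remove_table_one(struct model_table *t)
{
	model_call_rcu(&t->rcu, model_free_table_cb);
}
```

Embedding the head avoids allocating memory on the free path, which is
why it matters that the field being reused is not needed for anything
else at that point.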

Signed-off-by: Qi Zheng
---
 arch/x86/include/asm/tlb.h | 19 +++++++++++++++++++
 arch/x86/kernel/paravirt.c |  7 +++++++
 arch/x86/mm/pgtable.c      | 10 +++++++++-
 mm/mmu_gather.c            |  9 ++++++++-
 4 files changed, 43 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/tlb.h b/arch/x86/include/asm/tlb.h
index 580636cdc257b..e223b53a8b190 100644
--- a/arch/x86/include/asm/tlb.h
+++ b/arch/x86/include/asm/tlb.h
@@ -34,4 +34,23 @@ static inline void __tlb_remove_table(void *table)
 	free_page_and_swap_cache(table);
 }
 
+#ifdef CONFIG_PT_RECLAIM
+static inline void __tlb_remove_table_one_rcu(struct rcu_head *head)
+{
+	struct page *page;
+
+	page = container_of(head, struct page, rcu_head);
+	free_page_and_swap_cache(page);
+}
+
+static inline void __tlb_remove_table_one(void *table)
+{
+	struct page *page;
+
+	page = table;
+	call_rcu(&page->rcu_head, __tlb_remove_table_one_rcu);
+}
+#define __tlb_remove_table_one __tlb_remove_table_one
+#endif /* CONFIG_PT_RECLAIM */
+
 #endif /* _ASM_X86_TLB_H */
diff --git a/arch/x86/kernel/paravirt.c b/arch/x86/kernel/paravirt.c
index fec3815335558..89688921ea62e 100644
--- a/arch/x86/kernel/paravirt.c
+++ b/arch/x86/kernel/paravirt.c
@@ -59,10 +59,17 @@ void __init native_pv_lock_init(void)
 		static_branch_enable(&virt_spin_lock_key);
 }
 
+#ifndef CONFIG_PT_RECLAIM
 static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
+#else
+static void native_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif
 
 struct static_key paravirt_steal_enabled;
 struct static_key paravirt_steal_rq_enabled;
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 5745a354a241c..69a357b15974a 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -19,12 +19,20 @@ EXPORT_SYMBOL(physical_mask);
 #endif
 
 #ifndef CONFIG_PARAVIRT
+#ifndef CONFIG_PT_RECLAIM
 static inline
 void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
 {
 	tlb_remove_page(tlb, table);
 }
-#endif
+#else
+static inline
+void paravirt_tlb_remove_table(struct mmu_gather *tlb, void *table)
+{
+	tlb_remove_table(tlb, table);
+}
+#endif /* !CONFIG_PT_RECLAIM */
+#endif /* !CONFIG_PARAVIRT */
 
 gfp_t __userpte_alloc_gfp = GFP_PGTABLE_USER | PGTABLE_HIGHMEM;
 
diff --git a/mm/mmu_gather.c b/mm/mmu_gather.c
index 99b3e9408aa0f..d948479ca09e6 100644
--- a/mm/mmu_gather.c
+++ b/mm/mmu_gather.c
@@ -311,10 +311,17 @@ static inline void tlb_table_invalidate(struct mmu_gather *tlb)
 	}
 }
 
+#ifndef __tlb_remove_table_one
+static inline void __tlb_remove_table_one(void *table)
+{
+	__tlb_remove_table(table);
+}
+#endif
+
 static void tlb_remove_table_one(void *table)
 {
 	tlb_remove_table_sync_one();
-	__tlb_remove_table(table);
+	__tlb_remove_table_one(table);
 }
 
 static void tlb_table_flush(struct mmu_gather *tlb)
-- 
2.20.1

From nobody Mon Dec 2 13:34:00 2024
From: Qi Zheng
To: david@redhat.com, jannh@google.com, hughd@google.com,
	willy@infradead.org, mgorman@suse.de, muchun.song@linux.dev,
	vbabka@kernel.org, akpm@linux-foundation.org, zokeefe@google.com,
	rientjes@google.com, peterx@redhat.com, catalin.marinas@arm.com
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, x86@kernel.org,
	Qi Zheng
Subject: [PATCH v2 7/7] x86: select ARCH_SUPPORTS_PT_RECLAIM if X86_64
Date: Thu, 31 Oct 2024 16:13:23 +0800
Message-Id: <8ed7cdb1e5fa5e0a357bec9178b8932d596573c9.1730360798.git.zhengqi.arch@bytedance.com>

x86 now fully supports the CONFIG_PT_RECLAIM feature, and reclaiming PTE
pages is profitable only on 64-bit systems, so select
ARCH_SUPPORTS_PT_RECLAIM if X86_64.

Signed-off-by: Qi Zheng
---
 arch/x86/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 1d194fb2f979e..194baed21ae5c 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -320,6 +320,7 @@ config X86
 	select FUNCTION_ALIGNMENT_4B
 	imply IMA_SECURE_AND_OR_TRUSTED_BOOT	if EFI
 	select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
+	select ARCH_SUPPORTS_PT_RECLAIM		if X86_64
 
 config INSTRUCTION_DECODER
 	def_bool y
-- 
2.20.1