From nobody Fri Dec 19 19:09:35 2025
Date: Mon, 5 Aug 2024 20:21:11 -0600
In-Reply-To: <20240806022114.3320543-1-yuzhao@google.com>
References: <20240806022114.3320543-1-yuzhao@google.com>
Message-ID: <20240806022114.3320543-2-yuzhao@google.com>
Subject: [RFC PATCH 1/4] mm: HVO: introduce helper function to update and flush pgtable
From: Yu Zhao
To: Catalin Marinas, Will Deacon
Cc: Andrew Morton, David Rientjes, Douglas Anderson, Frank van der Linden,
 Mark Rutland, Muchun Song, Nanyong Sun, Yang Shi,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Muchun Song, Yu Zhao

From: Nanyong Sun

Add pmd/pte update and TLB flush helper functions for updating page
tables. This refactoring makes it easier for each architecture to
implement its own special logic, in preparation for the arm64
architecture following the necessary break-before-make sequence when
updating page tables.
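For illustration only (not part of this patch): an architecture that needs
extra work around a PTE update can provide its own helper in its arch
headers and define the matching macro, so that the generic fallback
introduced below compiles out. The body here is a placeholder; the real
arm64 implementation comes later in this series.

	/* sketch of an arch override; the helper body is a placeholder */
	#define vmemmap_update_pte vmemmap_update_pte
	static inline void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
	{
		/* arch-specific preparation (e.g. break-before-make) would go here */
		set_pte_at(&init_mm, addr, ptep, pte);
	}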
Signed-off-by: Nanyong Sun
Reviewed-by: Muchun Song
Signed-off-by: Yu Zhao
---
 mm/hugetlb_vmemmap.c | 55 ++++++++++++++++++++++++++++++++++----------
 1 file changed, 43 insertions(+), 12 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 829112b0a914..2dd92e58f304 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -46,6 +46,37 @@ struct vmemmap_remap_walk {
 	unsigned long flags;
 };
 
+#ifndef vmemmap_update_pmd
+static inline void vmemmap_update_pmd(unsigned long addr,
+				      pmd_t *pmdp, pte_t *ptep)
+{
+	pmd_populate_kernel(&init_mm, pmdp, ptep);
+}
+#endif
+
+#ifndef vmemmap_update_pte
+static inline void vmemmap_update_pte(unsigned long addr,
+				      pte_t *ptep, pte_t pte)
+{
+	set_pte_at(&init_mm, addr, ptep, pte);
+}
+#endif
+
+#ifndef vmemmap_flush_tlb_all
+static inline void vmemmap_flush_tlb_all(void)
+{
+	flush_tlb_all();
+}
+#endif
+
+#ifndef vmemmap_flush_tlb_range
+static inline void vmemmap_flush_tlb_range(unsigned long start,
+					   unsigned long end)
+{
+	flush_tlb_kernel_range(start, end);
+}
+#endif
+
 static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 			     struct vmemmap_remap_walk *walk)
 {
@@ -81,9 +112,9 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
-		pmd_populate_kernel(&init_mm, pmd, pgtable);
+		vmemmap_update_pmd(start, pmd, pgtable);
 		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
-			flush_tlb_kernel_range(start, start + PMD_SIZE);
+			vmemmap_flush_tlb_range(start, start + PMD_SIZE);
 	} else {
 		pte_free_kernel(&init_mm, pgtable);
 	}
@@ -171,7 +202,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 		return ret;
 
 	if (walk->remap_pte && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
-		flush_tlb_kernel_range(start, end);
+		vmemmap_flush_tlb_range(start, end);
 
 	return 0;
 }
@@ -220,15 +251,15 @@ static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
 
 		/*
 		 * Makes sure that preceding stores to the page contents from
-		 * vmemmap_remap_free() become visible before the set_pte_at()
-		 * write.
+		 * vmemmap_remap_free() become visible before the
+		 * vmemmap_update_pte() write.
 		 */
 		smp_wmb();
 	}
 
 	entry = mk_pte(walk->reuse_page, pgprot);
 	list_add(&page->lru, walk->vmemmap_pages);
-	set_pte_at(&init_mm, addr, pte, entry);
+	vmemmap_update_pte(addr, pte, entry);
 }
 
@@ -267,10 +298,10 @@ static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
 
 	/*
 	 * Makes sure that preceding stores to the page contents become visible
-	 * before the set_pte_at() write.
+	 * before the vmemmap_update_pte() write.
 	 */
 	smp_wmb();
-	set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot));
+	vmemmap_update_pte(addr, pte, mk_pte(page, pgprot));
 }
 
 /**
@@ -536,7 +567,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	}
 
 	if (restored)
-		flush_tlb_all();
+		vmemmap_flush_tlb_all();
 	if (!ret)
 		ret = restored;
 	return ret;
@@ -664,7 +695,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			break;
 	}
 
-	flush_tlb_all();
+	vmemmap_flush_tlb_all();
 
 	/* avoid writes from page_ref_add_unless() while folding vmemmap */
 	synchronize_rcu();
@@ -684,7 +715,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		 * allowing more vmemmap remaps to occur.
 		 */
 		if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
-			flush_tlb_all();
+			vmemmap_flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
 			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
@@ -692,7 +723,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 		}
 	}
 
-	flush_tlb_all();
+	vmemmap_flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
-- 
2.46.0.rc2.264.g509ed76dc8-goog

From nobody Fri Dec 19 19:09:35 2025
Date: Mon, 5 Aug 2024 20:21:12 -0600
In-Reply-To: <20240806022114.3320543-1-yuzhao@google.com>
References: <20240806022114.3320543-1-yuzhao@google.com>
Message-ID: <20240806022114.3320543-3-yuzhao@google.com>
Subject: [RFC PATCH 2/4] arm64: use IPIs to pause/resume remote CPUs
From: Yu Zhao
To: Catalin Marinas, Will Deacon
Cc: Andrew Morton, David Rientjes, Douglas Anderson, Frank van der Linden,
 Mark Rutland, Muchun Song, Nanyong Sun, Yang Shi,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Yu Zhao

Use pseudo-NMI IPIs to pause remote CPUs for a short period of time, and
then reliably resume them when the local CPU exits critical sections that
preclude the execution of remote CPUs. A typical example of such critical
sections is BBM on kernel PTEs.

HugeTLB Vmemmap Optimization (HVO) on arm64 was disabled by commit
060a2c92d1b6 ("arm64: mm: hugetlb: Disable HUGETLB_PAGE_OPTIMIZE_VMEMMAP")
due to the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the new
  valid PTE). However, such sequence is not possible since the vmemmap
  may be concurrently accessed by the kernel.

Supporting BBM on kernel PTEs is one of the approaches that can
potentially make arm64 support HVO.
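For illustration only (not part of this patch), the intended usage is
roughly as follows; the critical section is a placeholder, and the
surrounding calls mirror the lockdep assertions in pause_remote_cpus()
and resume_remote_cpus():

	/* sketch of intended usage */
	cpus_read_lock();	/* pause_remote_cpus() asserts the hotplug lock is held */
	preempt_disable();	/* ... and that preemption is disabled */

	pause_remote_cpus();	/* park all other online CPUs in a pseudo-NMI handler */
	/* critical section, e.g. break-before-make on a kernel PTE */
	resume_remote_cpus();

	preempt_enable();
	cpus_read_unlock();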
Signed-off-by: Yu Zhao
---
 arch/arm64/include/asm/smp.h |   3 +
 arch/arm64/kernel/smp.c      | 110 +++++++++++++++++++++++++++++++++++
 2 files changed, 113 insertions(+)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2510eec026f7..cffb0cfed961 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -133,6 +133,9 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
+void pause_remote_cpus(void);
+void resume_remote_cpus(void);
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 5e18fbcee9a2..aa80266e5c9d 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -68,16 +68,25 @@ enum ipi_msg_type {
 	IPI_RESCHEDULE,
 	IPI_CALL_FUNC,
 	IPI_CPU_STOP,
+	IPI_CPU_PAUSE,
+#ifdef CONFIG_KEXEC_CORE
 	IPI_CPU_CRASH_STOP,
+#endif
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	IPI_TIMER,
+#endif
+#ifdef CONFIG_IRQ_WORK
 	IPI_IRQ_WORK,
+#endif
 	NR_IPI,
 	/*
 	 * Any enum >= NR_IPI and < MAX_IPI is special and not tracable
 	 * with trace_ipi_*
 	 */
 	IPI_CPU_BACKTRACE = NR_IPI,
+#ifdef CONFIG_KGDB
 	IPI_KGDB_ROUNDUP,
+#endif
 	MAX_IPI
 };
 
@@ -821,11 +830,20 @@ static const char *ipi_types[MAX_IPI] __tracepoint_string = {
 	[IPI_RESCHEDULE]	= "Rescheduling interrupts",
 	[IPI_CALL_FUNC]		= "Function call interrupts",
 	[IPI_CPU_STOP]		= "CPU stop interrupts",
+	[IPI_CPU_PAUSE]		= "CPU pause interrupts",
+#ifdef CONFIG_KEXEC_CORE
 	[IPI_CPU_CRASH_STOP]	= "CPU stop (for crash dump) interrupts",
+#endif
+#ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	[IPI_TIMER]		= "Timer broadcast interrupts",
+#endif
+#ifdef CONFIG_IRQ_WORK
 	[IPI_IRQ_WORK]		= "IRQ work interrupts",
+#endif
 	[IPI_CPU_BACKTRACE]	= "CPU backtrace interrupts",
+#ifdef CONFIG_KGDB
 	[IPI_KGDB_ROUNDUP]	= "KGDB roundup interrupts",
+#endif
 };
 
 static void smp_cross_call(const struct cpumask *target, unsigned int ipinr);
@@ -884,6 +902,85 @@ void __noreturn panic_smp_self_stop(void)
 	local_cpu_stop();
 }
 
+static DEFINE_SPINLOCK(cpu_pause_lock);
+static cpumask_t paused_cpus;
+static cpumask_t resumed_cpus;
+
+static void pause_local_cpu(void)
+{
+	int cpu = smp_processor_id();
+
+	cpumask_clear_cpu(cpu, &resumed_cpus);
+	/*
+	 * Paired with pause_remote_cpus() to confirm that this CPU not only
+	 * will be paused but also can be reliably resumed.
+	 */
+	smp_wmb();
+	cpumask_set_cpu(cpu, &paused_cpus);
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	while (!cpumask_test_cpu(cpu, &resumed_cpus)) {
+		wfe();
+		barrier();
+	}
+	barrier();
+	cpumask_clear_cpu(cpu, &paused_cpus);
+}
+
+void pause_remote_cpus(void)
+{
+	cpumask_t cpus_to_pause;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_pause, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_pause);
+
+	spin_lock(&cpu_pause_lock);
+
+	WARN_ON_ONCE(!cpumask_empty(&paused_cpus));
+
+	smp_cross_call(&cpus_to_pause, IPI_CPU_PAUSE);
+
+	while (!cpumask_equal(&cpus_to_pause, &paused_cpus)) {
+		cpu_relax();
+		barrier();
+	}
+	/*
+	 * Paired with pause_local_cpu() to confirm that all CPUs not only will
+	 * be paused but also can be reliably resumed.
+	 */
+	smp_rmb();
+	WARN_ON_ONCE(cpumask_intersects(&cpus_to_pause, &resumed_cpus));
+
+	spin_unlock(&cpu_pause_lock);
+}
+
+void resume_remote_cpus(void)
+{
+	cpumask_t cpus_to_resume;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_resume, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_resume);
+
+	spin_lock(&cpu_pause_lock);
+
+	cpumask_setall(&resumed_cpus);
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	while (cpumask_intersects(&cpus_to_resume, &paused_cpus)) {
+		sev();
+		cpu_relax();
+		barrier();
+	}
+
+	spin_unlock(&cpu_pause_lock);
+}
+
 #ifdef CONFIG_KEXEC_CORE
 static atomic_t waiting_for_crash_ipi = ATOMIC_INIT(0);
 #endif
@@ -963,6 +1060,11 @@ static void do_handle_IPI(int ipinr)
 		local_cpu_stop();
 		break;
 
+	case IPI_CPU_PAUSE:
+		pause_local_cpu();
+		break;
+
+#ifdef CONFIG_KEXEC_CORE
 	case IPI_CPU_CRASH_STOP:
 		if (IS_ENABLED(CONFIG_KEXEC_CORE)) {
 			ipi_cpu_crash_stop(cpu, get_irq_regs());
@@ -970,6 +1072,7 @@ static void do_handle_IPI(int ipinr)
 			unreachable();
 		}
 		break;
+#endif
 
 #ifdef CONFIG_GENERIC_CLOCKEVENTS_BROADCAST
 	case IPI_TIMER:
@@ -991,9 +1094,11 @@ static void do_handle_IPI(int ipinr)
 		nmi_cpu_backtrace(get_irq_regs());
 		break;
 
+#ifdef CONFIG_KGDB
 	case IPI_KGDB_ROUNDUP:
 		kgdb_nmicallback(cpu, get_irq_regs());
 		break;
+#endif
 
 	default:
 		pr_crit("CPU%u: Unknown IPI message 0x%x\n", cpu, ipinr);
@@ -1023,9 +1128,14 @@ static bool ipi_should_be_nmi(enum ipi_msg_type ipi)
 
 	switch (ipi) {
 	case IPI_CPU_STOP:
+	case IPI_CPU_PAUSE:
+#ifdef CONFIG_KEXEC_CORE
 	case IPI_CPU_CRASH_STOP:
+#endif
 	case IPI_CPU_BACKTRACE:
+#ifdef CONFIG_KGDB
 	case IPI_KGDB_ROUNDUP:
+#endif
 		return true;
 	default:
 		return false;
-- 
2.46.0.rc2.264.g509ed76dc8-goog

From nobody Fri Dec 19 19:09:35 2025
Date: Mon, 5 Aug 2024 20:21:13 -0600
In-Reply-To: <20240806022114.3320543-1-yuzhao@google.com>
References: <20240806022114.3320543-1-yuzhao@google.com>
Message-ID: <20240806022114.3320543-4-yuzhao@google.com>
Subject: [RFC PATCH 3/4] arm64: pause remote CPUs to update vmemmap
From: Yu Zhao
To: Catalin Marinas, Will Deacon
Cc: Andrew Morton, David Rientjes, Douglas Anderson, Frank van der Linden,
 Mark Rutland, Muchun Song, Nanyong Sun, Yang Shi,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Yu Zhao

Pause remote CPUs so that the local CPU can follow the proper BBM
sequence to safely update the vmemmap mapping `struct page` areas.

While updating the vmemmap, it is guaranteed that neither the local CPU
nor the remote ones will access the `struct page` area being updated,
and therefore they will not trigger kernel page faults.
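For illustration only (not part of this patch), the pieces added by this
series nest roughly as follows; the call chain is abbreviated, and the
arrows show what the arm64 overrides expand to:

	/*
	 * __hugetlb_vmemmap_optimize_folio()
	 *   vmemmap_remap_free()
	 *     vmemmap_remap_range()
	 *       vmemmap_update_lock()    -> cpus_read_lock()
	 *       walk_page_range_novma()
	 *         vmemmap_remap_pte()
	 *           vmemmap_update_pte() -> pause_remote_cpus(); BBM; resume_remote_cpus()
	 *       vmemmap_update_unlock()  -> cpus_read_unlock()
	 */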
Signed-off-by: Yu Zhao
---
 arch/arm64/include/asm/pgalloc.h | 55 ++++++++++++++++++++++++++++++++
 mm/hugetlb_vmemmap.c             | 14 ++++++++
 2 files changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..1af1aa34a351 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #define __HAVE_ARCH_PGD_FREE
 #define __HAVE_ARCH_PUD_FREE
@@ -137,4 +138,58 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
 }
 
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+
+#define vmemmap_update_lock vmemmap_update_lock
+static inline void vmemmap_update_lock(void)
+{
+	cpus_read_lock();
+}
+
+#define vmemmap_update_unlock vmemmap_update_unlock
+static inline void vmemmap_update_unlock(void)
+{
+	cpus_read_unlock();
+}
+
+#define vmemmap_update_pte vmemmap_update_pte
+static inline void vmemmap_update_pte(unsigned long addr, pte_t *ptep, pte_t pte)
+{
+	preempt_disable();
+	pause_remote_cpus();
+
+	pte_clear(&init_mm, addr, ptep);
+	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
+	set_pte_at(&init_mm, addr, ptep, pte);
+
+	resume_remote_cpus();
+	preempt_enable();
+}
+
+#define vmemmap_update_pmd vmemmap_update_pmd
+static inline void vmemmap_update_pmd(unsigned long addr, pmd_t *pmdp, pte_t *ptep)
+{
+	preempt_disable();
+	pause_remote_cpus();
+
+	pmd_clear(pmdp);
+	flush_tlb_kernel_range(addr, addr + PMD_SIZE);
+	pmd_populate_kernel(&init_mm, pmdp, ptep);
+
+	resume_remote_cpus();
+	preempt_enable();
+}
+
+#define vmemmap_flush_tlb_all vmemmap_flush_tlb_all
+static inline void vmemmap_flush_tlb_all(void)
+{
+}
+
+#define vmemmap_flush_tlb_range vmemmap_flush_tlb_range
+static inline void vmemmap_flush_tlb_range(unsigned long start, unsigned long end)
+{
+}
+
+#endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
+
 #endif
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 2dd92e58f304..893c73493d9c 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -46,6 +46,18 @@ struct vmemmap_remap_walk {
 	unsigned long flags;
 };
 
+#ifndef vmemmap_update_lock
+static void vmemmap_update_lock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_unlock
+static void vmemmap_update_unlock(void)
+{
+}
+#endif
+
 #ifndef vmemmap_update_pmd
 static inline void vmemmap_update_pmd(unsigned long addr,
 				      pmd_t *pmdp, pte_t *ptep)
@@ -194,10 +206,12 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 
 	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
+	vmemmap_update_lock();
 	mmap_read_lock(&init_mm);
 	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
 				    NULL, walk);
 	mmap_read_unlock(&init_mm);
+	vmemmap_update_unlock();
 	if (ret)
 		return ret;
 
-- 
2.46.0.rc2.264.g509ed76dc8-goog

From nobody Fri Dec 19 19:09:35 2025
Date: Mon, 5 Aug 2024 20:21:14 -0600
In-Reply-To: <20240806022114.3320543-1-yuzhao@google.com>
References: <20240806022114.3320543-1-yuzhao@google.com>
Message-ID: <20240806022114.3320543-5-yuzhao@google.com>
Subject: [RFC PATCH 4/4] arm64: mm: Re-enable OPTIMIZE_HUGETLB_VMEMMAP
From: Yu Zhao
To: Catalin Marinas, Will Deacon
Cc: Andrew Morton, David Rientjes, Douglas Anderson, Frank van der Linden,
 Mark Rutland, Muchun Song, Nanyong Sun, Yang Shi,
 linux-arm-kernel@lists.infradead.org, linux-mm@kvack.org,
 linux-kernel@vger.kernel.org, Muchun Song, Yu Zhao

From: Nanyong Sun

Now that vmemmap page table updates on arm64 can safely follow the
break-before-make rule, re-enable HVO on arm64.

Signed-off-by: Nanyong Sun
Reviewed-by: Muchun Song
Signed-off-by: Yu Zhao
---
 arch/arm64/Kconfig | 1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index a2f8ff354ca6..25ff026cdaf5 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -105,6 +105,7 @@ config ARM64
 	select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
 	select ARCH_WANT_FRAME_POINTERS
 	select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
+	select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP
 	select ARCH_WANT_LD_ORPHAN_WARN
 	select ARCH_WANTS_EXECMEM_LATE if EXECMEM
 	select ARCH_WANTS_NO_INSTR
-- 
2.46.0.rc2.264.g509ed76dc8-goog