From nobody Tue Nov 26 06:43:19 2024
Date: Sun, 20 Oct 2024 22:22:13 -0600
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v1 1/6] mm/hugetlb_vmemmap: batch update PTEs
Message-ID: <20241021042218.746659-2-yuzhao@google.com>
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Convert vmemmap_remap_walk->remap_pte to ->remap_pte_range so that
vmemmap remap walks can batch update PTEs.

The goal of this conversion is to allow architectures to implement
their own optimizations where possible, e.g., to stop remote CPUs only
once per batch when updating the vmemmap on arm64. It is not intended
to change the remap workflow, nor should it by itself have any side
effects on performance.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 mm/hugetlb_vmemmap.c | 163 ++++++++++++++++++++++++-------------------
 1 file changed, 91 insertions(+), 72 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 57b7f591eee8..46befab48d41 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -22,7 +22,7 @@
 /**
  * struct vmemmap_remap_walk - walk vmemmap page table
  *
- * @remap_pte:		called for each lowest-level entry (PTE).
+ * @remap_pte_range:	called on a range of PTEs.
  * @nr_walked:		the number of walked pte.
  * @reuse_page:		the page which is reused for the tail vmemmap pages.
  * @reuse_addr:		the virtual address of the @reuse_page page.
@@ -32,8 +32,8 @@
  *			operations.
  */
 struct vmemmap_remap_walk {
-	void			(*remap_pte)(pte_t *pte, unsigned long addr,
-					     struct vmemmap_remap_walk *walk);
+	void			(*remap_pte_range)(pte_t *pte, unsigned long start,
+						   unsigned long end, struct vmemmap_remap_walk *walk);
 	unsigned long		nr_walked;
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
@@ -101,10 +101,6 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 	struct page *head;
 	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
 
-	/* Only splitting, not remapping the vmemmap pages. */
-	if (!vmemmap_walk->remap_pte)
-		walk->action = ACTION_CONTINUE;
-
 	spin_lock(&init_mm.page_table_lock);
 	head = pmd_leaf(*pmd) ? pmd_page(*pmd) : NULL;
 	/*
@@ -129,33 +125,36 @@ static int vmemmap_pmd_entry(pmd_t *pmd, unsigned long addr,
 		ret = -ENOTSUPP;
 	}
 	spin_unlock(&init_mm.page_table_lock);
-	if (!head || ret)
+	if (ret)
 		return ret;
 
-	return vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
-}
+	if (head) {
+		ret = vmemmap_split_pmd(pmd, head, addr & PMD_MASK, vmemmap_walk);
+		if (ret)
+			return ret;
+	}
 
-static int vmemmap_pte_entry(pte_t *pte, unsigned long addr,
-			     unsigned long next, struct mm_walk *walk)
-{
-	struct vmemmap_remap_walk *vmemmap_walk = walk->private;
+	if (vmemmap_walk->remap_pte_range) {
+		pte_t *pte = pte_offset_kernel(pmd, addr);
 
-	/*
-	 * The reuse_page is found 'first' in page table walking before
-	 * starting remapping.
-	 */
-	if (!vmemmap_walk->reuse_page)
-		vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
-	else
-		vmemmap_walk->remap_pte(pte, addr, vmemmap_walk);
-	vmemmap_walk->nr_walked++;
+		vmemmap_walk->nr_walked += (next - addr) / PAGE_SIZE;
+		/*
+		 * The reuse_page is found 'first' in page table walking before
+		 * starting remapping.
+		 */
+		if (!vmemmap_walk->reuse_page) {
+			vmemmap_walk->reuse_page = pte_page(ptep_get(pte));
+			pte++;
+			addr += PAGE_SIZE;
+		}
+		vmemmap_walk->remap_pte_range(pte, addr, next, vmemmap_walk);
+	}
 
 	return 0;
 }
 
 static const struct mm_walk_ops vmemmap_remap_ops = {
 	.pmd_entry	= vmemmap_pmd_entry,
-	.pte_entry	= vmemmap_pte_entry,
 };
 
 static int vmemmap_remap_range(unsigned long start, unsigned long end,
@@ -172,7 +171,7 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 	if (ret)
 		return ret;
 
-	if (walk->remap_pte && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
+	if (walk->remap_pte_range && !(walk->flags & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_kernel_range(start, end);
 
 	return 0;
@@ -204,33 +203,45 @@ static void free_vmemmap_page_list(struct list_head *list)
 		free_vmemmap_page(page);
 }
 
-static void vmemmap_remap_pte(pte_t *pte, unsigned long addr,
-			      struct vmemmap_remap_walk *walk)
+static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				    struct vmemmap_remap_walk *walk)
 {
-	/*
-	 * Remap the tail pages as read-only to catch illegal write operation
-	 * to the tail pages.
-	 */
-	pgprot_t pgprot = PAGE_KERNEL_RO;
-	struct page *page = pte_page(ptep_get(pte));
-	pte_t entry;
-
-	/* Remapping the head page requires r/w */
-	if (unlikely(addr == walk->reuse_addr)) {
-		pgprot = PAGE_KERNEL;
-		list_del(&walk->reuse_page->lru);
+	int i;
+	struct page *page;
+	int nr_pages = (end - start) / PAGE_SIZE;
 
+	for (i = 0; i < nr_pages; i++) {
+		page = pte_page(ptep_get(pte + i));
+
+		list_add(&page->lru, walk->vmemmap_pages);
+	}
+
+	page = walk->reuse_page;
+
+	if (start == walk->reuse_addr) {
+		list_del(&page->lru);
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
 		/*
-		 * Makes sure that preceding stores to the page contents from
-		 * vmemmap_remap_free() become visible before the set_pte_at()
-		 * write.
+		 * Makes sure that preceding stores to the page contents become
+		 * visible before set_pte_at().
 		 */
 		smp_wmb();
 	}
 
-	entry = mk_pte(walk->reuse_page, pgprot);
-	list_add(&page->lru, walk->vmemmap_pages);
-	set_pte_at(&init_mm, addr, pte, entry);
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		/*
+		 * The head page must be mapped read-write; the tail pages are
+		 * mapped read-only to catch illegal modifications.
+		 */
+		if (!i && start == walk->reuse_addr)
+			val = mk_pte(page, PAGE_KERNEL);
+		else
+			val = mk_pte(page, PAGE_KERNEL_RO);
+
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /*
@@ -252,27 +263,39 @@ static inline void reset_struct_pages(struct page *start)
 	memcpy(start, from, sizeof(*from) * NR_RESET_STRUCT_PAGE);
 }
 
-static void vmemmap_restore_pte(pte_t *pte, unsigned long addr,
-				struct vmemmap_remap_walk *walk)
+static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
+				      struct vmemmap_remap_walk *walk)
 {
-	pgprot_t pgprot = PAGE_KERNEL;
+	int i;
 	struct page *page;
-	void *to;
-
-	BUG_ON(pte_page(ptep_get(pte)) != walk->reuse_page);
+	int nr_pages = (end - start) / PAGE_SIZE;
 
 	page = list_first_entry(walk->vmemmap_pages, struct page, lru);
-	list_del(&page->lru);
-	to = page_to_virt(page);
-	copy_page(to, (void *)walk->reuse_addr);
-	reset_struct_pages(to);
+
+	for (i = 0; i < nr_pages; i++) {
+		BUG_ON(pte_page(ptep_get(pte + i)) != walk->reuse_page);
+
+		copy_page(page_to_virt(page), (void *)walk->reuse_addr);
+		reset_struct_pages(page_to_virt(page));
+
+		page = list_next_entry(page, lru);
+	}
 
 	/*
 	 * Makes sure that preceding stores to the page contents become visible
-	 * before the set_pte_at() write.
+	 * before set_pte_at().
 	 */
 	smp_wmb();
-	set_pte_at(&init_mm, addr, pte, mk_pte(page, pgprot));
+
+	for (i = 0; i < nr_pages; i++) {
+		pte_t val;
+
+		page = list_first_entry(walk->vmemmap_pages, struct page, lru);
+		list_del(&page->lru);
+
+		val = mk_pte(page, PAGE_KERNEL);
+		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
+	}
 }
 
 /**
@@ -290,7 +313,6 @@ static int vmemmap_remap_split(unsigned long start, unsigned long end,
 			       unsigned long reuse)
 {
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= NULL,
 		.flags		= VMEMMAP_SPLIT_NO_TLB_FLUSH,
 	};
 
@@ -322,10 +344,10 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 {
 	int ret;
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_remap_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range	= vmemmap_remap_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= vmemmap_pages,
+		.flags			= flags,
 	};
 	int nid = page_to_nid((struct page *)reuse);
 	gfp_t gfp_mask = GFP_KERNEL | __GFP_NORETRY | __GFP_NOWARN;
@@ -340,8 +362,6 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 	 */
 	walk.reuse_page = alloc_pages_node(nid, gfp_mask, 0);
 	if (walk.reuse_page) {
-		copy_page(page_to_virt(walk.reuse_page),
-			  (void *)walk.reuse_addr);
 		list_add(&walk.reuse_page->lru, vmemmap_pages);
 		memmap_pages_add(1);
 	}
@@ -371,10 +391,9 @@ static int vmemmap_remap_free(unsigned long start, unsigned long end,
 	 * They will be restored in the following call.
 	 */
 	walk = (struct vmemmap_remap_walk) {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= vmemmap_pages,
-		.flags		= 0,
+		.remap_pte_range	= vmemmap_restore_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= vmemmap_pages,
 	};
 
 	vmemmap_remap_range(reuse, end, &walk);
@@ -425,10 +444,10 @@ static int vmemmap_remap_alloc(unsigned long start, unsigned long end,
 {
 	LIST_HEAD(vmemmap_pages);
 	struct vmemmap_remap_walk walk = {
-		.remap_pte	= vmemmap_restore_pte,
-		.reuse_addr	= reuse,
-		.vmemmap_pages	= &vmemmap_pages,
-		.flags		= flags,
+		.remap_pte_range	= vmemmap_restore_pte_range,
+		.reuse_addr		= reuse,
+		.vmemmap_pages		= &vmemmap_pages,
+		.flags			= flags,
 	};
 
 	/* See the comment in the vmemmap_remap_free(). */
-- 
2.47.0.rc1.288.g06298d1525-goog

From nobody Tue Nov 26 06:43:19 2024
Date: Sun, 20 Oct 2024 22:22:14 -0600
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v1 2/6] mm/hugetlb_vmemmap: add arch-independent helpers
Message-ID: <20241021042218.746659-3-yuzhao@google.com>
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Add architecture-independent helpers to allow individual architectures
to work around their own limitations when updating vmemmap.
Specifically, the current remap workflow requires break-before-make
(BBM) on arm64. By overriding the default helpers later in this
series, arm64 will be able to support the current HVO implementation.
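
For illustration, an architecture opts in by providing its own helper
and defining the matching macro before mm/hugetlb_vmemmap.c sees it,
so the corresponding #ifndef fallback below is skipped. A minimal,
hypothetical sketch (the names mirror the defaults added by this
patch; the real arm64 overrides come later in this series):

	/* In a hypothetical arch header included by mm/hugetlb_vmemmap.c. */
	#define vmemmap_update_lock vmemmap_update_lock
	static inline void vmemmap_update_lock(void)
	{
		/* e.g., exclude states that make vmemmap PTE updates unsafe */
	}

	#define vmemmap_update_pte_range_start vmemmap_update_pte_range_start
	static inline void vmemmap_update_pte_range_start(pte_t *pte,
			unsigned long start, unsigned long end)
	{
		/* e.g., begin a break-before-make sequence for [start, end) */
	}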
Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 include/linux/mm_types.h |  7 +++
 mm/hugetlb_vmemmap.c     | 99 ++++++++++++++++++++++++++++++++++------
 2 files changed, 92 insertions(+), 14 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index 6e3bdf8e38bc..0f3ae6e173f6 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -1499,4 +1499,11 @@ enum {
 	/* See also internal only FOLL flags in mm/internal.h */
 };
 
+/* Skip the TLB flush when we split the PMD */
+#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
+/* Skip the TLB flush when we remap the PTE */
+#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
+/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
+#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
+
 #endif /* _LINUX_MM_TYPES_H */
diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 46befab48d41..e50a196399f5 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -38,16 +38,56 @@ struct vmemmap_remap_walk {
 	struct page		*reuse_page;
 	unsigned long		reuse_addr;
 	struct list_head	*vmemmap_pages;
-
-/* Skip the TLB flush when we split the PMD */
-#define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
-/* Skip the TLB flush when we remap the PTE */
-#define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
-/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
-#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
 	unsigned long		flags;
 };
 
+#ifndef VMEMMAP_ARCH_TLB_FLUSH_FLAGS
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	0
+#endif
+
+#ifndef vmemmap_update_supported
+static bool vmemmap_update_supported(void)
+{
+	return true;
+}
+#endif
+
+#ifndef vmemmap_update_lock
+static void vmemmap_update_lock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_unlock
+static void vmemmap_update_unlock(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_start
+static void vmemmap_update_pte_range_start(pte_t *pte, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pte_range_end
+static void vmemmap_update_pte_range_end(void)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_start
+static void vmemmap_update_pmd_range_start(pmd_t *pmd, unsigned long start, unsigned long end)
+{
+}
+#endif
+
+#ifndef vmemmap_update_pmd_range_end
+static void vmemmap_update_pmd_range_end(void)
+{
+}
+#endif
+
 static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 			     struct vmemmap_remap_walk *walk)
 {
@@ -83,7 +123,9 @@ static int vmemmap_split_pmd(pmd_t *pmd, struct page *head, unsigned long start,
 
 		/* Make pte visible before pmd. See comment in pmd_install(). */
 		smp_wmb();
+		vmemmap_update_pmd_range_start(pmd, start, start + PMD_SIZE);
 		pmd_populate_kernel(&init_mm, pmd, pgtable);
+		vmemmap_update_pmd_range_end();
 		if (!(walk->flags & VMEMMAP_SPLIT_NO_TLB_FLUSH))
 			flush_tlb_kernel_range(start, start + PMD_SIZE);
 	} else {
@@ -164,10 +206,12 @@ static int vmemmap_remap_range(unsigned long start, unsigned long end,
 
 	VM_BUG_ON(!PAGE_ALIGNED(start | end));
 
+	vmemmap_update_lock();
 	mmap_read_lock(&init_mm);
 	ret = walk_page_range_novma(&init_mm, start, end, &vmemmap_remap_ops,
 				    NULL, walk);
 	mmap_read_unlock(&init_mm);
+	vmemmap_update_unlock();
 	if (ret)
 		return ret;
 
@@ -228,6 +272,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 		smp_wmb();
 	}
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -242,6 +288,8 @@ static void vmemmap_remap_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /*
@@ -287,6 +335,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 	 */
 	smp_wmb();
 
+	vmemmap_update_pte_range_start(pte, start, end);
+
 	for (i = 0; i < nr_pages; i++) {
 		pte_t val;
 
@@ -296,6 +346,8 @@ static void vmemmap_restore_pte_range(pte_t *pte, unsigned long start, unsigned long end,
 		val = mk_pte(page, PAGE_KERNEL);
 		set_pte_at(&init_mm, start + PAGE_SIZE * i, pte + i, val);
 	}
+
+	vmemmap_update_pte_range_end();
 }
 
 /**
@@ -513,7 +565,8 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
  */
 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 {
-	return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU);
+	return __hugetlb_vmemmap_restore_folio(h, folio,
+			VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 }
 
 /**
@@ -553,7 +606,7 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 			list_move(&folio->lru, non_hvo_folios);
 	}
 
-	if (restored)
+	if (restored && !(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
 		flush_tlb_all();
 	if (!ret)
 		ret = restored;
@@ -641,7 +694,8 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 {
 	LIST_HEAD(vmemmap_pages);
 
-	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, VMEMMAP_SYNCHRONIZE_RCU);
+	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
+			VMEMMAP_SYNCHRONIZE_RCU | VMEMMAP_ARCH_TLB_FLUSH_FLAGS);
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
@@ -683,7 +737,8 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 			break;
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_SPLIT_NO_TLB_FLUSH))
+		flush_tlb_all();
 
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
@@ -701,24 +756,35 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_list)
 		 * allowing more vmemmap remaps to occur.
 		 */
 		if (ret == -ENOMEM && !list_empty(&vmemmap_pages)) {
-			flush_tlb_all();
+			if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+				flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
 			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
 		}
 	}
 
-	flush_tlb_all();
+	if (!(VMEMMAP_ARCH_TLB_FLUSH_FLAGS & VMEMMAP_REMAP_NO_TLB_FLUSH))
+		flush_tlb_all();
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
+static int hugetlb_vmemmap_sysctl(const struct ctl_table *ctl, int write,
+				  void *buffer, size_t *lenp, loff_t *ppos)
+{
+	if (!vmemmap_update_supported())
+		return -ENODEV;
+
+	return proc_dobool(ctl, write, buffer, lenp, ppos);
+}
+
 static struct ctl_table hugetlb_vmemmap_sysctls[] = {
 	{
 		.procname	= "hugetlb_optimize_vmemmap",
 		.data		= &vmemmap_optimize_enabled,
 		.maxlen		= sizeof(vmemmap_optimize_enabled),
 		.mode		= 0644,
-		.proc_handler	= proc_dobool,
+		.proc_handler	= hugetlb_vmemmap_sysctl,
 	},
 };
 
@@ -729,6 +795,11 @@ static int __init hugetlb_vmemmap_init(void)
 	/* HUGETLB_VMEMMAP_RESERVE_SIZE should cover all used struct pages */
 	BUILD_BUG_ON(__NR_USED_SUBPAGE > HUGETLB_VMEMMAP_RESERVE_PAGES);
 
+	if (READ_ONCE(vmemmap_optimize_enabled) && !vmemmap_update_supported()) {
+		pr_warn("HugeTLB: disabling HVO due to missing support.\n");
+		WRITE_ONCE(vmemmap_optimize_enabled, false);
+	}
+
 	for_each_hstate(h) {
 		if (hugetlb_vmemmap_optimizable(h)) {
 			register_sysctl_init("vm", hugetlb_vmemmap_sysctls);
-- 
2.47.0.rc1.288.g06298d1525-goog

From nobody Tue Nov 26 06:43:19 2024
Date: Sun, 20 Oct 2024 22:22:15 -0600
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v1 3/6] irqchip/gic-v3: support SGI broadcast
Message-ID: <20241021042218.746659-4-yuzhao@google.com>
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

GIC v3 and later support SGI broadcast, i.e., the mode that routes
interrupts to all PEs in the system excluding the local CPU.
Supporting this mode can avoid looping through all the remote CPUs
when broadcasting SGIs, especially for systems with 200+ CPUs.
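
Concretely, the broadcast relies on the Interrupt Routing Mode (IRM)
bit of ICC_SGI1R_EL1: with IRM set, the SGI is delivered to all PEs
other than the writer, so no per-cluster target list needs to be
computed. A recap of the encoding this patch uses (constants from
include/linux/irqchip/arm-gic-v3.h):

	u64 val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) |	/* IRM: broadcast */
		  (irq << ICC_SGI1R_SGI_ID_SHIFT);	/* SGI INTID, 0..15 */

	gic_write_sgi1r(val);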
The performance improvement can be measured with the rest of this
series booted with "hugetlb_free_vmemmap=on irqchip.gicv3_pseudo_nmi=1":

  cd /sys/kernel/mm/hugepages/
  echo 600 >hugepages-1048576kB/nr_hugepages
  echo 2048kB >hugepages-1048576kB/demote_size
  perf record -g -- bash -c "echo 600 >hugepages-1048576kB/demote"

          gic_ipi_send_mask()  bash sys time
  Before: 38.14%               0m10.513s
  After:   0.20%               0m5.132s

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 drivers/irqchip/irq-gic-v3.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)

diff --git a/drivers/irqchip/irq-gic-v3.c b/drivers/irqchip/irq-gic-v3.c
index ce87205e3e82..42c39385e1b9 100644
--- a/drivers/irqchip/irq-gic-v3.c
+++ b/drivers/irqchip/irq-gic-v3.c
@@ -1394,9 +1394,20 @@ static void gic_send_sgi(u64 cluster_id, u16 tlist, unsigned int irq)
 	gic_write_sgi1r(val);
 }
 
+static void gic_broadcast_sgi(unsigned int irq)
+{
+	u64 val;
+
+	val = BIT(ICC_SGI1R_IRQ_ROUTING_MODE_BIT) | (irq << ICC_SGI1R_SGI_ID_SHIFT);
+
+	pr_devel("CPU %d: broadcasting SGI %u\n", smp_processor_id(), irq);
+	gic_write_sgi1r(val);
+}
+
 static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 {
 	int cpu;
+	cpumask_t broadcast;
 
 	if (WARN_ON(d->hwirq >= 16))
 		return;
@@ -1407,6 +1418,13 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 	 */
 	dsb(ishst);
 
+	cpumask_copy(&broadcast, cpu_present_mask);
+	cpumask_clear_cpu(smp_processor_id(), &broadcast);
+	if (cpumask_equal(&broadcast, mask)) {
+		gic_broadcast_sgi(d->hwirq);
+		goto done;
+	}
+
 	for_each_cpu(cpu, mask) {
 		u64 cluster_id = MPIDR_TO_SGI_CLUSTER_ID(gic_cpu_to_affinity(cpu));
 		u16 tlist;
@@ -1414,7 +1432,7 @@ static void gic_ipi_send_mask(struct irq_data *d, const struct cpumask *mask)
 		tlist = gic_compute_target_list(&cpu, mask, cluster_id);
 		gic_send_sgi(cluster_id, tlist, d->hwirq);
 	}
-
+done:
 	/* Force the above writes to ICC_SGI1R_EL1 to be executed */
 	isb();
 }
-- 
2.47.0.rc1.288.g06298d1525-goog

From nobody Tue Nov 26 06:43:19 2024
Date: Sun, 20 Oct 2024 22:22:16 -0600
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v1 4/6] arm64: broadcast IPIs to pause remote CPUs
Message-ID: <20241021042218.746659-5-yuzhao@google.com>
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Broadcast pseudo-NMI IPIs to pause remote CPUs for a short period of
time, and then reliably resume them when the local CPU exits critical
sections that preclude the execution of remote CPUs. A typical example
of such critical sections is BBM on kernel PTEs.
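
To make the intended usage concrete, here is a minimal sketch of a BBM
update of one kernel PTE bracketed by the new helpers (illustrative
only: addr/ptep/newpte are placeholders, cpus_read_lock() must already
be held to satisfy the lockdep assertions below, and the in-tree
caller is added later in this series):

	local_irq_disable();
	pause_remote_cpus();	/* remote CPUs now spin in the NMI handler */

	pte_clear(&init_mm, addr, ptep);		/* break */
	flush_tlb_kernel_range(addr, addr + PAGE_SIZE);
	set_pte_at(&init_mm, addr, ptep, newpte);	/* make */

	resume_remote_cpus();	/* release the spinning CPUs */
	local_irq_enable();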
HugeTLB Vmemmap Optimization (HVO) on arm64 was disabled by commit
060a2c92d1b6 ("arm64: mm: hugetlb: Disable
HUGETLB_PAGE_OPTIMIZE_VMEMMAP") for the following reason:

  This is deemed UNPREDICTABLE by the Arm architecture without a
  break-before-make sequence (make the PTE invalid, TLBI, write the
  new valid PTE). However, such sequence is not possible since the
  vmemmap may be concurrently accessed by the kernel.

Supporting BBM on kernel PTEs is one of the approaches that can make
HVO theoretically safe on arm64.

Note that it is still possible for the paused CPUs to perform
speculative translations. Such translations would cause spurious
kernel PFs, which should be properly handled by
is_spurious_el1_translation_fault().

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/arm64/include/asm/smp.h |  3 ++
 arch/arm64/kernel/smp.c      | 92 +++++++++++++++++++++++++++++++++---
 2 files changed, 88 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/include/asm/smp.h b/arch/arm64/include/asm/smp.h
index 2510eec026f7..cffb0cfed961 100644
--- a/arch/arm64/include/asm/smp.h
+++ b/arch/arm64/include/asm/smp.h
@@ -133,6 +133,9 @@ bool cpus_are_stuck_in_kernel(void);
 extern void crash_smp_send_stop(void);
 extern bool smp_crash_stop_failed(void);
 
+void pause_remote_cpus(void);
+void resume_remote_cpus(void);
+
 #endif /* ifndef __ASSEMBLY__ */
 
 #endif /* ifndef __ASM_SMP_H */
diff --git a/arch/arm64/kernel/smp.c b/arch/arm64/kernel/smp.c
index 3b3f6b56e733..68829c6de1b1 100644
--- a/arch/arm64/kernel/smp.c
+++ b/arch/arm64/kernel/smp.c
@@ -85,7 +85,12 @@ static int ipi_irq_base __ro_after_init;
 static int nr_ipi __ro_after_init = NR_IPI;
 static struct irq_desc *ipi_desc[MAX_IPI] __ro_after_init;
 
-static bool crash_stop;
+enum {
+	SEND_STOP = BIT(0),
+	CRASH_STOP = BIT(1),
+};
+
+static unsigned long stop_in_progress;
 
 static void ipi_setup(int cpu);
 
@@ -917,6 +922,79 @@ static void __noreturn ipi_cpu_crash_stop(unsigned int cpu, struct pt_regs *regs)
 #endif
 }
 
+static DEFINE_SPINLOCK(cpu_pause_lock);
+static cpumask_t paused_cpus;
+static cpumask_t resumed_cpus;
+
+static void pause_local_cpu(void)
+{
+	int cpu = smp_processor_id();
+
+	cpumask_clear_cpu(cpu, &resumed_cpus);
+	/*
+	 * Paired with pause_remote_cpus() to confirm that this CPU not only
+	 * will be paused but also can be reliably resumed.
+	 */
+	smp_wmb();
+	cpumask_set_cpu(cpu, &paused_cpus);
+	/* paused_cpus must be set before waiting on resumed_cpus. */
+	barrier();
+	while (!cpumask_test_cpu(cpu, &resumed_cpus))
+		cpu_relax();
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	cpumask_clear_cpu(cpu, &paused_cpus);
+}
+
+void pause_remote_cpus(void)
+{
+	cpumask_t cpus_to_pause;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_pause, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_pause);
+
+	spin_lock(&cpu_pause_lock);
+
+	WARN_ON_ONCE(!cpumask_empty(&paused_cpus));
+
+	smp_cross_call(&cpus_to_pause, IPI_CPU_STOP_NMI);
+
+	while (!cpumask_equal(&cpus_to_pause, &paused_cpus))
+		cpu_relax();
+	/*
+	 * Paired with pause_local_cpu() to confirm that all CPUs not only will
+	 * be paused but also can be reliably resumed.
+	 */
+	smp_rmb();
+	WARN_ON_ONCE(cpumask_intersects(&cpus_to_pause, &resumed_cpus));
+
+	spin_unlock(&cpu_pause_lock);
+}
+
+void resume_remote_cpus(void)
+{
+	cpumask_t cpus_to_resume;
+
+	lockdep_assert_cpus_held();
+	lockdep_assert_preemption_disabled();
+
+	cpumask_copy(&cpus_to_resume, cpu_online_mask);
+	cpumask_clear_cpu(smp_processor_id(), &cpus_to_resume);
+
+	spin_lock(&cpu_pause_lock);
+
+	cpumask_setall(&resumed_cpus);
+	/* A typical example for sleep and wake-up functions. */
+	smp_mb();
+	while (cpumask_intersects(&cpus_to_resume, &paused_cpus))
+		cpu_relax();
+
+	spin_unlock(&cpu_pause_lock);
+}
+
 static void arm64_backtrace_ipi(cpumask_t *mask)
 {
 	__ipi_send_mask(ipi_desc[IPI_CPU_BACKTRACE], mask);
@@ -970,7 +1048,9 @@ static void do_handle_IPI(int ipinr)
 
 	case IPI_CPU_STOP:
 	case IPI_CPU_STOP_NMI:
-		if (IS_ENABLED(CONFIG_KEXEC_CORE) && crash_stop) {
+		if (!test_bit(SEND_STOP, &stop_in_progress)) {
+			pause_local_cpu();
+		} else if (test_bit(CRASH_STOP, &stop_in_progress)) {
 			ipi_cpu_crash_stop(cpu, get_irq_regs());
 			unreachable();
 		} else {
@@ -1142,7 +1222,6 @@ static inline unsigned int num_other_online_cpus(void)
 
 void smp_send_stop(void)
 {
-	static unsigned long stop_in_progress;
 	cpumask_t mask;
 	unsigned long timeout;
 
@@ -1154,7 +1233,7 @@ void smp_send_stop(void)
 		goto skip_ipi;
 
 	/* Only proceed if this is the first CPU to reach this code */
-	if (test_and_set_bit(0, &stop_in_progress))
+	if (test_and_set_bit(SEND_STOP, &stop_in_progress))
 		return;
 
 	/*
@@ -1230,12 +1309,11 @@ void crash_smp_send_stop(void)
 	 * This function can be called twice in panic path, but obviously
 	 * we execute this only once.
 	 *
-	 * We use this same boolean to tell whether the IPI we send was a
+	 * We use the CRASH_STOP bit to tell whether the IPI we send was a
 	 * stop or a "crash stop".
 	 */
-	if (crash_stop)
+	if (test_and_set_bit(CRASH_STOP, &stop_in_progress))
 		return;
-	crash_stop = 1;
 
 	smp_send_stop();
 
-- 
2.47.0.rc1.288.g06298d1525-goog

From nobody Tue Nov 26 06:43:19 2024
Date: Sun, 20 Oct 2024 22:22:17 -0600
From: Yu Zhao <yuzhao@google.com>
Subject: [PATCH v1 5/6] arm64: pause remote CPUs to update vmemmap
Message-ID: <20241021042218.746659-6-yuzhao@google.com>
In-Reply-To: <20241021042218.746659-1-yuzhao@google.com>
References: <20241021042218.746659-1-yuzhao@google.com>
To: Andrew Morton, Catalin Marinas, Marc Zyngier, Muchun Song,
 Thomas Gleixner, Will Deacon
Cc: Douglas Anderson, Mark Rutland, Nanyong Sun,
 linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, Yu Zhao

Pause remote CPUs so that the local CPU can follow the proper BBM
sequence to safely update the vmemmap mapping `struct page` areas.

While updating the vmemmap, it is guaranteed that neither the local
CPU nor the remote ones will access the `struct page` area being
updated, and therefore they should not trigger (non-spurious) kernel
PFs.

Signed-off-by: Yu Zhao <yuzhao@google.com>
---
 arch/arm64/include/asm/pgalloc.h | 69 ++++++++++++++++++++++++++++++++
 1 file changed, 69 insertions(+)

diff --git a/arch/arm64/include/asm/pgalloc.h b/arch/arm64/include/asm/pgalloc.h
index 8ff5f2a2579e..f50f79f57c1e 100644
--- a/arch/arm64/include/asm/pgalloc.h
+++ b/arch/arm64/include/asm/pgalloc.h
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 
 #define __HAVE_ARCH_PGD_FREE
 #define __HAVE_ARCH_PUD_FREE
@@ -137,4 +138,72 @@ pmd_populate(struct mm_struct *mm, pmd_t *pmdp, pgtable_t ptep)
 	__pmd_populate(pmdp, page_to_phys(ptep), PMD_TYPE_TABLE | PMD_TABLE_PXN);
 }
 
+#ifdef CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP
+
+#define VMEMMAP_ARCH_TLB_FLUSH_FLAGS	(VMEMMAP_SPLIT_NO_TLB_FLUSH | VMEMMAP_REMAP_NO_TLB_FLUSH)
+
+#define vmemmap_update_supported vmemmap_update_supported
+static inline bool vmemmap_update_supported(void)
+{
+	return system_uses_irq_prio_masking();
+}
+
+#define vmemmap_update_lock vmemmap_update_lock
+static inline void vmemmap_update_lock(void)
+{
+	cpus_read_lock();
+}
+
+#define vmemmap_update_unlock vmemmap_update_unlock
+static inline void vmemmap_update_unlock(void)
+{
+	cpus_read_unlock();
+}
+
+#define vmemmap_update_pte_range_start vmemmap_update_pte_range_start
+static inline void vmemmap_update_pte_range_start(pte_t *pte,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PAGE_SIZE, pte++)
+		pte_clear(&init_mm, addr, pte);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pte_range_end vmemmap_update_pte_range_end
+static inline void vmemmap_update_pte_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#define vmemmap_update_pmd_range_start vmemmap_update_pmd_range_start
+static inline void vmemmap_update_pmd_range_start(pmd_t *pmd,
+						  unsigned long start, unsigned long end)
+{
+	unsigned long addr;
+
+	local_irq_disable();
+	pause_remote_cpus();
+
+	for (addr = start; addr != end; addr += PMD_SIZE, pmd++)
+		pmd_clear(pmd);
+
+	flush_tlb_kernel_range(start, end);
+}
+
+#define vmemmap_update_pmd_range_end vmemmap_update_pmd_range_end
+static inline void vmemmap_update_pmd_range_end(void)
+{
+	resume_remote_cpus();
+	local_irq_enable();
+}
+
+#endif /* CONFIG_HUGETLB_PAGE_OPTIMIZE_VMEMMAP */
+
 #endif
-- 
2.47.0.rc1.288.g06298d1525-goog

From nobody Tue Nov 26 06:43:19 2024
lZoHvH5a1C2yXfKt3y3RijV+8xPQ3hht0HCIZrYNGGzOOyBLOExGgsTvuQ67jRzgnLkp qnDVTTjMN+DPzUxQUjGQdtWNTK7Xo7Bf6uraI9yVX4EXzfptGwAGpL56gT0glq4wO3RJ znJg== X-Forwarded-Encrypted: i=1; AJvYcCUEVXgAdlsh7S08ge7Bm769RDDiiiw9UINb9PdTh4AlM0p+fFZIqS1B10wdzHLBTUSdFfcApHM2xmKMIxA=@vger.kernel.org X-Gm-Message-State: AOJu0YysSmWUm6dBSHEiHM0q9TUDluBQoAcb+cKz6LwscKH2vCpf/Wxh MveqevvUNCtoNOwVjcMK+tZ3cK/EPypend8Cbd8NgLMm2zd/e2pe8lA1W8cUe+XqSI+ZjW8Bs6z QSg== X-Google-Smtp-Source: AGHT+IHT0kbssKqnqvzVXYJ6z8GdpmqmvaKobmcxDyj7IjWvvjuKhF8UkhPdehMt/EKyLWoXTgeCtaMmNCE= X-Received: from yuzhao2.bld.corp.google.com ([2a00:79e0:2e28:6:1569:9ef4:20ab:abf9]) (user=yuzhao job=sendgmr) by 2002:a25:c781:0:b0:e29:7cd6:593b with SMTP id 3f1490d57ef6-e2bb169afa4mr24157276.8.1729484557262; Sun, 20 Oct 2024 21:22:37 -0700 (PDT) Date: Sun, 20 Oct 2024 22:22:18 -0600 In-Reply-To: <20241021042218.746659-1-yuzhao@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20241021042218.746659-1-yuzhao@google.com> X-Mailer: git-send-email 2.47.0.rc1.288.g06298d1525-goog Message-ID: <20241021042218.746659-7-yuzhao@google.com> Subject: [PATCH v1 6/6] arm64: select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP From: Yu Zhao To: Andrew Morton , Catalin Marinas , Marc Zyngier , Muchun Song , Thomas Gleixner , Will Deacon Cc: Douglas Anderson , Mark Rutland , Nanyong Sun , linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, Yu Zhao Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" To use HVO, make sure that the kernel is booted with pseudo-NMI enabled by "irqchip.gicv3_pseudo_nmi=3D1", as well as "hugetlb_free_vmemmap=3Don" unless HVO is enabled by default. Note that HVO checks the pseudo-NMI capability and is disabled at runtime if the capability turns out not supported. Successfully enabling HVO should have the following: # dmesg | grep NMI GICv3: Pseudo-NMIs enabled using ... # sysctl vm.hugetlb_optimize_vmemmap vm.hugetlb_optimize_vmemmap =3D 1 Signed-off-by: Yu Zhao --- arch/arm64/Kconfig | 1 + 1 file changed, 1 insertion(+) diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig index fd9df6dcc593..e93745f819d9 100644 --- a/arch/arm64/Kconfig +++ b/arch/arm64/Kconfig @@ -109,6 +109,7 @@ config ARM64 select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT select ARCH_WANT_FRAME_POINTERS select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && = !ARM64_VA_BITS_36) + select ARCH_WANT_OPTIMIZE_HUGETLB_VMEMMAP select ARCH_WANT_LD_ORPHAN_WARN select ARCH_WANTS_EXECMEM_LATE if EXECMEM select ARCH_WANTS_NO_INSTR --=20 2.47.0.rc1.288.g06298d1525-goog