From nobody Fri Dec 19 20:13:05 2025 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CA21217557C for ; Mon, 19 Aug 2024 14:56:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=170.10.129.124 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724079376; cv=none; b=rgLC8fQW1awwbE/rjmygnyF2iKqNp8H+gnoXHIaYn00ts+Ac42X3JWqdcq3ImIqwz2lshCyeHsKg2BZVWZjBr1Y9UrpEAvU4krhAizGwsWjri8VNETlk5tozv12kGPT54WbIB9f/oSRc6W/HUK1NkIQd4+irRnY6RycwKQfvcl8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1724079376; c=relaxed/simple; bh=ehRAQ21OMPu9VTeAFsmD9ehQvbNSIsV1Aulxae74dQI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=WkLIx1zIK+2abup6F/yb01tZTWdqyYWsY2AkIhyZGSu8icyqbdxCn/fsQr/Kk1d2vvmBn0orBATV16tYJJrTVWfJWeFl8A8UWX4AQtbC4+qQCpV9g0PV2YUi7GYB3OFlBC/6ZHLiYN3PUnd60ZA7MOVV8FbbzWkK/v/gu1MY3r4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com; spf=pass smtp.mailfrom=redhat.com; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b=TChoGPBh; arc=none smtp.client-ip=170.10.129.124 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=redhat.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=redhat.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (1024-bit key) header.d=redhat.com header.i=@redhat.com header.b="TChoGPBh" DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1724079374; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=x1wlLI8HeV7ugiPZwClPrcrHBMcL2Ga2rfvmngOKpRU=; b=TChoGPBhGdUJXp91hgSBsE2MIT4rgZyIX4YVoAW6ggqsLmCULut0t0C42dOShzcvDxpOXZ CgNGTegC1bPkKi6gSdowPGwKpDBSgp975uz6GrML6HtQA62kwXjfvBbPMQyAEZZwMIX8wG icucyXihYqqeTu8UXX6usVDUOgGbadU= Received: from mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (ec2-54-186-198-63.us-west-2.compute.amazonaws.com [54.186.198.63]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.3, cipher=TLS_AES_256_GCM_SHA384) id us-mta-113-C2pU-VYVONmHz2FowVFgWg-1; Mon, 19 Aug 2024 10:56:10 -0400 X-MC-Unique: C2pU-VYVONmHz2FowVFgWg-1 Received: from mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com [10.30.177.17]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (2048 bits) server-digest SHA256) (No client certificate requested) by mx-prod-mc-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTPS id C97BB1955BF2; Mon, 19 Aug 2024 14:56:08 +0000 (UTC) Received: from fedora.redhat.com (unknown [10.72.116.15]) by mx-prod-int-05.mail-002.prod.us-west-2.aws.redhat.com (Postfix) with ESMTP id D257319560A3; Mon, 19 Aug 2024 14:55:59 +0000 (UTC) From: Pingfan Liu To: linux-arm-kernel@lists.infradead.org Cc: Pingfan Liu , Ard Biesheuvel , Jan Hendrik Farr , Philipp Rudo , Lennart Poettering , Jarkko Sakkinen , Eric Biederman , Baoquan He , Dave Young , Mark Rutland , Will Deacon , Catalin Marinas , kexec@lists.infradead.org, linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org Subject: [RFCv2 8/9] arm64: kexec: Prepare page table for emulator Date: Mon, 19 Aug 2024 22:53:41 +0800 Message-ID: <20240819145417.23367-9-piliu@redhat.com> In-Reply-To: <20240819145417.23367-1-piliu@redhat.com> References: <20240819145417.23367-1-piliu@redhat.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-Scanned-By: MIMEDefang 3.0 on 10.30.177.17 Content-Type: text/plain; charset="utf-8" The emulator will run in identity mapping, and the first kernel prepares for it. The special allocator resorts to kimage_alloc_control_pages(), which can avoid the allocation on the spot of the destination and overwritten during copying the kernel. The identity mapping covers only all of the kexec segments and efi runtime service. This reduces the memory consumption of page table. Signed-off-by: Pingfan Liu Cc: Ard Biesheuvel Cc: Mark Rutland Cc: Will Deacon Cc: Catalin Marinas To: linux-arm-kernel@lists.infradead.org --- arch/arm64/kernel/machine_kexec.c | 101 ++++++++++++++++++++++++++++-- include/linux/kexec.h | 5 +- 2 files changed, 101 insertions(+), 5 deletions(-) diff --git a/arch/arm64/kernel/machine_kexec.c b/arch/arm64/kernel/machine_= kexec.c index 6958c1fc84a5a..871ee1163ebca 100644 --- a/arch/arm64/kernel/machine_kexec.c +++ b/arch/arm64/kernel/machine_kexec.c @@ -14,6 +14,7 @@ #include #include #include +#include =20 #include #include @@ -86,8 +87,22 @@ static void kexec_segment_flush(const struct kimage *kim= age) } } =20 +/* todo: alloc page for the pgtable used by efi emulator in crashkernel ra= nge */ +static phys_addr_t crash_page_alloc(int unused, void *arg) +{ + struct kimage *kimage =3D (struct kimage *)arg; + int i; + + //skip kimage->segment[].mem + for (i =3D 0; i < kimage->nr_segments; i ++) { + //seg =3D &kimage->segment[i]; + } + //skip any range allocated + return -1; +} + /* Allocates pages for kexec page table */ -static void *kexec_page_alloc(void *arg) +static void *__kexec_page_alloc(void *arg) { struct kimage *kimage =3D arg; struct page *page =3D kimage_alloc_control_pages(kimage, 0); @@ -102,6 +117,82 @@ static void *kexec_page_alloc(void *arg) return vaddr; } =20 +static phys_addr_t kexec_page_alloc(int unused, void *arg) +{ + void *vaddr; + + vaddr =3D __kexec_page_alloc(arg); + if (!vaddr) + return (phys_addr_t)-1; + return virt_to_phys(vaddr); +} + +/* + * This function should be called after all kimage segments have been prof= iled=20 + * Return physical address of page table's root + */ +phys_addr_t arch_emulator_prepare_pgtable(struct kimage *kimage, + struct efi_emulator_param *param) +{ + efi_memory_desc_t *md; + struct kexec_segment *seg; + unsigned long paddr, vaddr, sz; + pgd_t *pgd; + typedef phys_addr_t (* alloc_fn)(int, void *); + alloc_fn alloc; + phys_addr_t pgd_paddr; + int i; + + /* + * Set up pgtable of emulator, either for crash or for reboot. + * All of the segments have been profiled, and kimage_alloc_normal_contro= l_pages() + * will allocate page in safe zone. + * On the other hand, these pages are not in any segment, which means the= y are + * left, not copied. Hence the radix tree laying out on them is not broke= n. + */ + if (kimage->head & IND_DONE) + alloc =3D crash_page_alloc; + else + alloc =3D kexec_page_alloc; + pgd_paddr =3D alloc(0, kimage); + pgd =3D (pgd_t *)phys_to_virt(pgd_paddr); + for (i =3D 0; i < kimage->nr_segments; i ++) { + seg =3D &kimage->segment[i]; + paddr =3D ALIGN_DOWN(seg->mem, PAGE_SIZE); + sz =3D ALIGN(seg->mem - paddr + seg->memsz, PAGE_SIZE); + kexec_dprintk("Set up mapping for phyaddr: 0x%lx, size:0x%lx", paddr, sz= ); + //todo: distinguish executable segment + __create_pgd_mapping_locked(pgd, paddr, paddr, sz, + PAGE_KERNEL_EXEC, alloc, kimage, 0); + } + + /* + * UEFI stub can call EFI runtime service either before or after one-shot + * SetVirtualAddressMap(). That means the mapping for + * EFI_RUNTIME_SERVICES_CODE/_DATA should be set up here. + * And the virtual address range occupied by md must be reserved, + * accordingly, its physical address should not be allocated by kexec + * allocator + */ + for_each_efi_memory_desc(md) { + if (md->attribute & EFI_MEMORY_RUNTIME) { + vaddr =3D md->virt_addr; + paddr =3D md->phys_addr; + sz =3D md->num_pages * EFI_PAGE_SIZE; + kexec_dprintk("Set up mapping for md phyaddr: 0x%lx, virt: 0x%lx, size:= 0x%lx", paddr, vaddr, sz); + __create_pgd_mapping_locked(pgd, paddr, vaddr, sz, + PAGE_KERNEL_EXEC, alloc, kimage, 0); + } + } + + if (param->print_enabled) + __create_pgd_mapping_locked(pgd, param->earlycon_reg_base, + param->earlycon_reg_base, param->earlycon_reg_sz, + pgprot_device(PAGE_KERNEL), alloc, kimage, 0); + + return pgd_paddr; +} + int machine_kexec_post_load(struct kimage *kimage) { int rc; @@ -109,7 +200,7 @@ int machine_kexec_post_load(struct kimage *kimage) void *reloc_code =3D page_to_virt(kimage->control_code_page); long reloc_size; struct trans_pgd_info info =3D { - .trans_alloc_page =3D kexec_page_alloc, + .trans_alloc_page =3D __kexec_page_alloc, .trans_alloc_arg =3D kimage, }; =20 @@ -129,7 +220,7 @@ int machine_kexec_post_load(struct kimage *kimage) } =20 /* Create a copy of the linear map */ - trans_pgd =3D kexec_page_alloc(kimage); + trans_pgd =3D __kexec_page_alloc(kimage); if (!trans_pgd) return -ENOMEM; rc =3D trans_pgd_create_copy(&info, &trans_pgd, PAGE_OFFSET, PAGE_END); @@ -145,6 +236,7 @@ int machine_kexec_post_load(struct kimage *kimage) &kimage->arch.t0sz, reloc_code); if (rc) return rc; + kimage->arch.phys_offset =3D virt_to_phys(kimage) - (long)kimage; =20 /* Flush the reloc_code in preparation for its execution. */ @@ -175,7 +267,6 @@ void machine_kexec(struct kimage *kimage) "Some CPUs may be stale, kdump will be unreliable.\n"); =20 pr_info("Bye!\n"); - local_daif_mask(); =20 /* @@ -192,6 +283,7 @@ void machine_kexec(struct kimage *kimage) =20 cpu_install_idmap(); restart =3D (void *)__pa_symbol(cpu_soft_restart); + /* kimage->start can be either the entry of kernel or efi emulator */ restart(is_hyp_nvhe(), kimage->start, kimage->arch.param_mem, 0, 0); } else { @@ -201,6 +293,7 @@ void machine_kexec(struct kimage *kimage) __hyp_set_vectors(kimage->arch.el2_vectors); cpu_install_ttbr0(kimage->arch.ttbr0, kimage->arch.t0sz); kernel_reloc =3D (void *)kimage->arch.kern_reloc; + //tell between the emulator and normal kernel inside the relocate code=20 kernel_reloc(kimage); } =20 diff --git a/include/linux/kexec.h b/include/linux/kexec.h index 57b98bcaa5228..1599c21e7c5d5 100644 --- a/include/linux/kexec.h +++ b/include/linux/kexec.h @@ -22,6 +22,7 @@ =20 #include #include +#include =20 extern note_buf_t __percpu *crash_notes; =20 @@ -464,7 +465,9 @@ static inline int arch_kexec_post_alloc_pages(void *vad= dr, unsigned int pages, g static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pag= es) { } #endif =20 -extern phys_addr_t arch_emulator_prepare_pgtable(struct kimage *kimage); +extern phys_addr_t arch_emulator_prepare_pgtable(struct kimage *kimage, + struct efi_emulator_param *param); + extern bool kexec_file_dbg_print; =20 #define kexec_dprintk(fmt, arg...) \ --=20 2.41.0