From nobody Tue Dec 16 19:46:41 2025
From: "Kirill A. Shutemov"
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H. Peter Anvin"
Cc: Jonathan Corbet, Andy Lutomirski, Peter Zijlstra, Ard Biesheuvel,
 Jan Kiszka, Kieran Bingham, "Kirill A. Shutemov", Michael Roth,
 Rick Edgecombe, Brijesh Singh, Sandipan Das, Juergen Gross,
 Tom Lendacky, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-efi@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 1/3] x86/64/mm: Always use dynamic memory layout
Date: Fri, 21 Jun 2024 19:44:04 +0300
Message-ID: <20240621164406.256314-2-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>
References: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>

Dynamic memory layout is used by KASLR and 5-level paging.

CONFIG_X86_5LEVEL is going to be removed, making 5-level paging support
unconditional which requires unconditional support of dynamic memory
layout.

Remove CONFIG_DYNAMIC_MEMORY_LAYOUT.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/Kconfig                        | 8 --------
 arch/x86/include/asm/page_64_types.h    | 4 ----
 arch/x86/include/asm/pgtable_64_types.h | 6 ------
 arch/x86/kernel/head64.c                | 2 --
 scripts/gdb/linux/pgtable.py            | 4 +---
 5 files changed, 1 insertion(+), 23 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index e30ea4129d2c..827928680ed6 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1501,7 +1501,6 @@ config X86_PAE
 config X86_5LEVEL
 	bool "Enable 5-level page tables support"
 	default y
-	select DYNAMIC_MEMORY_LAYOUT
 	select SPARSEMEM_VMEMMAP
 	depends on X86_64
 	help
@@ -2237,17 +2236,10 @@ config PHYSICAL_ALIGN
 
 	  Don't change this unless you know what you are doing.
 
-config DYNAMIC_MEMORY_LAYOUT
-	bool
-	help
-	  This option makes base addresses of vmalloc and vmemmap as well as
-	  __PAGE_OFFSET movable during boot.
-
 config RANDOMIZE_MEMORY
 	bool "Randomize the kernel memory sections"
 	depends on X86_64
 	depends on RANDOMIZE_BASE
-	select DYNAMIC_MEMORY_LAYOUT
 	default RANDOMIZE_BASE
 	help
 	  Randomizes the base virtual address of kernel memory sections
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index 06ef25411d62..c2f3c50a2787 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -41,11 +41,7 @@
 #define __PAGE_OFFSET_BASE_L5	_AC(0xff11000000000000, UL)
 #define __PAGE_OFFSET_BASE_L4	_AC(0xffff888000000000, UL)
 
-#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 #define __PAGE_OFFSET           page_offset_base
-#else
-#define __PAGE_OFFSET           __PAGE_OFFSET_BASE_L4
-#endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
 
 #define __START_KERNEL_map	_AC(0xffffffff80000000, UL)
 
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 9053dfe9fa03..09df8939b997 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -130,15 +130,9 @@ extern unsigned int ptrs_per_p4d;
 #define __VMEMMAP_BASE_L4	0xffffea0000000000UL
 #define __VMEMMAP_BASE_L5	0xffd4000000000000UL
 
-#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 # define VMALLOC_START		vmalloc_base
 # define VMALLOC_SIZE_TB	(pgtable_l5_enabled() ? VMALLOC_SIZE_TB_L5 : VMALLOC_SIZE_TB_L4)
 # define VMEMMAP_START		vmemmap_base
-#else
-# define VMALLOC_START		__VMALLOC_BASE_L4
-# define VMALLOC_SIZE_TB	VMALLOC_SIZE_TB_L4
-# define VMEMMAP_START		__VMEMMAP_BASE_L4
-#endif /* CONFIG_DYNAMIC_MEMORY_LAYOUT */
 
 /*
  * End of the region for which vmalloc page tables are pre-allocated.
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index a817ed0724d1..ec36ad7117ae 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -60,14 +60,12 @@ unsigned int ptrs_per_p4d __ro_after_init = 1;
 EXPORT_SYMBOL(ptrs_per_p4d);
 #endif
 
-#ifdef CONFIG_DYNAMIC_MEMORY_LAYOUT
 unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
 EXPORT_SYMBOL(page_offset_base);
 unsigned long vmalloc_base __ro_after_init = __VMALLOC_BASE_L4;
 EXPORT_SYMBOL(vmalloc_base);
 unsigned long vmemmap_base __ro_after_init = __VMEMMAP_BASE_L4;
 EXPORT_SYMBOL(vmemmap_base);
-#endif
 
 static inline bool check_la57_support(void)
 {
diff --git a/scripts/gdb/linux/pgtable.py b/scripts/gdb/linux/pgtable.py
index 30d837f3dfae..09aac2421fb8 100644
--- a/scripts/gdb/linux/pgtable.py
+++ b/scripts/gdb/linux/pgtable.py
@@ -29,11 +29,9 @@ def page_mask(level=1):
         raise Exception(f'Unknown page level: {level}')
 
 
-#page_offset_base in case CONFIG_DYNAMIC_MEMORY_LAYOUT is disabled
-POB_NO_DYNAMIC_MEM_LAYOUT = '0xffff888000000000'
 def _page_offset_base():
     pob_symbol = gdb.lookup_global_symbol('page_offset_base')
-    pob = pob_symbol.name if pob_symbol else POB_NO_DYNAMIC_MEM_LAYOUT
+    pob = pob_symbol.name
     return gdb.parse_and_eval(pob)
 
 
-- 
2.43.0

From nobody Tue Dec 16 19:46:41 2025
From: "Kirill A. Shutemov"
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H. Peter Anvin"
Cc: Jonathan Corbet, Andy Lutomirski, Peter Zijlstra, Ard Biesheuvel,
 Jan Kiszka, Kieran Bingham, "Kirill A. Shutemov", Michael Roth,
 Rick Edgecombe, Brijesh Singh, Sandipan Das, Juergen Gross,
 Tom Lendacky, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-efi@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 2/3] x86/64/mm: Make SPARSEMEM_VMEMMAP the only memory model
Date: Fri, 21 Jun 2024 19:44:05 +0300
Message-ID: <20240621164406.256314-3-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>
References: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>

5-level paging only supports SPARSEMEM_VMEMMAP.

CONFIG_X86_5LEVEL is being phased out, making 5-level paging support
mandatory.

Make CONFIG_SPARSEMEM_VMEMMAP mandatory for x86-64 and eliminate any
associated conditional statements.

Signed-off-by: Kirill A. Shutemov
---
 arch/x86/Kconfig      | 2 +-
 arch/x86/mm/init_64.c | 9 +--------
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 827928680ed6..54ad2462e9ef 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -1501,7 +1501,6 @@ config X86_PAE
 config X86_5LEVEL
 	bool "Enable 5-level page tables support"
 	default y
-	select SPARSEMEM_VMEMMAP
 	depends on X86_64
 	help
 	  5-level paging enables access to larger address space:
@@ -1625,6 +1624,7 @@ config ARCH_SPARSEMEM_ENABLE
 	depends on X86_64 || NUMA || X86_32 || X86_32_NON_STANDARD
 	select SPARSEMEM_STATIC if X86_32
 	select SPARSEMEM_VMEMMAP_ENABLE if X86_64
+	select SPARSEMEM_VMEMMAP if X86_64
 
 config ARCH_SPARSEMEM_DEFAULT
 	def_bool X86_64 || (NUMA && X86_32)
diff --git a/arch/x86/mm/init_64.c b/arch/x86/mm/init_64.c
index 28002cc7a37d..552a11d5829a 100644
--- a/arch/x86/mm/init_64.c
+++ b/arch/x86/mm/init_64.c
@@ -835,7 +835,6 @@ void __init paging_init(void)
 	zone_sizes_init();
 }
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
 #define PAGE_UNUSED 0xFD
 
 /*
@@ -934,7 +933,6 @@ static void __meminit vmemmap_use_new_sub_pmd(unsigned long start, unsigned long
 	if (!IS_ALIGNED(end, PMD_SIZE))
 		unused_pmd_start = end;
 }
-#endif
 
 /*
  * Memory hotplug specific functions
@@ -1133,16 +1131,13 @@ remove_pmd_table(pmd_t *pmd_start, unsigned long addr, unsigned long end,
 				pmd_clear(pmd);
 				spin_unlock(&init_mm.page_table_lock);
 				pages++;
-			}
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
-			else if (vmemmap_pmd_is_unused(addr, next)) {
+			} else if (vmemmap_pmd_is_unused(addr, next)) {
 				free_hugepage_table(pmd_page(*pmd),
 						    altmap);
 				spin_lock(&init_mm.page_table_lock);
 				pmd_clear(pmd);
 				spin_unlock(&init_mm.page_table_lock);
 			}
-#endif
 			continue;
 		}
 
@@ -1490,7 +1485,6 @@ unsigned long memory_block_size_bytes(void)
 	return memory_block_size_probed;
 }
 
-#ifdef CONFIG_SPARSEMEM_VMEMMAP
 /*
  * Initialise the sparsemem vmemmap using huge-pages at the PMD level.
  */
@@ -1639,4 +1633,3 @@ void __meminit vmemmap_populate_print_last(void)
 		node_start = 0;
 	}
 }
-#endif
-- 
2.43.0

From nobody Tue Dec 16 19:46:41 2025
From: "Kirill A. Shutemov"
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
 x86@kernel.org, "H. Peter Anvin"
Cc: Jonathan Corbet, Andy Lutomirski, Peter Zijlstra, Ard Biesheuvel,
 Jan Kiszka, Kieran Bingham, "Kirill A. Shutemov", Michael Roth,
 Rick Edgecombe, Brijesh Singh, Sandipan Das, Juergen Gross,
 Tom Lendacky, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org,
 linux-efi@vger.kernel.org, linux-mm@kvack.org
Subject: [PATCH 3/3] x86/64/mm: Make 5-level paging support unconditional
Date: Fri, 21 Jun 2024 19:44:06 +0300
Message-ID: <20240621164406.256314-4-kirill.shutemov@linux.intel.com>
X-Mailer: git-send-email 2.43.0
In-Reply-To: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>
References: <20240621164406.256314-1-kirill.shutemov@linux.intel.com>

Both Intel and AMD CPUs support 5-level paging, which is expected to
become more widely adopted in the future.

Remove CONFIG_X86_5LEVEL and ifdeffery for it to make it more readable.

Signed-off-by: Kirill A. Shutemov
Suggested-by: Borislav Petkov
---
 Documentation/arch/x86/cpuinfo.rst            |  8 +++----
 .../arch/x86/x86_64/5level-paging.rst         |  9 --------
 arch/x86/Kconfig                              | 22 +------------------
 arch/x86/boot/compressed/pgtable_64.c         | 11 ++--------
 arch/x86/boot/header.S                        |  4 ----
 arch/x86/include/asm/disabled-features.h      |  9 +-------
 arch/x86/include/asm/page_64.h                |  2 --
 arch/x86/include/asm/page_64_types.h          |  7 ------
 arch/x86/include/asm/pgtable_64_types.h       | 18 ---------------
 arch/x86/kernel/alternative.c                 |  2 +-
 arch/x86/kernel/head64.c                      |  5 -----
 arch/x86/kernel/head_64.S                     |  2 --
 arch/x86/mm/init.c                            |  4 ----
 arch/x86/mm/pgtable.c                         |  2 --
 drivers/firmware/efi/libstub/x86-5lvl.c       |  2 +-
 .../arch/x86/include/asm/disabled-features.h  |  9 +-------
 16 files changed, 10 insertions(+), 106 deletions(-)

diff --git a/Documentation/arch/x86/cpuinfo.rst b/Documentation/arch/x86/cpuinfo.rst
index 8895784d4784..0ea70924c89e 100644
--- a/Documentation/arch/x86/cpuinfo.rst
+++ b/Documentation/arch/x86/cpuinfo.rst
@@ -171,10 +171,10 @@ For example, when an old kernel is running on new hardware.
 
 c: The kernel disabled support for it at compile-time.
 ------------------------------------------------------
-For example, if 5-level-paging is not enabled when building (i.e.,
-CONFIG_X86_5LEVEL is not selected) the flag "la57" will not show up [#f1]_.
+For example, if Linear Address Masking (LAM) is not enabled when building (i.e.,
+CONFIG_ADDRESS_MASKING is not selected) the flag "lam" will not show up.
 Even though the feature will still be detected via CPUID, the kernel disables
-it by clearing via setup_clear_cpu_cap(X86_FEATURE_LA57).
+it by clearing via setup_clear_cpu_cap(X86_FEATURE_LAM).
 
 d: The feature is disabled at boot-time.
 ----------------------------------------
@@ -197,5 +197,3 @@ missing at runtime. For example, AVX flags will not show up if XSAVE feature
 is disabled since they depend on XSAVE feature. Another example would be broken
 CPUs and them missing microcode patches. Due to that, the kernel decides not to
 enable a feature.
-
-.. [#f1] 5-level paging uses linear address of 57 bits.
diff --git a/Documentation/arch/x86/x86_64/5level-paging.rst b/Documentation/arch/x86/x86_64/5level-paging.rst
index 71f882f4a173..ad7ddc13f79d 100644
--- a/Documentation/arch/x86/x86_64/5level-paging.rst
+++ b/Documentation/arch/x86/x86_64/5level-paging.rst
@@ -22,15 +22,6 @@ QEMU 2.9 and later support 5-level paging.
 Virtual memory layout for 5-level paging is described in
 Documentation/arch/x86/x86_64/mm.rst
 
-
-Enabling 5-level paging
-=======================
-CONFIG_X86_5LEVEL=y enables the feature.
-
-Kernel with CONFIG_X86_5LEVEL=y still able to boot on 4-level hardware.
-In this case additional page table level -- p4d -- will be folded at
-runtime.
-
 User-space and large virtual address space
 ==========================================
 On x86, 5-level paging enables 56-bit userspace virtual address space.
diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index 54ad2462e9ef..f95a5048ad09 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -408,8 +408,7 @@ config DYNAMIC_PHYSICAL_MASK
 
 config PGTABLE_LEVELS
 	int
-	default 5 if X86_5LEVEL
-	default 4 if X86_64
+	default 5 if X86_64
 	default 3 if X86_PAE
 	default 2
 
@@ -1498,25 +1497,6 @@ config X86_PAE
 	  has the cost of more pagetable lookup overhead, and also
 	  consumes more pagetable space per process.
 
-config X86_5LEVEL
-	bool "Enable 5-level page tables support"
-	default y
-	depends on X86_64
-	help
-	  5-level paging enables access to larger address space:
-	  up to 128 PiB of virtual address space and 4 PiB of
-	  physical address space.
-
-	  It will be supported by future Intel CPUs.
-
-	  A kernel with the option enabled can be booted on machines that
-	  support 4- or 5-level paging.
-
-	  See Documentation/arch/x86/x86_64/5level-paging.rst for more
-	  information.
-
-	  Say N if unsure.
-
 config X86_DIRECT_GBPAGES
 	def_bool y
 	depends on X86_64
diff --git a/arch/x86/boot/compressed/pgtable_64.c b/arch/x86/boot/compressed/pgtable_64.c
index c882e1f67af0..61b9ca61bde1 100644
--- a/arch/x86/boot/compressed/pgtable_64.c
+++ b/arch/x86/boot/compressed/pgtable_64.c
@@ -10,12 +10,10 @@
 #define BIOS_START_MIN		0x20000U	/* 128K, less than this is insane */
 #define BIOS_START_MAX		0x9f000U	/* 640K, absolute maximum */
 
-#ifdef CONFIG_X86_5LEVEL
 /* __pgtable_l5_enabled needs to be in .data to avoid being cleared along with .bss */
 unsigned int __section(".data") __pgtable_l5_enabled;
 unsigned int __section(".data") pgdir_shift = 39;
 unsigned int __section(".data") ptrs_per_p4d = 1;
-#endif
 
 /* Buffer to preserve trampoline memory */
 static char trampoline_save[TRAMPOLINE_32BIT_SIZE];
@@ -113,18 +111,13 @@ asmlinkage void configure_5level_paging(struct boot_params *bp, void *pgtable)
 	 * Check if LA57 is desired and supported.
 	 *
 	 * There are several parts to the check:
-	 *   - if the kernel supports 5-level paging: CONFIG_X86_5LEVEL=y
 	 *   - if user asked to disable 5-level paging: no5lvl in cmdline
 	 *   - if the machine supports 5-level paging:
 	 *     + CPUID leaf 7 is supported
 	 *     + the leaf has the feature bit set
-	 *
-	 * That's substitute for boot_cpu_has() in early boot code.
 	 */
-	if (IS_ENABLED(CONFIG_X86_5LEVEL) &&
-		!cmdline_find_option_bool("no5lvl") &&
-		native_cpuid_eax(0) >= 7 &&
-		(native_cpuid_ecx(7) & (1 << (X86_FEATURE_LA57 & 31)))) {
+	if (!cmdline_find_option_bool("no5lvl") &&
+	    native_cpuid_eax(0) >= 7 && (native_cpuid_ecx(7) & BIT(16))) {
 		l5_required = true;
 
 		/* Initialize variables for 5-level paging */
diff --git a/arch/x86/boot/header.S b/arch/x86/boot/header.S
index b5c79f43359b..32361cef909e 100644
--- a/arch/x86/boot/header.S
+++ b/arch/x86/boot/header.S
@@ -361,12 +361,8 @@ xloadflags:
 #endif
 
 #ifdef CONFIG_X86_64
-#ifdef CONFIG_X86_5LEVEL
 #define XLF56 (XLF_5LEVEL|XLF_5LEVEL_ENABLED)
 #else
-#define XLF56 XLF_5LEVEL
-#endif
-#else
 #define XLF56 0
 #endif
 
diff --git a/arch/x86/include/asm/disabled-features.h b/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..19cf1678fcaa 100644
--- a/arch/x86/include/asm/disabled-features.h
+++ b/arch/x86/include/asm/disabled-features.h
@@ -38,12 +38,6 @@
 # define DISABLE_OSPKE		(1<<(X86_FEATURE_OSPKE & 31))
 #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
 
-#ifdef CONFIG_X86_5LEVEL
-# define DISABLE_LA57	0
-#else
-# define DISABLE_LA57	(1<<(X86_FEATURE_LA57 & 31))
-#endif
-
 #ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
 # define DISABLE_PTI		0
 #else
@@ -149,8 +143,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
-			 DISABLE_ENQCMD)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_UMIP|DISABLE_ENQCMD)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	(DISABLE_IBT)
 #define DISABLED_MASK19	(DISABLE_SEV_SNP)
diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h
index cc6b8e087192..3b8cb6a8b122 100644
--- a/arch/x86/include/asm/page_64.h
+++ b/arch/x86/include/asm/page_64.h
@@ -60,7 +60,6 @@ static inline void clear_page(void *page)
 
 void copy_page(void *to, void *from);
 
-#ifdef CONFIG_X86_5LEVEL
 /*
  * User space process size. This is the first address outside the user range.
  * There are a few constraints that determine this:
@@ -91,7 +90,6 @@ static __always_inline unsigned long task_size_max(void)
 
 	return ret;
 }
-#endif	/* CONFIG_X86_5LEVEL */
 
 #endif	/* !__ASSEMBLY__ */
 
diff --git a/arch/x86/include/asm/page_64_types.h b/arch/x86/include/asm/page_64_types.h
index c2f3c50a2787..666a5d6ab910 100644
--- a/arch/x86/include/asm/page_64_types.h
+++ b/arch/x86/include/asm/page_64_types.h
@@ -48,14 +48,7 @@
 /* See Documentation/arch/x86/x86_64/mm.rst for a description of the memory map. */
 
 #define __PHYSICAL_MASK_SHIFT	52
-
-#ifdef CONFIG_X86_5LEVEL
 #define __VIRTUAL_MASK_SHIFT	(pgtable_l5_enabled() ? 56 : 47)
-/* See task_size_max() in */
-#else
-#define __VIRTUAL_MASK_SHIFT	47
-#define task_size_max()		((_AC(1,UL) << __VIRTUAL_MASK_SHIFT) - PAGE_SIZE)
-#endif
 
 #define TASK_SIZE_MAX		task_size_max()
 #define DEFAULT_MAP_WINDOW	((1UL << 47) - PAGE_SIZE)
diff --git a/arch/x86/include/asm/pgtable_64_types.h b/arch/x86/include/asm/pgtable_64_types.h
index 09df8939b997..2c77489ac86c 100644
--- a/arch/x86/include/asm/pgtable_64_types.h
+++ b/arch/x86/include/asm/pgtable_64_types.h
@@ -23,7 +23,6 @@ typedef struct { pmdval_t pmd; } pmd_t;
 
 extern unsigned int __pgtable_l5_enabled;
 
-#ifdef CONFIG_X86_5LEVEL
 #ifdef USE_EARLY_PGTABLE_L5
 /*
  * cpu_feature_enabled() is not available in early boot code.
@@ -37,10 +36,6 @@ static inline bool pgtable_l5_enabled(void)
 #define pgtable_l5_enabled() cpu_feature_enabled(X86_FEATURE_LA57)
 #endif /* USE_EARLY_PGTABLE_L5 */
 
-#else
-#define pgtable_l5_enabled() 0
-#endif /* CONFIG_X86_5LEVEL */
-
 extern unsigned int pgdir_shift;
 extern unsigned int ptrs_per_p4d;
 
@@ -48,8 +43,6 @@ extern unsigned int ptrs_per_p4d;
 
 #define SHARED_KERNEL_PMD	0
 
-#ifdef CONFIG_X86_5LEVEL
-
 /*
  * PGDIR_SHIFT determines what a top-level page table entry can map
  */
@@ -67,17 +60,6 @@ extern unsigned int ptrs_per_p4d;
 
 #define MAX_POSSIBLE_PHYSMEM_BITS	52
 
-#else /* CONFIG_X86_5LEVEL */
-
-/*
- * PGDIR_SHIFT determines what a top-level page table entry can map
- */
-#define PGDIR_SHIFT		39
-#define PTRS_PER_PGD		512
-#define MAX_PTRS_PER_P4D	1
-
-#endif /* CONFIG_X86_5LEVEL */
-
 /*
  * 3rd level page
  */
diff --git a/arch/x86/kernel/alternative.c b/arch/x86/kernel/alternative.c
index 37596a417094..f1c519abb925 100644
--- a/arch/x86/kernel/alternative.c
+++ b/arch/x86/kernel/alternative.c
@@ -457,7 +457,7 @@ void __init_or_module noinline apply_alternatives(struct alt_instr *start,
 	DPRINTK(ALT, "alt table %px, -> %px", start, end);
 
 	/*
-	 * In the case CONFIG_X86_5LEVEL=y, KASAN_SHADOW_START is defined using
+	 * KASAN_SHADOW_START is defined using
 	 * cpu_feature_enabled(X86_FEATURE_LA57) and is therefore patched here.
 	 * During the process, KASAN becomes confused seeing partial LA57
 	 * conversion and triggers a false-positive out-of-bound report.
diff --git a/arch/x86/kernel/head64.c b/arch/x86/kernel/head64.c
index ec36ad7117ae..ec3a7e2ea222 100644
--- a/arch/x86/kernel/head64.c
+++ b/arch/x86/kernel/head64.c
@@ -52,13 +52,11 @@ extern pmd_t early_dynamic_pgts[EARLY_DYNAMIC_PAGE_TABLES][PTRS_PER_PMD];
 static unsigned int __initdata next_early_pgt;
 pmdval_t early_pmd_flags = __PAGE_KERNEL_LARGE & ~(_PAGE_GLOBAL | _PAGE_NX);
 
-#ifdef CONFIG_X86_5LEVEL
 unsigned int __pgtable_l5_enabled __ro_after_init;
 unsigned int pgdir_shift __ro_after_init = 39;
 EXPORT_SYMBOL(pgdir_shift);
 unsigned int ptrs_per_p4d __ro_after_init = 1;
 EXPORT_SYMBOL(ptrs_per_p4d);
-#endif
 
 unsigned long page_offset_base __ro_after_init = __PAGE_OFFSET_BASE_L4;
 EXPORT_SYMBOL(page_offset_base);
@@ -69,9 +67,6 @@ EXPORT_SYMBOL(vmemmap_base);
 
 static inline bool check_la57_support(void)
 {
-	if (!IS_ENABLED(CONFIG_X86_5LEVEL))
-		return false;
-
 	/*
 	 * 5-level paging is detected and enabled at kernel decompression
 	 * stage. Only check if it has been enabled there.
diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 330922b328bf..4b2b2138c163 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -659,12 +659,10 @@ SYM_DATA_START_PTI_ALIGNED(init_top_pgt)
 SYM_DATA_END(init_top_pgt)
 #endif
 
-#ifdef CONFIG_X86_5LEVEL
 SYM_DATA_START_PAGE_ALIGNED(level4_kernel_pgt)
 	.fill	511,8,0
 	.quad	level3_kernel_pgt - __START_KERNEL_map + _PAGE_TABLE_NOENC
 SYM_DATA_END(level4_kernel_pgt)
-#endif
 
 SYM_DATA_START_PAGE_ALIGNED(level3_kernel_pgt)
 	.fill	L3_START_KERNEL,8,0
diff --git a/arch/x86/mm/init.c b/arch/x86/mm/init.c
index eb503f53c319..5a980a452f4c 100644
--- a/arch/x86/mm/init.c
+++ b/arch/x86/mm/init.c
@@ -173,11 +173,7 @@ __ref void *alloc_low_pages(unsigned int num)
  * randomization is enabled.
  */
 
-#ifndef CONFIG_X86_5LEVEL
-#define INIT_PGD_PAGE_TABLES    3
-#else
 #define INIT_PGD_PAGE_TABLES    4
-#endif
 
 #ifndef CONFIG_RANDOMIZE_MEMORY
 #define INIT_PGD_PAGE_COUNT      (2 * INIT_PGD_PAGE_TABLES)
diff --git a/arch/x86/mm/pgtable.c b/arch/x86/mm/pgtable.c
index 93e54ba91fbf..982775ef8b34 100644
--- a/arch/x86/mm/pgtable.c
+++ b/arch/x86/mm/pgtable.c
@@ -691,7 +691,6 @@ void native_set_fixmap(unsigned /* enum fixed_addresses */ idx,
 }
 
 #ifdef CONFIG_HAVE_ARCH_HUGE_VMAP
-#ifdef CONFIG_X86_5LEVEL
 /**
  * p4d_set_huge - setup kernel P4D mapping
  *
@@ -710,7 +709,6 @@ int p4d_set_huge(p4d_t *p4d, phys_addr_t addr, pgprot_t prot)
 void p4d_clear_huge(p4d_t *p4d)
 {
 }
-#endif
 
 /**
  * pud_set_huge - setup kernel PUD mapping
diff --git a/drivers/firmware/efi/libstub/x86-5lvl.c b/drivers/firmware/efi/libstub/x86-5lvl.c
index 77359e802181..f1c5fb45d5f7 100644
--- a/drivers/firmware/efi/libstub/x86-5lvl.c
+++ b/drivers/firmware/efi/libstub/x86-5lvl.c
@@ -62,7 +62,7 @@ efi_status_t efi_setup_5level_paging(void)
 
 void efi_5level_switch(void)
 {
-	bool want_la57 = IS_ENABLED(CONFIG_X86_5LEVEL) && !efi_no5lvl;
+	bool want_la57 = !efi_no5lvl;
 	bool have_la57 = native_read_cr4() & X86_CR4_LA57;
 	bool need_toggle = want_la57 ^ have_la57;
 	u64 *pgt = (void *)la57_toggle + PAGE_SIZE;
diff --git a/tools/arch/x86/include/asm/disabled-features.h b/tools/arch/x86/include/asm/disabled-features.h
index c492bdc97b05..19cf1678fcaa 100644
--- a/tools/arch/x86/include/asm/disabled-features.h
+++ b/tools/arch/x86/include/asm/disabled-features.h
@@ -38,12 +38,6 @@
 # define DISABLE_OSPKE		(1<<(X86_FEATURE_OSPKE & 31))
 #endif /* CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS */
 
-#ifdef CONFIG_X86_5LEVEL
-# define DISABLE_LA57	0
-#else
-# define DISABLE_LA57	(1<<(X86_FEATURE_LA57 & 31))
-#endif
-
 #ifdef CONFIG_MITIGATION_PAGE_TABLE_ISOLATION
 # define DISABLE_PTI		0
 #else
@@ -149,8 +143,7 @@
 #define DISABLED_MASK13	0
 #define DISABLED_MASK14	0
 #define DISABLED_MASK15	0
-#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_LA57|DISABLE_UMIP| \
-			 DISABLE_ENQCMD)
+#define DISABLED_MASK16	(DISABLE_PKU|DISABLE_OSPKE|DISABLE_UMIP|DISABLE_ENQCMD)
 #define DISABLED_MASK17	0
 #define DISABLED_MASK18	(DISABLE_IBT)
 #define DISABLED_MASK19	(DISABLE_SEV_SNP)
-- 
2.43.0
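
Reviewer aid, not part of the series: a minimal Python sketch of the arithmetic behind two changes in patch 3, written in Python only because the series already touches scripts/gdb/linux/pgtable.py. It is not kernel code. It assumes X86_FEATURE_LA57 is encoded as word 16, bit 16 (16*32 + 16), which would explain why the boot code can replace (1 << (X86_FEATURE_LA57 & 31)) with BIT(16), and it assumes a 4 KiB PAGE_SIZE. The helper names la57_cpuid_mask(), virtual_mask_shift() and task_size_max_bytes() are hypothetical and exist only for illustration.

```python
# Illustrative sketch only; constants are assumptions stated in the lead-in.
X86_FEATURE_LA57 = 16 * 32 + 16  # assumed encoding: feature word 16, bit 16
PAGE_SIZE = 4096                 # assumed 4 KiB base page size


def la57_cpuid_mask(feature=X86_FEATURE_LA57):
    """Mask tested against CPUID leaf 7 ECX; mirrors 1 << (feature & 31)."""
    return 1 << (feature & 31)


def virtual_mask_shift(l5_enabled):
    """Mirrors __VIRTUAL_MASK_SHIFT: 56 with 5-level paging, else 47."""
    return 56 if l5_enabled else 47


def task_size_max_bytes(l5_enabled):
    """First address outside the user range, one page below 2**shift."""
    return (1 << virtual_mask_shift(l5_enabled)) - PAGE_SIZE


if __name__ == "__main__":
    print("LA57 mask:", hex(la57_cpuid_mask()))
    for enabled in (False, True):
        print("l5 enabled:", enabled,
              "task_size_max:", hex(task_size_max_bytes(enabled)))
```

With the assumed encoding, la57_cpuid_mask() equals 1 << 16, so the open-coded BIT(16) in configure_5level_paging() tests the same CPUID bit as the expression it replaces.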