From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, H. J. Lu, Andi Kleen, Rick Edgecombe, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFCv2 01/10] x86/mm: Fix CR3_ADDR_MASK
Date: Wed, 11 May 2022 05:27:42 +0300
Message-Id: <20220511022751.65540-3-kirill.shutemov@linux.intel.com>

The mask must not include bits above the physical address mask. These
bits are reserved and can be used for other things. Bits 61 and 62 are
used for Linear Address Masking.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/processor-flags.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h
index 02c2cbda4a74..a7f3d9100adb 100644
--- a/arch/x86/include/asm/processor-flags.h
+++ b/arch/x86/include/asm/processor-flags.h
@@ -35,7 +35,7 @@
  */
 #ifdef CONFIG_X86_64
 /* Mask off the address space ID and SME encryption bits. */
-#define CR3_ADDR_MASK	__sme_clr(0x7FFFFFFFFFFFF000ull)
+#define CR3_ADDR_MASK	__sme_clr(PHYSICAL_PAGE_MASK)
 #define CR3_PCID_MASK	0xFFFull
 #define CR3_NOFLUSH	BIT_ULL(63)

-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 02/10] x86: CPUID and CR3/CR4 flags for Linear Address Masking
Date: Wed, 11 May 2022 05:27:43 +0300
Message-Id: <20220511022751.65540-4-kirill.shutemov@linux.intel.com>

Enumerate Linear Address Masking and provide defines for CR3 and CR4
flags.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/cpufeatures.h          | 1 +
 arch/x86/include/uapi/asm/processor-flags.h | 6 ++++++
 2 files changed, 7 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 73e643ae94b6..d443d1ba231a 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -299,6 +299,7 @@
 /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */
 #define X86_FEATURE_AVX_VNNI	(12*32+ 4) /* AVX VNNI instructions */
 #define X86_FEATURE_AVX512_BF16	(12*32+ 5) /* AVX512 BFLOAT16 instructions */
+#define X86_FEATURE_LAM		(12*32+26) /* Linear Address Masking */

 /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */
 #define X86_FEATURE_CLZERO	(13*32+ 0) /* CLZERO instruction */
diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h
index c47cc7f2feeb..d898432947ff 100644
--- a/arch/x86/include/uapi/asm/processor-flags.h
+++ b/arch/x86/include/uapi/asm/processor-flags.h
@@ -82,6 +82,10 @@
 #define X86_CR3_PCID_BITS	12
 #define X86_CR3_PCID_MASK	(_AC((1UL << X86_CR3_PCID_BITS) - 1, UL))

+#define X86_CR3_LAM_U57_BIT	61 /* Activate LAM for userspace, 62:57 bits masked */
+#define X86_CR3_LAM_U57		_BITULL(X86_CR3_LAM_U57_BIT)
+#define X86_CR3_LAM_U48_BIT	62 /* Activate LAM for userspace, 62:48 bits masked */
+#define X86_CR3_LAM_U48		_BITULL(X86_CR3_LAM_U48_BIT)
 #define X86_CR3_PCID_NOFLUSH_BIT 63 /* Preserve old PCID */
 #define X86_CR3_PCID_NOFLUSH	_BITULL(X86_CR3_PCID_NOFLUSH_BIT)

@@ -132,6 +136,8 @@
 #define X86_CR4_PKE		_BITUL(X86_CR4_PKE_BIT)
 #define X86_CR4_CET_BIT		23 /* enable Control-flow Enforcement Technology */
 #define X86_CR4_CET		_BITUL(X86_CR4_CET_BIT)
+#define X86_CR4_LAM_SUP_BIT	28 /* LAM for supervisor pointers */
+#define X86_CR4_LAM_SUP		_BITUL(X86_CR4_LAM_SUP_BIT)

 /*
  * x86-64 Task Priority Register, CR8
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 03/10] x86: Introduce userspace API to handle per-thread features
Date: Wed, 11 May 2022 05:27:44 +0300
Message-Id: <20220511022751.65540-5-kirill.shutemov@linux.intel.com>

Add three new arch_prctl() handles:

 - ARCH_THREAD_FEATURE_ENABLE/DISABLE enables or disables the specified
   features. Returns the features that are enabled after the operation.

 - ARCH_THREAD_FEATURE_LOCK prevents future disabling or enabling of the
   specified features. Returns the new set of locked features.

The features are handled per-thread and inherited over fork(2)/clone(2),
but reset on exec().

This is a preparation patch. It does not implement any features.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/processor.h  |  3 +++
 arch/x86/include/uapi/asm/prctl.h |  5 +++++
 arch/x86/kernel/process.c         | 37 +++++++++++++++++++++++++++++++
 3 files changed, 45 insertions(+)

diff --git a/arch/x86/include/asm/processor.h b/arch/x86/include/asm/processor.h
index 91d0f93a00c7..ff0c34e18cc6 100644
--- a/arch/x86/include/asm/processor.h
+++ b/arch/x86/include/asm/processor.h
@@ -530,6 +530,9 @@ struct thread_struct {
	 */
	u32			pkru;

+	unsigned long		features;
+	unsigned long		features_locked;
+
	/* Floating point and extended processor state */
	struct fpu		fpu;
	/*
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 500b96e71f18..67fc30d36c73 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -20,4 +20,9 @@
 #define ARCH_MAP_VDSO_32	0x2002
 #define ARCH_MAP_VDSO_64	0x2003

+/* Never implement 0x3001, it will confuse old glibc's */
+#define ARCH_THREAD_FEATURE_ENABLE	0x3002
+#define ARCH_THREAD_FEATURE_DISABLE	0x3003
+#define ARCH_THREAD_FEATURE_LOCK	0x3004
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index b370767f5b19..cb8fc28f2eae 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -367,6 +367,10 @@ void arch_setup_new_exec(void)
		task_clear_spec_ssb_noexec(current);
		speculation_ctrl_update(read_thread_flags());
	}
+
+	/* Reset thread features on exec */
+	current->thread.features = 0;
+	current->thread.features_locked = 0;
 }

 #ifdef CONFIG_X86_IOPL_IOPERM
@@ -985,6 +989,35 @@ unsigned long __get_wchan(struct task_struct *p)
	return addr;
 }

+static long thread_feature_prctl(struct task_struct *task, int option,
+				 unsigned long features)
+{
+	const unsigned long known_features = 0;
+
+	if (features & ~known_features)
+		return -EINVAL;
+
+	if (option == ARCH_THREAD_FEATURE_LOCK) {
+		task->thread.features_locked |= features;
+		return task->thread.features_locked;
+	}
+
+	/* Do not allow to change locked features */
+	if (features & task->thread.features_locked)
+		return -EPERM;
+
+	if (option == ARCH_THREAD_FEATURE_DISABLE) {
+		task->thread.features &= ~features;
+		goto out;
+	}
+
+	/* Handle ARCH_THREAD_FEATURE_ENABLE */
+	task->thread.features |= features;
+out:
+	return task->thread.features;
+}
+
 long do_arch_prctl_common(struct task_struct *task, int option,
			  unsigned long arg2)
 {
@@ -999,6 +1032,10 @@ long do_arch_prctl_common(struct task_struct *task, int option,
	case ARCH_GET_XCOMP_GUEST_PERM:
	case ARCH_REQ_XCOMP_GUEST_PERM:
		return fpu_xstate_prctl(task, option, arg2);
+	case ARCH_THREAD_FEATURE_ENABLE:
+	case ARCH_THREAD_FEATURE_DISABLE:
+	case ARCH_THREAD_FEATURE_LOCK:
+		return thread_feature_prctl(task, option, arg2);
	}

	return -EINVAL;
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 04/10] x86/mm: Introduce X86_THREAD_LAM_U48 and X86_THREAD_LAM_U57
Date: Wed, 11 May 2022 05:27:45 +0300
Message-Id: <20220511022751.65540-6-kirill.shutemov@linux.intel.com>

Linear Address Masking mode for userspace pointers is encoded in CR3
bits. The mode is selected per-thread.

Add new thread features that indicate that the thread has Linear
Address Masking enabled. switch_mm_irqs_off() now respects these flags
and constructs CR3 accordingly.

The active LAM mode gets recorded in the tlb_state.

The thread features are not yet exposed via a userspace API.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/tlbflush.h   |  5 ++
 arch/x86/include/uapi/asm/prctl.h |  3 +
 arch/x86/mm/tlb.c                 | 95 ++++++++++++++++++++++++++-----
 3 files changed, 88 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index 98fa0a114074..77cae8623858 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -17,6 +17,10 @@ void __flush_tlb_all(void);

 #define TLB_FLUSH_ALL	-1UL

+#define LAM_NONE	0
+#define LAM_U57	1
+#define LAM_U48	2
+
 void cr4_update_irqsoff(unsigned long set, unsigned long clear);
 unsigned long cr4_read_shadow(void);

@@ -88,6 +92,7 @@ struct tlb_state {

	u16 loaded_mm_asid;
	u16 next_asid;
+	u8 lam;

	/*
	 * If set we changed the page tables in such a way that we
diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h
index 67fc30d36c73..2dd16472d078 100644
--- a/arch/x86/include/uapi/asm/prctl.h
+++ b/arch/x86/include/uapi/asm/prctl.h
@@ -25,4 +25,7 @@
 #define ARCH_THREAD_FEATURE_DISABLE	0x3003
 #define ARCH_THREAD_FEATURE_LOCK	0x3004

+#define X86_THREAD_LAM_U48		0x1
+#define X86_THREAD_LAM_U57		0x2
+
 #endif /* _ASM_X86_PRCTL_H */
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 6eb4d91d5365..f9fe71d1f42c 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include

 #include "mm_internal.h"

@@ -154,17 +155,72 @@ static inline u16 user_pcid(u16 asid)
	return ret;
 }

-static inline unsigned long build_cr3(pgd_t *pgd, u16 asid)
+#ifdef CONFIG_X86_64
+static inline unsigned long lam_to_cr3(u8 lam)
+{
+	switch (lam) {
+	case LAM_NONE:
+		return 0;
+	case LAM_U57:
+		return X86_CR3_LAM_U57;
+	case LAM_U48:
+		return X86_CR3_LAM_U48;
+	default:
+		WARN_ON_ONCE(1);
+		return 0;
+	}
+}
+
+static inline u8 cr3_to_lam(unsigned long cr3)
+{
+	if (cr3 & X86_CR3_LAM_U57)
+		return LAM_U57;
+	if (cr3 & X86_CR3_LAM_U48)
+		return LAM_U48;
+	return 0;
+}
+
+static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm)
+{
+	if (!tsk)
+		return LAM_NONE;
+
+	if (tsk->thread.features & X86_THREAD_LAM_U57)
+		return LAM_U57;
+	if (tsk->thread.features & X86_THREAD_LAM_U48)
+		return LAM_U48;
+	return LAM_NONE;
+}
+
+#else
+
+static inline unsigned long lam_to_cr3(u8 lam)
+{
+	return 0;
+}
+
+static inline u8 cr3_to_lam(unsigned long cr3)
+{
+	return LAM_NONE;
+}
+
+static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm)
+{
+	return LAM_NONE;
+}
+#endif
+
+static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, u8 lam)
 {
	if (static_cpu_has(X86_FEATURE_PCID)) {
-		return __sme_pa(pgd) | kern_pcid(asid);
+		return __sme_pa(pgd) | kern_pcid(asid) | lam_to_cr3(lam);
	} else {
		VM_WARN_ON_ONCE(asid != 0);
-		return __sme_pa(pgd);
+		return __sme_pa(pgd) | lam_to_cr3(lam);
	}
 }

-static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid)
+static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid, u8 lam)
 {
	VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE);
	/*
@@ -173,7 +229,7 @@ static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid)
	 * boot because all CPU's the have same capabilities:
	 */
	VM_WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_PCID));
-	return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH;
+	return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH | lam_to_cr3(lam);
 }

 /*
@@ -274,15 +330,15 @@ static inline void invalidate_user_asid(u16 asid)
		(unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask));
 }

-static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush)
+static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, u8 lam, bool need_flush)
 {
	unsigned long new_mm_cr3;

	if (need_flush) {
		invalidate_user_asid(new_asid);
-		new_mm_cr3 = build_cr3(pgdir, new_asid);
+		new_mm_cr3 = build_cr3(pgdir, new_asid, lam);
	} else {
-		new_mm_cr3 = build_cr3_noflush(pgdir, new_asid);
+		new_mm_cr3 = build_cr3_noflush(pgdir, new_asid, lam);
	}

	/*
@@ -491,6 +547,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
 {
	struct mm_struct *real_prev = this_cpu_read(cpu_tlbstate.loaded_mm);
	u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
+	u8 prev_lam = this_cpu_read(cpu_tlbstate.lam);
+	u8 new_lam = gen_lam(tsk, next);
	bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy);
	unsigned cpu = smp_processor_id();
	u64 next_tlb_gen;
@@ -504,6 +562,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
	 * cpu_tlbstate.loaded_mm) matches next.
	 *
	 * NB: leave_mm() calls us with prev == NULL and tsk == NULL.
+	 *
+	 * NB: Initial LAM enabling calls us with prev == next. We must update
+	 * CR3 if prev_lam doesn't match the new one.
	 */

	/* We don't want flush_tlb_func() to run concurrently with us. */
@@ -520,7 +581,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
	 * isn't free.
	 */
 #ifdef CONFIG_DEBUG_VM
-	if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) {
+	if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid, prev_lam))) {
		/*
		 * If we were to BUG here, we'd be very likely to kill
		 * the system so hard that we don't see the call trace.
@@ -551,7 +612,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
	 * provides that full memory barrier and core serializing
	 * instruction.
	 */
-	if (real_prev == next) {
+	if (real_prev == next && prev_lam == new_lam) {
		VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) !=
			   next->context.ctx_id);

@@ -622,15 +683,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
		barrier();
	}

+	this_cpu_write(cpu_tlbstate.lam, new_lam);
	if (need_flush) {
		this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id);
		this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen);
-		load_new_mm_cr3(next->pgd, new_asid, true);
+		load_new_mm_cr3(next->pgd, new_asid, new_lam, true);

		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
	} else {
		/* The new ASID is already up to date. */
-		load_new_mm_cr3(next->pgd, new_asid, false);
+		load_new_mm_cr3(next->pgd, new_asid, new_lam, false);

		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0);
	}
@@ -687,6 +749,7 @@ void initialize_tlbstate_and_flush(void)
	struct mm_struct *mm = this_cpu_read(cpu_tlbstate.loaded_mm);
	u64 tlb_gen = atomic64_read(&init_mm.context.tlb_gen);
	unsigned long cr3 = __read_cr3();
+	u8 lam = cr3_to_lam(cr3);

	/* Assert that CR3 already references the right mm. */
	WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd));
@@ -700,7 +763,7 @@ void initialize_tlbstate_and_flush(void)
		!(cr4_read_shadow() & X86_CR4_PCIDE));

	/* Force ASID 0 and force a TLB flush. */
-	write_cr3(build_cr3(mm->pgd, 0));
+	write_cr3(build_cr3(mm->pgd, 0, lam));

	/* Reinitialize tlbstate. */
	this_cpu_write(cpu_tlbstate.last_user_mm_spec, LAST_USER_MM_INIT);
@@ -1074,8 +1137,10 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end)
 */
 unsigned long __get_current_cr3_fast(void)
 {
-	unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,
-				      this_cpu_read(cpu_tlbstate.loaded_mm_asid));
+	unsigned long cr3 =
		build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd,
			  this_cpu_read(cpu_tlbstate.loaded_mm_asid),
			  this_cpu_read(cpu_tlbstate.lam));

	/* For now, be very restrictive about when this can be called. */
	VM_WARN_ON(in_nmi() || preemptible());
-- 
2.35.1
JMpmyOWGVL63TY4vuKEUlXQp1l+7NwaABf32gGBFqbQlumjHeZxiGCTf2 X3Rj+9kKbskoB198DH7CHEpPJExkbk+lX7GpZQyKg7HO4o+FLXnHTvKCo g==; X-IronPort-AV: E=McAfee;i="6400,9594,10343"; a="355985770" X-IronPort-AV: E=Sophos;i="5.91,215,1647327600"; d="scan'208";a="355985770" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 May 2022 19:29:43 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.91,215,1647327600"; d="scan'208";a="636218429" Received: from black.fi.intel.com ([10.237.72.28]) by fmsmga004.fm.intel.com with ESMTP; 10 May 2022 19:29:40 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 3AAF1512; Wed, 11 May 2022 05:28:01 +0300 (EEST) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Andrey Ryabinin , Alexander Potapenko , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [RFCv2 05/10] x86/mm: Provide untagged_addr() helper Date: Wed, 11 May 2022 05:27:46 +0300 Message-Id: <20220511022751.65540-7-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220511022751.65540-1-kirill.shutemov@linux.intel.com> References: <20220511022751.65540-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Type: text/plain; charset="utf-8" The helper used by the core-mm to strip tag bits and get the address to the canonical shape. In only handles userspace addresses. For LAM, the address gets sanitized according to the thread features. Signed-off-by: Kirill A. 
Shutemov --- arch/x86/include/asm/page_32.h | 3 +++ arch/x86/include/asm/page_64.h | 20 ++++++++++++++++++++ 2 files changed, 23 insertions(+) diff --git a/arch/x86/include/asm/page_32.h b/arch/x86/include/asm/page_32.h index df42f8aa99e4..2d35059b90c1 100644 --- a/arch/x86/include/asm/page_32.h +++ b/arch/x86/include/asm/page_32.h @@ -15,6 +15,9 @@ extern unsigned long __phys_addr(unsigned long); #define __phys_addr_symbol(x) __phys_addr(x) #define __phys_reloc_hide(x) RELOC_HIDE((x), 0) =20 +#define untagged_addr(addr) (addr) +#define untagged_ptr(ptr) (ptr) + #ifdef CONFIG_FLATMEM #define pfn_valid(pfn) ((pfn) < max_mapnr) #endif /* CONFIG_FLATMEM */ diff --git a/arch/x86/include/asm/page_64.h b/arch/x86/include/asm/page_64.h index e9c86299b835..3a40c958b24a 100644 --- a/arch/x86/include/asm/page_64.h +++ b/arch/x86/include/asm/page_64.h @@ -7,6 +7,7 @@ #ifndef __ASSEMBLY__ #include #include +#include =20 /* duplicated to the one in bootmem.h */ extern unsigned long max_pfn; @@ -90,6 +91,25 @@ static __always_inline unsigned long task_size_max(void) } #endif /* CONFIG_X86_5LEVEL */ =20 +#define __untagged_addr(addr, n) \ + ((__force __typeof__(addr))sign_extend64((__force u64)(addr), n)) + +#define untagged_addr(addr) ({ \ + u64 __addr =3D (__force u64)(addr); \ + if (__addr >> 63 =3D=3D 0) { \ + if (current->thread.features & X86_THREAD_LAM_U57) \ + __addr &=3D __untagged_addr(__addr, 56); \ + else if (current->thread.features & X86_THREAD_LAM_U48) \ + __addr &=3D __untagged_addr(__addr, 47); \ + } \ + (__force __typeof__(addr))__addr; \ +}) + +#define untagged_ptr(ptr) ({ \ + u64 __ptrval =3D (__force u64)(ptr); \ + __ptrval =3D untagged_addr(__ptrval); \ + (__force __typeof__(*(ptr)) *)__ptrval; \ +}) #endif /* !__ASSEMBLY__ */ =20 #ifdef CONFIG_X86_VSYSCALL_EMULATION --=20 2.35.1 From nobody Fri May 8 05:17:26 2026 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from 
Shutemov" <kirill.shutemov@linux.intel.com>
To: Dave Hansen, Andy Lutomirski, Peter Zijlstra
Cc: x86@kernel.org, Andrey Ryabinin, Alexander Potapenko, Dmitry Vyukov, "H . J . Lu", Andi Kleen, Rick Edgecombe, linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [RFCv2 06/10] x86/uaccess: Remove tags from the address before checking
Date: Wed, 11 May 2022 05:27:47 +0300
Message-Id: <20220511022751.65540-8-kirill.shutemov@linux.intel.com>

The tags must not be included in the check of whether it is okay to access the userspace address. Strip the tags in access_ok().

get_user() and put_user() don't use access_ok(), but check access against TASK_SIZE directly in assembly. Strip tags before calling into the assembly helper.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/uaccess.h | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h
index f78e2b3501a1..0f5bf7db4ec9 100644
--- a/arch/x86/include/asm/uaccess.h
+++ b/arch/x86/include/asm/uaccess.h
@@ -40,7 +40,7 @@ static inline bool pagefault_disabled(void);
 #define access_ok(addr, size)					\
 ({								\
 	WARN_ON_IN_IRQ();					\
-	likely(__access_ok(addr, size));			\
+	likely(__access_ok(untagged_addr(addr), size));		\
 })
 
 #include
 
@@ -125,7 +125,12 @@ extern int __get_user_bad(void);
  * Return: zero on success, or -EFAULT on error.
  * On error, the variable @x is set to zero.
 */
-#define get_user(x,ptr) ({ might_fault(); do_get_user_call(get_user,x,ptr); })
+#define get_user(x,ptr)						\
+({									\
+	__typeof__(*(ptr)) __user *__ptr_clean = untagged_ptr(ptr);	\
+	might_fault();							\
+	do_get_user_call(get_user,x,__ptr_clean);			\
+})
 
 /**
  * __get_user - Get a simple variable from user space, with less checking.
@@ -222,7 +227,11 @@ extern void __put_user_nocheck_8(void);
  *
  * Return: zero on success, or -EFAULT on error.
  */
-#define put_user(x, ptr) ({ might_fault(); do_put_user_call(put_user,x,ptr); })
+#define put_user(x, ptr) ({						\
+	__typeof__(*(ptr)) __user *__ptr_clean = untagged_ptr(ptr);	\
+	might_fault();							\
+	do_put_user_call(put_user,x,__ptr_clean);			\
+})
 
 /**
  * __put_user - Write a simple value into user space, with less checking.
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 07/10] x86/mm: Handle tagged memory accesses from kernel threads
Date: Wed, 11 May 2022 05:27:48 +0300
Message-Id: <20220511022751.65540-9-kirill.shutemov@linux.intel.com>

When a kernel thread performs memory access on behalf of a process (as in async I/O, io_uring, etc.), it has to respect the tagging setup of the process, since user addresses can include tags.
Normally, LAM setup is per-thread and recorded in thread features, but for this use case the kernel also tracks LAM setup per-mm: mm->context.lam records the LAM mode that allows the most tag bits among the threads of the mm.

The info is used by switch_mm_irqs_off() to construct CR3 when the task is a kernel thread. Thread features of the kernel thread get updated according to mm->context.lam, which allows untagged_addr() to work correctly.

Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/mmu.h |  1 +
 arch/x86/mm/tlb.c          | 28 ++++++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h
index 5d7494631ea9..52f3749f14e8 100644
--- a/arch/x86/include/asm/mmu.h
+++ b/arch/x86/include/asm/mmu.h
@@ -40,6 +40,7 @@ typedef struct {
 
 #ifdef CONFIG_X86_64
 	unsigned short flags;
+	u8 lam;
 #endif
 
 	struct mutex lock;
diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index f9fe71d1f42c..b320556e1c22 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -185,6 +185,34 @@ static u8 gen_lam(struct task_struct *tsk, struct mm_struct *mm)
 	if (!tsk)
 		return LAM_NONE;
 
+	if (tsk->flags & PF_KTHREAD) {
+		/*
+		 * For a kernel thread use the most permissive LAM
+		 * used by the mm. It's required to handle kernel thread
+		 * memory accesses on behalf of a process.
+		 *
+		 * Adjust thread flags accordingly, so untagged_addr() would
+		 * work correctly.
+		 */
+
+		tsk->thread.features &= ~(X86_THREAD_LAM_U48 |
+					  X86_THREAD_LAM_U57);
+
+		switch (mm->context.lam) {
+		case LAM_NONE:
+			return LAM_NONE;
+		case LAM_U57:
+			tsk->thread.features |= X86_THREAD_LAM_U57;
+			return LAM_U57;
+		case LAM_U48:
+			tsk->thread.features |= X86_THREAD_LAM_U48;
+			return LAM_U48;
+		default:
+			WARN_ON_ONCE(1);
+			return LAM_NONE;
+		}
+	}
+
 	if (tsk->thread.features & X86_THREAD_LAM_U57)
 		return LAM_U57;
 	if (tsk->thread.features & X86_THREAD_LAM_U48)
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 08/10] x86/mm: Make LAM_U48 and mappings above 47-bits mutually exclusive
Date: Wed, 11 May 2022 05:27:49 +0300
Message-Id: <20220511022751.65540-10-kirill.shutemov@linux.intel.com>

LAM_U48 steals bits above 47-bit for tags and makes it impossible for userspace to use the full address space on a 5-level paging machine.

Make these features mutually exclusive: whichever gets enabled first blocks the other one.

Signed-off-by: Kirill A.
Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/include/asm/elf.h         |  3 ++-
 arch/x86/include/asm/mmu_context.h | 13 +++++++++++++
 arch/x86/kernel/sys_x86_64.c       |  5 +++--
 arch/x86/mm/hugetlbpage.c          |  6 ++++--
 arch/x86/mm/mmap.c                 |  9 ++++++++-
 5 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/elf.h b/arch/x86/include/asm/elf.h
index 29fea180a665..53b96b0c8cc3 100644
--- a/arch/x86/include/asm/elf.h
+++ b/arch/x86/include/asm/elf.h
@@ -328,7 +328,8 @@ static inline int mmap_is_ia32(void)
 extern unsigned long task_size_32bit(void);
 extern unsigned long task_size_64bit(int full_addr_space);
 extern unsigned long get_mmap_base(int is_legacy);
-extern bool mmap_address_hint_valid(unsigned long addr, unsigned long len);
+extern bool mmap_address_hint_valid(struct mm_struct *mm,
+				    unsigned long addr, unsigned long len);
 extern unsigned long get_sigframe_size(void);
 
 #ifdef CONFIG_X86_32
diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h
index 27516046117a..c8a6d80dfec3 100644
--- a/arch/x86/include/asm/mmu_context.h
+++ b/arch/x86/include/asm/mmu_context.h
@@ -218,6 +218,19 @@ static inline bool arch_vma_access_permitted(struct vm_area_struct *vma,
 
 unsigned long __get_current_cr3_fast(void);
 
+#ifdef CONFIG_X86_5LEVEL
+static inline bool full_va_allowed(struct mm_struct *mm)
+{
+	/* LAM_U48 steals VA bits above 47-bit for tags */
+	return mm->context.lam != LAM_U48;
+}
+#else
+static inline bool full_va_allowed(struct mm_struct *mm)
+{
+	return false;
+}
+#endif
+
 #include
 
 #endif /* _ASM_X86_MMU_CONTEXT_H */
diff --git a/arch/x86/kernel/sys_x86_64.c b/arch/x86/kernel/sys_x86_64.c
index 660b78827638..4526e8fadfd2 100644
--- a/arch/x86/kernel/sys_x86_64.c
+++ b/arch/x86/kernel/sys_x86_64.c
@@ -21,6 +21,7 @@
 
 #include
 #include
+#include
 
 /*
  * Align a virtual address to avoid aliasing in the I$ on AMD F15h.
@@ -185,7 +186,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	/* requesting a specific address */
 	if (addr) {
 		addr &= PAGE_MASK;
-		if (!mmap_address_hint_valid(addr, len))
+		if (!mmap_address_hint_valid(mm, addr, len))
 			goto get_unmapped_area;
 
 		vma = find_vma(mm, addr);
@@ -206,7 +207,7 @@ arch_get_unmapped_area_topdown(struct file *filp, const unsigned long addr0,
 	 * !in_32bit_syscall() check to avoid high addresses for x32
 	 * (and make it no op on native i386).
 	 */
-	if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall())
+	if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall() && full_va_allowed(mm))
 		info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
 
 	info.align_mask = 0;
diff --git a/arch/x86/mm/hugetlbpage.c b/arch/x86/mm/hugetlbpage.c
index a0d023cb4292..9fdc8db42365 100644
--- a/arch/x86/mm/hugetlbpage.c
+++ b/arch/x86/mm/hugetlbpage.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 
 #if 0	/* This is just for testing */
 struct page *
@@ -103,6 +104,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 		unsigned long pgoff, unsigned long flags)
 {
 	struct hstate *h = hstate_file(file);
+	struct mm_struct *mm = current->mm;
 	struct vm_unmapped_area_info info;
 
 	info.flags = VM_UNMAPPED_AREA_TOPDOWN;
@@ -114,7 +116,7 @@ static unsigned long hugetlb_get_unmapped_area_topdown(struct file *file,
 	 * If hint address is above DEFAULT_MAP_WINDOW, look for unmapped area
	 * in the full address space.
	 */
-	if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall())
+	if (addr > DEFAULT_MAP_WINDOW && !in_32bit_syscall() && full_va_allowed(mm))
 		info.high_limit += TASK_SIZE_MAX - DEFAULT_MAP_WINDOW;
 
 	info.align_mask = PAGE_MASK & ~huge_page_mask(h);
@@ -161,7 +163,7 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 
 	if (addr) {
 		addr &= huge_page_mask(h);
-		if (!mmap_address_hint_valid(addr, len))
+		if (!mmap_address_hint_valid(mm, addr, len))
 			goto get_unmapped_area;
 
 		vma = find_vma(mm, addr);
diff --git a/arch/x86/mm/mmap.c b/arch/x86/mm/mmap.c
index c90c20904a60..f9ca824729de 100644
--- a/arch/x86/mm/mmap.c
+++ b/arch/x86/mm/mmap.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 
 #include "physaddr.h"
 
@@ -35,6 +36,8 @@ unsigned long task_size_32bit(void)
 
 unsigned long task_size_64bit(int full_addr_space)
 {
+	if (!full_va_allowed(current->mm))
+		return DEFAULT_MAP_WINDOW;
 	return full_addr_space ? TASK_SIZE_MAX : DEFAULT_MAP_WINDOW;
 }
 
@@ -206,11 +209,15 @@ const char *arch_vma_name(struct vm_area_struct *vma)
 * the failure of such a fixed mapping request, so the restriction is not
 * applied.
 */
-bool mmap_address_hint_valid(unsigned long addr, unsigned long len)
+bool mmap_address_hint_valid(struct mm_struct *mm,
+			     unsigned long addr, unsigned long len)
 {
 	if (TASK_SIZE - len < addr)
 		return false;
 
+	if (addr + len > DEFAULT_MAP_WINDOW && !full_va_allowed(mm))
+		return false;
+
 	return (addr > DEFAULT_MAP_WINDOW) == (addr + len > DEFAULT_MAP_WINDOW);
 }
 
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 09/10] x86/mm: Add userspace API to enable Linear Address Masking
Date: Wed, 11 May 2022 05:27:50 +0300
Message-Id: <20220511022751.65540-11-kirill.shutemov@linux.intel.com>

Allow enabling Linear Address Masking via the ARCH_THREAD_FEATURE_ENABLE arch_prctl(2) option.

Signed-off-by: Kirill A.
Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/process.c    | 21 +++++++++++++++-
 arch/x86/kernel/process.h    |  2 ++
 arch/x86/kernel/process_64.c | 46 ++++++++++++++++++++++++++++++++++++
 3 files changed, 68 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index cb8fc28f2eae..911c24321312 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -46,6 +46,8 @@
 #include
 #include
 #include
+#include
+#include
 
 #include "process.h"
 
@@ -992,7 +994,9 @@ unsigned long __get_wchan(struct task_struct *p)
 static long thread_feature_prctl(struct task_struct *task, int option,
 				 unsigned long features)
 {
-	const unsigned long known_features = 0;
+	const unsigned long known_features =
+		X86_THREAD_LAM_U48 |
+		X86_THREAD_LAM_U57;
 
 	if (features & ~known_features)
 		return -EINVAL;
@@ -1013,8 +1017,23 @@ static long thread_feature_prctl(struct task_struct *task, int option,
 
 	/* Handle ARCH_THREAD_FEATURE_ENABLE */
 
+	if (features & (X86_THREAD_LAM_U48 | X86_THREAD_LAM_U57)) {
+		long ret;
+
+		/* LAM is only available in long mode */
+		if (in_32bit_syscall())
+			return -EINVAL;
+
+		ret = enable_lam(task, features);
+		if (ret)
+			return ret;
+	}
+
 	task->thread.features |= features;
 out:
+	/* Update CR3 to get LAM active */
+	switch_mm(task->mm, task->mm, task);
+
 	return task->thread.features;
 }
 
diff --git a/arch/x86/kernel/process.h b/arch/x86/kernel/process.h
index 76b547b83232..b8fa0e599c6e 100644
--- a/arch/x86/kernel/process.h
+++ b/arch/x86/kernel/process.h
@@ -4,6 +4,8 @@
 
 #include
 
+long enable_lam(struct task_struct *task, unsigned long features);
+
 void __switch_to_xtra(struct task_struct *prev_p, struct task_struct *next_p);
 
 /*
diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c
index e459253649be..a25c51da7005 100644
--- a/arch/x86/kernel/process_64.c
+++ b/arch/x86/kernel/process_64.c
@@ -729,6 +729,52 @@ void set_personality_ia32(bool x32)
 }
 EXPORT_SYMBOL_GPL(set_personality_ia32);
 
+static bool lam_u48_allowed(void)
+{
+	struct mm_struct *mm = current->mm;
+
+	if (!full_va_allowed(mm))
+		return true;
+
+	return find_vma(mm, DEFAULT_MAP_WINDOW) == NULL;
+}
+
+long enable_lam(struct task_struct *task, unsigned long features)
+{
+	features |= task->thread.features;
+
+	/* LAM_U48 and LAM_U57 are mutually exclusive */
+	if ((features & X86_THREAD_LAM_U48) && (features & X86_THREAD_LAM_U57))
+		return -EINVAL;
+
+	if (!cpu_feature_enabled(X86_FEATURE_LAM))
+		return -ENXIO;
+
+	if (mmap_write_lock_killable(task->mm))
+		return -EINTR;
+
+	if ((features & X86_THREAD_LAM_U48) && !lam_u48_allowed()) {
+		mmap_write_unlock(task->mm);
+		return -EINVAL;
+	}
+
+	/*
+	 * Record the most permissive (allowing the widest tags) LAM
+	 * mode in the mm context. It determines whether mappings above
+	 * 47-bit are allowed for the process.
+	 *
+	 * The mode is also used by a kernel thread when it does work
+	 * on behalf of the process (like async I/O, io_uring, etc.)
+	 */
+	if (features & X86_THREAD_LAM_U48)
+		current->mm->context.lam = LAM_U48;
+	else if (current->mm->context.lam == LAM_NONE)
+		current->mm->context.lam = LAM_U57;
+
+	mmap_write_unlock(task->mm);
+	return 0;
+}
+
 #ifdef CONFIG_CHECKPOINT_RESTORE
 static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr)
 {
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [RFCv2 10/10] x86: Expose thread features status in /proc/$PID/arch_status
Date: Wed, 11 May 2022 05:27:51 +0300
Message-Id: <20220511022751.65540-12-kirill.shutemov@linux.intel.com>

Add two lines to /proc/$PID/arch_status to report enabled and locked features.

Signed-off-by: Kirill A.
Shutemov <kirill.shutemov@linux.intel.com>
---
 arch/x86/kernel/Makefile     |  2 ++
 arch/x86/kernel/fpu/xstate.c | 47 ---------------------------
 arch/x86/kernel/proc.c       | 63 ++++++++++++++++++++++++++++++++++++
 3 files changed, 65 insertions(+), 47 deletions(-)
 create mode 100644 arch/x86/kernel/proc.c

diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index c41ef42adbe8..19dae7a4201b 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -149,6 +149,8 @@ obj-$(CONFIG_UNWINDER_GUESS)	+= unwind_guess.o
 
 obj-$(CONFIG_AMD_MEM_ENCRYPT)	+= sev.o
 
+obj-$(CONFIG_PROC_FS)		+= proc.o
+
 ###
 # 64 bit specific files
 ifeq ($(CONFIG_X86_64),y)
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 39e1c8626ab9..789a7a1429df 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -10,8 +10,6 @@
 #include
 #include
 #include
-#include
-#include
 #include
 
 #include
@@ -1730,48 +1728,3 @@ long fpu_xstate_prctl(struct task_struct *tsk, int option, unsigned long arg2)
 		return -EINVAL;
 	}
 }
-
-#ifdef CONFIG_PROC_PID_ARCH_STATUS
-/*
- * Report the amount of time elapsed in millisecond since last AVX512
- * use in the task.
- */
-static void avx512_status(struct seq_file *m, struct task_struct *task)
-{
-	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
-	long delta;
-
-	if (!timestamp) {
-		/*
-		 * Report -1 if no AVX512 usage
-		 */
-		delta = -1;
-	} else {
-		delta = (long)(jiffies - timestamp);
-		/*
-		 * Cap to LONG_MAX if time difference > LONG_MAX
-		 */
-		if (delta < 0)
-			delta = LONG_MAX;
-		delta = jiffies_to_msecs(delta);
-	}
-
-	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
-	seq_putc(m, '\n');
-}
-
-/*
- * Report architecture specific information
- */
-int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
-			 struct pid *pid, struct task_struct *task)
-{
-	/*
-	 * Report AVX512 state if the processor and build option supported.
-	 */
-	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
-		avx512_status(m, task);
-
-	return 0;
-}
-#endif /* CONFIG_PROC_PID_ARCH_STATUS */
diff --git a/arch/x86/kernel/proc.c b/arch/x86/kernel/proc.c
new file mode 100644
index 000000000000..7b2f39031d8a
--- /dev/null
+++ b/arch/x86/kernel/proc.c
@@ -0,0 +1,63 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+#include
+
+/*
+ * Report the amount of time elapsed in millisecond since last AVX512
+ * use in the task.
+ */
+static void avx512_status(struct seq_file *m, struct task_struct *task)
+{
+	unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp);
+	long delta;
+
+	if (!timestamp) {
+		/*
+		 * Report -1 if no AVX512 usage
+		 */
+		delta = -1;
+	} else {
+		delta = (long)(jiffies - timestamp);
+		/*
+		 * Cap to LONG_MAX if time difference > LONG_MAX
+		 */
+		if (delta < 0)
+			delta = LONG_MAX;
+		delta = jiffies_to_msecs(delta);
+	}
+
+	seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta);
+	seq_putc(m, '\n');
+}
+
+static void dump_features(struct seq_file *m, unsigned long features)
+{
+	if (features & X86_THREAD_LAM_U48)
+		seq_puts(m, "lam_u48 ");
+	if (features & X86_THREAD_LAM_U57)
+		seq_puts(m, "lam_u57 ");
+}
+
+/*
+ * Report architecture specific information
+ */
+int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns,
+			 struct pid *pid, struct task_struct *task)
+{
+	/*
+	 * Report AVX512 state if the processor and build option supported.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_AVX512F))
+		avx512_status(m, task);
+
+	seq_puts(m, "Thread_features:\t");
+	dump_features(m, task->thread.features);
+	seq_putc(m, '\n');
+
+	seq_puts(m, "Thread_features_locked:\t");
+	dump_features(m, task->thread.features_locked);
+	seq_putc(m, '\n');
+
+	return 0;
+}
-- 
2.35.1
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Subject: [PATCH] x86: Implement Linear Address Masking support
Date: Wed, 11 May 2022 05:27:41 +0300
Message-Id: <20220511022751.65540-2-kirill.shutemov@linux.intel.com>

The Linear Address Masking feature makes the CPU ignore some bits of the virtual address. These bits can be used to encode metadata.

The feature is enumerated with CPUID.(EAX=07H, ECX=01H):EAX.LAM[bit 26].

CR3.LAM_U57[bit 62] allows encoding 6 bits of metadata in bits 62:57 of user pointers. CR3.LAM_U48[bit 61] allows encoding 15 bits of metadata in bits 62:48 of user pointers.

CR4.LAM_SUP[bit 28] allows encoding metadata in supervisor pointers. If 5-level paging is in use, 6 bits of metadata can be encoded in bits 62:57. For 4-level paging, 15 bits of metadata can be encoded in bits 62:48.

QEMU strips the metadata bits from the address and brings it to canonical shape before handling memory access. It has to be done very early, before the TLB lookup.

Signed-off-by: Kirill A.
Shutemov --- accel/tcg/cputlb.c | 20 +++++++++++++++++--- include/hw/core/tcg-cpu-ops.h | 5 +++++ target/i386/cpu.c | 4 ++-- target/i386/cpu.h | 26 +++++++++++++++++++++++++- target/i386/helper.c | 2 +- target/i386/tcg/helper-tcg.h | 1 + target/i386/tcg/sysemu/excp_helper.c | 28 +++++++++++++++++++++++++++- target/i386/tcg/sysemu/misc_helper.c | 3 +-- target/i386/tcg/sysemu/svm_helper.c | 3 +-- target/i386/tcg/tcg-cpu.c | 1 + 10 files changed, 81 insertions(+), 12 deletions(-) diff --git a/accel/tcg/cputlb.c b/accel/tcg/cputlb.c index 2035b2ac0ac0..15eff0df39c1 100644 --- a/accel/tcg/cputlb.c +++ b/accel/tcg/cputlb.c @@ -1295,6 +1295,17 @@ static inline ram_addr_t qemu_ram_addr_from_host_nof= ail(void *ptr) return ram_addr; } =20 +static vaddr clean_addr(CPUArchState *env, vaddr addr) +{ + CPUClass *cc =3D CPU_GET_CLASS(env_cpu(env)); + + if (cc->tcg_ops->do_clean_addr) { + addr =3D cc->tcg_ops->do_clean_addr(env_cpu(env), addr); + } + + return addr; +} + /* * Note: tlb_fill() can trigger a resize of the TLB. This means that all o= f the * caller's prior references to the TLB table (e.g. CPUTLBEntry pointers) = must @@ -1757,10 +1768,11 @@ bool tlb_plugin_lookup(CPUState *cpu, target_ulong = addr, int mmu_idx, * * @prot may be PAGE_READ, PAGE_WRITE, or PAGE_READ|PAGE_WRITE. 
 */
-static void *atomic_mmu_lookup(CPUArchState *env, target_ulong addr,
+static void *atomic_mmu_lookup(CPUArchState *env, target_ulong address,
                                MemOpIdx oi, int size, int prot,
                                uintptr_t retaddr)
 {
+    target_ulong addr = clean_addr(env, address);
     size_t mmu_idx = get_mmuidx(oi);
     MemOp mop = get_memop(oi);
     int a_bits = get_alignment_bits(mop);
@@ -1904,10 +1916,11 @@ load_memop(const void *haddr, MemOp op)
 }
 
 static inline uint64_t QEMU_ALWAYS_INLINE
-load_helper(CPUArchState *env, target_ulong addr, MemOpIdx oi,
+load_helper(CPUArchState *env, target_ulong address, MemOpIdx oi,
             uintptr_t retaddr, MemOp op, bool code_read,
             FullLoadHelper *full_load)
 {
+    target_ulong addr = clean_addr(env, address);
     uintptr_t mmu_idx = get_mmuidx(oi);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
@@ -2307,9 +2320,10 @@ store_helper_unaligned(CPUArchState *env, target_ulong addr, uint64_t val,
 }
 
 static inline void QEMU_ALWAYS_INLINE
-store_helper(CPUArchState *env, target_ulong addr, uint64_t val,
+store_helper(CPUArchState *env, target_ulong address, uint64_t val,
              MemOpIdx oi, uintptr_t retaddr, MemOp op)
 {
+    target_ulong addr = clean_addr(env, address);
     uintptr_t mmu_idx = get_mmuidx(oi);
     uintptr_t index = tlb_index(env, mmu_idx, addr);
     CPUTLBEntry *entry = tlb_entry(env, mmu_idx, addr);
diff --git a/include/hw/core/tcg-cpu-ops.h b/include/hw/core/tcg-cpu-ops.h
index e13898553aff..8e81f45510bf 100644
--- a/include/hw/core/tcg-cpu-ops.h
+++ b/include/hw/core/tcg-cpu-ops.h
@@ -82,6 +82,11 @@ struct TCGCPUOps {
                                MMUAccessType access_type,
                                int mmu_idx, uintptr_t retaddr) QEMU_NORETURN;
 
+    /**
+     * @do_clean_addr: Callback for clearing metadata/tags from the address.
+     */
+    vaddr (*do_clean_addr)(CPUState *cpu, vaddr addr);
+
     /**
      * @adjust_watchpoint_address: hack for cpu_check_watchpoint used by ARM
      */
diff --git a/target/i386/cpu.c b/target/i386/cpu.c
index cb6b5467d067..6e3e8473bf04 100644
--- a/target/i386/cpu.c
+++ b/target/i386/cpu.c
@@ -662,7 +662,7 @@ void x86_cpu_vendor_words2str(char *dst, uint32_t vendor1,
           /* CPUID_7_0_ECX_OSPKE is dynamic */ \
           CPUID_7_0_ECX_LA57 | CPUID_7_0_ECX_PKS)
 #define TCG_7_0_EDX_FEATURES 0
-#define TCG_7_1_EAX_FEATURES 0
+#define TCG_7_1_EAX_FEATURES CPUID_7_1_EAX_LAM
 #define TCG_APM_FEATURES 0
 #define TCG_6_EAX_FEATURES CPUID_6_EAX_ARAT
 #define TCG_XSAVE_FEATURES (CPUID_XSAVE_XSAVEOPT | CPUID_XSAVE_XGETBV1)
@@ -876,7 +876,7 @@ FeatureWordInfo feature_word_info[FEATURE_WORDS] = {
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
             NULL, NULL, NULL, NULL,
-            NULL, NULL, NULL, NULL,
+            NULL, NULL, "lam", NULL,
             NULL, NULL, NULL, NULL,
         },
         .cpuid = {
diff --git a/target/i386/cpu.h b/target/i386/cpu.h
index 982c5323537c..5d6cc8efb7da 100644
--- a/target/i386/cpu.h
+++ b/target/i386/cpu.h
@@ -232,6 +232,9 @@ typedef enum X86Seg {
 #define CR0_CD_MASK (1U << 30)
 #define CR0_PG_MASK (1U << 31)
 
+#define CR3_LAM_U48 (1ULL << 61)
+#define CR3_LAM_U57 (1ULL << 62)
+
 #define CR4_VME_MASK   (1U << 0)
 #define CR4_PVI_MASK   (1U << 1)
 #define CR4_TSD_MASK   (1U << 2)
@@ -255,6 +258,7 @@ typedef enum X86Seg {
 #define CR4_SMAP_MASK   (1U << 21)
 #define CR4_PKE_MASK    (1U << 22)
 #define CR4_PKS_MASK    (1U << 24)
+#define CR4_LAM_SUP     (1U << 28)
 
 #define CR4_RESERVED_MASK \
 (~(target_ulong)(CR4_VME_MASK | CR4_PVI_MASK | CR4_TSD_MASK \
@@ -263,7 +267,8 @@ typedef enum X86Seg {
                 | CR4_OSFXSR_MASK | CR4_OSXMMEXCPT_MASK | CR4_UMIP_MASK \
                 | CR4_LA57_MASK \
                 | CR4_FSGSBASE_MASK | CR4_PCIDE_MASK | CR4_OSXSAVE_MASK \
-                | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK))
+                | CR4_SMEP_MASK | CR4_SMAP_MASK | CR4_PKE_MASK | CR4_PKS_MASK \
+                | CR4_LAM_SUP))
 
 #define DR6_BD          (1 << 13)
 #define DR6_BS          (1 << 14)
@@ -877,6 +882,8 @@ uint64_t x86_cpu_get_supported_feature_word(FeatureWord w,
 #define CPUID_7_1_EAX_AVX_VNNI          (1U << 4)
 /* AVX512 BFloat16 Instruction */
 #define CPUID_7_1_EAX_AVX512_BF16       (1U << 5)
+/* Linear Address Masking */
+#define CPUID_7_1_EAX_LAM               (1U << 26)
 /* XFD Extend Feature Disabled */
 #define CPUID_D_1_EAX_XFD               (1U << 4)
 
@@ -2287,6 +2294,23 @@ static inline bool hyperv_feat_enabled(X86CPU *cpu, int feat)
     return !!(cpu->hyperv_features & BIT(feat));
 }
 
+static inline uint64_t cr3_reserved_bits(CPUX86State *env)
+{
+    uint64_t reserved_bits;
+
+    if (!(env->efer & MSR_EFER_LMA)) {
+        return 0;
+    }
+
+    reserved_bits = (~0ULL) << env_archcpu(env)->phys_bits;
+
+    if (env->features[FEAT_7_1_EAX] & CPUID_7_1_EAX_LAM) {
+        reserved_bits &= ~(CR3_LAM_U48 | CR3_LAM_U57);
+    }
+
+    return reserved_bits;
+}
+
 static inline uint64_t cr4_reserved_bits(CPUX86State *env)
 {
     uint64_t reserved_bits = CR4_RESERVED_MASK;
diff --git a/target/i386/helper.c b/target/i386/helper.c
index fa409e9c44a8..f91ebab840d6 100644
--- a/target/i386/helper.c
+++ b/target/i386/helper.c
@@ -247,7 +247,7 @@ hwaddr x86_cpu_get_phys_page_attrs_debug(CPUState *cs, vaddr addr,
     }
 
     if (la57) {
-        pml5e_addr = ((env->cr[3] & ~0xfff) +
+        pml5e_addr = ((env->cr[3] & PG_ADDRESS_MASK) +
                       (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
         pml5e = x86_ldq_phys(cs, pml5e_addr);
         if (!(pml5e & PG_PRESENT_MASK)) {
diff --git a/target/i386/tcg/helper-tcg.h b/target/i386/tcg/helper-tcg.h
index 0a4401e917f9..03ab858598d2 100644
--- a/target/i386/tcg/helper-tcg.h
+++ b/target/i386/tcg/helper-tcg.h
@@ -51,6 +51,7 @@ void x86_cpu_record_sigsegv(CPUState *cs, vaddr addr,
 bool x86_cpu_tlb_fill(CPUState *cs, vaddr address, int size,
                       MMUAccessType access_type, int mmu_idx,
                       bool probe, uintptr_t retaddr);
+vaddr x86_cpu_clean_addr(CPUState *cpu, vaddr addr);
 #endif
 
 void breakpoint_handler(CPUState *cs);
diff --git a/target/i386/tcg/sysemu/excp_helper.c b/target/i386/tcg/sysemu/excp_helper.c
index e1b6d8868338..caaab413381b 100644
--- a/target/i386/tcg/sysemu/excp_helper.c
+++ b/target/i386/tcg/sysemu/excp_helper.c
@@ -64,7 +64,7 @@ static int mmu_translate(CPUState *cs, hwaddr addr, MMUTranslateFunc get_hphys_f
         uint64_t pml4e_addr, pml4e;
 
         if (la57) {
-            pml5e_addr = ((cr3 & ~0xfff) +
+            pml5e_addr = ((cr3 & PG_ADDRESS_MASK) +
                           (((addr >> 48) & 0x1ff) << 3)) & a20_mask;
             pml5e_addr = GET_HPHYS(cs, pml5e_addr, MMU_DATA_STORE, NULL);
             pml5e = x86_ldq_phys(cs, pml5e_addr);
@@ -437,3 +437,29 @@ bool x86_cpu_tlb_fill(CPUState *cs, vaddr addr, int size,
     }
     return true;
 }
+
+static inline int64_t sign_extend64(uint64_t value, int index)
+{
+    int shift = 63 - index;
+    return (int64_t)(value << shift) >> shift;
+}
+
+vaddr x86_cpu_clean_addr(CPUState *cs, vaddr addr)
+{
+    CPUX86State *env = &X86_CPU(cs)->env;
+    bool la57 = env->cr[4] & CR4_LA57_MASK;
+
+    if (addr >> 63) {
+        if (env->cr[4] & CR4_LAM_SUP) {
+            return sign_extend64(addr, la57 ? 56 : 47);
+        }
+    } else {
+        if (env->cr[3] & CR3_LAM_U57) {
+            return sign_extend64(addr, 56);
+        } else if (env->cr[3] & CR3_LAM_U48) {
+            return sign_extend64(addr, 47);
+        }
+    }
+
+    return addr;
+}
diff --git a/target/i386/tcg/sysemu/misc_helper.c b/target/i386/tcg/sysemu/misc_helper.c
index 3715c1e2625b..faeb4a16383c 100644
--- a/target/i386/tcg/sysemu/misc_helper.c
+++ b/target/i386/tcg/sysemu/misc_helper.c
@@ -97,8 +97,7 @@ void helper_write_crN(CPUX86State *env, int reg, target_ulong t0)
         cpu_x86_update_cr0(env, t0);
         break;
     case 3:
-        if ((env->efer & MSR_EFER_LMA) &&
-            (t0 & ((~0ULL) << env_archcpu(env)->phys_bits))) {
+        if (t0 & cr3_reserved_bits(env)) {
             cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
         }
         if (!(env->efer & MSR_EFER_LMA)) {
diff --git a/target/i386/tcg/sysemu/svm_helper.c b/target/i386/tcg/sysemu/svm_helper.c
index 2b6f450af959..cbd99f240bb8 100644
--- a/target/i386/tcg/sysemu/svm_helper.c
+++ b/target/i386/tcg/sysemu/svm_helper.c
@@ -287,8 +287,7 @@ void helper_vmrun(CPUX86State *env, int aflag, int next_eip_addend)
         cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
     }
     new_cr3 = x86_ldq_phys(cs, env->vm_vmcb + offsetof(struct vmcb, save.cr3));
-    if ((env->efer & MSR_EFER_LMA) &&
-        (new_cr3 & ((~0ULL) << cpu->phys_bits))) {
+    if (new_cr3 & cr3_reserved_bits(env)) {
         cpu_vmexit(env, SVM_EXIT_ERR, 0, GETPC());
     }
     new_cr4 = x86_ldq_phys(cs, env->vm_vmcb + offsetof(struct vmcb, save.cr4));
diff --git a/target/i386/tcg/tcg-cpu.c b/target/i386/tcg/tcg-cpu.c
index 6fdfdf959899..754454d19041 100644
--- a/target/i386/tcg/tcg-cpu.c
+++ b/target/i386/tcg/tcg-cpu.c
@@ -77,6 +77,7 @@ static const struct TCGCPUOps x86_tcg_ops = {
     .record_sigsegv = x86_cpu_record_sigsegv,
 #else
     .tlb_fill = x86_cpu_tlb_fill,
+    .do_clean_addr = x86_cpu_clean_addr,
    .do_interrupt = x86_cpu_do_interrupt,
    .cpu_exec_interrupt = x86_cpu_exec_interrupt,
    .debug_excp_handler = breakpoint_handler,
-- 
2.35.1