From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A0504A2B60 for ; Tue, 6 May 2025 00:48:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492495; cv=none; b=CfuGXHYd4J9AKmmu9MeBYuTCoMrlnBccsiBvppNq7vaJoBpzCAd1VKCHk29J3hDyKCK4bND1BfTgXlNdIYgvME8jdbKwqFJrRdhuzNfZyQpjVMvXJYmZSbwDkddozT2dgwKYJOJQb0HKRtYRLbwsrqHovM1fLDoEDlF4gzvf4QE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492495; c=relaxed/simple; bh=+wIbL1X55MSJzHqFS9CRNQXdCUM1VY3bZFOCVZqzNQA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qMz/AiI24g9XxML/t05WpiYlsKSh8B2hxX3HZeWsvHmgixqih0s3BQwYV66DY2kA9i2JGEGOnMni1DbeI+Au7jWai6Ts3HDBnQDBVpVD7ZkEAxg4OaBQJHEoJOzLW9zmMiLGTf97T7+EujC5E3bGlmUOLzWFY+UkdnP3bUPr+eA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3CQ2; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Yu-cheng Yu , Rik van Riel Subject: [RFC PATCH 
1/9] x86/mm: Introduce MSR_IA32_CORE_CAPABILITIES Date: Mon, 5 May 2025 20:37:39 -0400 Message-ID: <20250506003811.92405-2-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Yu-cheng Yu MSR_IA32_CORE_CAPABILITIES indicates the existence of other MSRs. Bit[1] indicates Remote Action Request (RAR) TLB registers. Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/include/asm/msr-index.h | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index ac21dc19dde2..0828b891fe2e 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -212,6 +212,12 @@ * File. */ =20 +#define MSR_IA32_CORE_CAPABILITIES 0x000000cf +#define CORE_CAP_RAR BIT(1) /* + * Remote Action Request. Used to directly + * flush the TLB on remote CPUs. 
+ */ + #define MSR_IA32_FLUSH_CMD 0x0000010b #define L1D_FLUSH BIT(0) /* * Writeback and invalidate the --=20 2.49.0 From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B26064A5A25 for ; Tue, 6 May 2025 00:48:57 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492539; cv=none; b=EdKj6q1ES6KfdLEWCUPbSGgmtaRqghRmcp9Yq+CbqWRphEfTqPCZ1W0EbOmtx0TlJ3PWiJ424cw6nW5CGmsjZTimDOqiWn9KNC5XUppSJ9lAkySmajZeKEgfkdCt7A2eVbUkn3BY6geOhy8S+A5XPiWRC2fuYNVMXFaI9fKRTrY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492539; c=relaxed/simple; bh=/kAYR5itm1l4OmW5bmbutIOLpbCH9d8o5NyuhovS5Hc=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=r8sfjqTLIfQ84NBeTq7oGQWQ0RwRu3QVXl//GSNq44n62C5VKOaCnxiOs9pRGIz/kMkVdW3f1E0/5QcpSFYJG6ZO/SFL1cYBcL+aln/Ey9KaRvIehQHUSTPo+IDD5LdRTHR0LMbpq2ZhlnIiiF0+mdnQ5haa7BdHxdNxqqMaGig= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3INh; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, 
peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Yu-cheng Yu , Rik van Riel Subject: [RFC PATCH 2/9] x86/mm: Introduce Remote Action Request MSRs Date: Mon, 5 May 2025 20:37:40 -0400 Message-ID: <20250506003811.92405-3-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Yu-cheng Yu Remote Action Request (RAR) is a TLB flushing broadcast facility. This patch introduces RAR MSRs. RAR is introduced in later patches. There are five RAR MSRs: MSR_CORE_CAPABILITIES MSR_IA32_RAR_CTRL MSR_IA32_RAR_ACT_VEC MSR_IA32_RAR_PAYLOAD_BASE MSR_IA32_RAR_INFO Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/include/asm/msr-index.h | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-in= dex.h index 0828b891fe2e..923e17462712 100644 --- a/arch/x86/include/asm/msr-index.h +++ b/arch/x86/include/asm/msr-index.h @@ -122,6 +122,17 @@ #define SNB_C3_AUTO_UNDEMOTE (1UL << 27) #define SNB_C1_AUTO_UNDEMOTE (1UL << 28) =20 +/* + * Remote Action Requests (RAR) MSRs + */ +#define MSR_IA32_RAR_CTRL 0x000000ed +#define MSR_IA32_RAR_ACT_VEC 0x000000ee +#define MSR_IA32_RAR_PAYLOAD_BASE 0x000000ef +#define MSR_IA32_RAR_INFO 0x000000f0 + +#define RAR_CTRL_ENABLE BIT(31) +#define RAR_CTRL_IGNORE_IF BIT(30) + #define MSR_MTRRcap 0x000000fe =20 #define MSR_IA32_ARCH_CAPABILITIES 0x0000010a --=20 2.49.0 From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by 
smtp.subspace.kernel.org (Postfix) with ESMTPS id C2A5829A9FB for ; Tue, 6 May 2025 00:47:51 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492474; cv=none; b=ZGeZ/N53Qcwk1cABJkcLOQ/LHKfBmrbCtvuSihay/tE+zJ5/NgpEDl9LQAbPA0t3xxGCa+wztrtmi06IKwL4e83Jgy0kgFhXA8Dx8uxhM+kqgBWVA8N6PiPT1org6s7UrTS+g+P8ak8863PrnYQn0TFYH/Xpem9eNDqmruLz1Ew= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492474; c=relaxed/simple; bh=71WvzsdPoyrjXq8l+mrusg+sx3T4Ajw8dHlsaDfvdQE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=kysyd0laAvCaC6bJ2IFl90e6n+Kx1UiplFCKS97GiCNtkZrc6haboH6nIrbQCk76o1v1n8DmG+h+HeUynj7qaJr/WasUqC/jaQ3i5Wmf26490ozXRSDZk4Z2oNCbXP5boNgdZ2pT7kni8qfy9UFTaW2U5muALrxZvQ2agPJrgNM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3NvK; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Rik van Riel , Rik van Riel Subject: [RFC PATCH 3/9] x86/mm: enable BROADCAST_TLB_FLUSH on Intel, too Date: Mon, 5 May 2025 20:37:41 -0400 Message-ID: <20250506003811.92405-4-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: 
<20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Rik van Riel Much of the code for Intel RAR and AMD INVLPGB is shared. Place both under the same config option. Signed-off-by: Rik van Riel --- arch/x86/Kconfig.cpu | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu index f928cf6e3252..f9cdd145abba 100644 --- a/arch/x86/Kconfig.cpu +++ b/arch/x86/Kconfig.cpu @@ -360,7 +360,7 @@ menuconfig PROCESSOR_SELECT =20 config BROADCAST_TLB_FLUSH def_bool y - depends on CPU_SUP_AMD && 64BIT + depends on (CPU_SUP_AMD || CPU_SUP_INTEL) && 64BIT =20 config CPU_SUP_INTEL default y --=20 2.49.0 From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14C9E49FA33 for ; Tue, 6 May 2025 00:47:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492458; cv=none; b=s+Clr+Cx7UL5PCq8ab89NKoSjX3rbTVA48fqesuGM31F+SQg+h+jEYi3DFjOL3MULZJyZtn6YVmsXmWzTk5lUDEh3oOLGrbxwjlkbbirwK9r157tYnDzOKGyxbhbxVaPkaS23Kud4LSNKIut8gQKmDwNRNuW+ZtfaJjbb72pLGc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492458; c=relaxed/simple; bh=+pC0MyfYnCdcvbMCgpYsaCP5hIslJP7GASUOkIvSzuw=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=NMuoydgiHxFNnpvo5g6M8fsS2KHgZpMffHNFiuJdYuPGwg3JZmV59sfoazXHECy0YKWEJBNq1W2/ttosWYuVcL9HWnOBrFeynx9qLzIhVIN7IBrDyTQZziJBXupmHHn29NYFvJAyN5doPAwc/geWDlV6HkJhhPf1o8DhiNT5+FQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3UXR; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Rik van Riel , Yu-cheng Yu , Rik van Riel Subject: [RFC PATCH 4/9] x86/mm: Introduce X86_FEATURE_RAR Date: Mon, 5 May 2025 20:37:42 -0400 Message-ID: <20250506003811.92405-5-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Rik van Riel Introduce X86_FEATURE_RAR and enumeration of the feature. 
[riel: move disabling to Kconfig.cpufeatures] Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/Kconfig.cpufeatures | 4 ++++ arch/x86/include/asm/cpufeatures.h | 2 +- arch/x86/kernel/cpu/common.c | 13 +++++++++++++ 3 files changed, 18 insertions(+), 1 deletion(-) diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures index e12d5b7e39a2..60042f8c2837 100644 --- a/arch/x86/Kconfig.cpufeatures +++ b/arch/x86/Kconfig.cpufeatures @@ -199,3 +199,7 @@ config X86_DISABLED_FEATURE_SEV_SNP config X86_DISABLED_FEATURE_INVLPGB def_bool y depends on !BROADCAST_TLB_FLUSH + +config X86_DISABLED_FEATURE_RAR + def_bool y + depends on !BROADCAST_TLB_FLUSH diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpuf= eatures.h index 7642310276a8..06732c872998 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -75,7 +75,7 @@ #define X86_FEATURE_CENTAUR_MCR ( 3*32+ 3) /* "centaur_mcr" Centaur MCRs = (=3D MTRRs) */ #define X86_FEATURE_K8 ( 3*32+ 4) /* Opteron, Athlon64 */ #define X86_FEATURE_ZEN5 ( 3*32+ 5) /* CPU based on Zen5 microarchitectur= e */ -/* Free ( 3*32+ 6) */ +#define X86_FEATURE_RAR ( 3*32+ 6) /* Intel Remote Action Request */ /* Free ( 3*32+ 7) */ #define X86_FEATURE_CONSTANT_TSC ( 3*32+ 8) /* "constant_tsc" TSC ticks at= a constant rate */ #define X86_FEATURE_UP ( 3*32+ 9) /* "up" SMP kernel running on UP */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index b73e09315413..5666620e7153 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -1502,6 +1502,18 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x= 86 *c) setup_force_cpu_bug(X86_BUG_L1TF); } =20 +static void __init detect_rar(struct cpuinfo_x86 *c) +{ + u64 msr; + + if (cpu_has(c, X86_FEATURE_CORE_CAPABILITIES)) { + rdmsrl(MSR_IA32_CORE_CAPABILITIES, msr); + + if (msr & CORE_CAP_RAR) + setup_force_cpu_cap(X86_FEATURE_RAR); + } +} + /* * The NOPL instruction is 
supposed to exist on all CPUs of family >=3D 6; * unfortunately, that's not true in practice because of early VIA @@ -1728,6 +1740,7 @@ static void __init early_identify_cpu(struct cpuinfo_= x86 *c) setup_clear_cpu_cap(X86_FEATURE_LA57); =20 detect_nopl(); + detect_rar(c); } =20 void __init init_cpu_devs(void) --=20 2.49.0 From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 44EA034AA88 for ; Tue, 6 May 2025 00:49:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492569; cv=none; b=eTZa1kPymNvJZFoQA/grmGAk7RWiIT2xCb+lBd6+9wUgOyWw3BotGd6nqHCErcAhhEjqj32rw0SLwiaAL8/EbjukK0OvskvAGEx5LAOy1ScfIdZoflrxCm91IlPNLKGYkyEzoNOEZ+2PLOtajOmIaAzJklk1+J6S+aHO4/fpRDI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492569; c=relaxed/simple; bh=NPJNsGlptd0n6PNy0NgeClvR6xPjLXuDITIWdR/228E=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qlUTWgwNATecdBafL++B631n5ocOb/FjbRO4ihxUDNuz+j3TLosqhDfAOW4V+oeJOQH+CAjG1/iI+T1L7vSbrZUAJSs7e9C+WF+hAOjoVr7ID0pHLfpfkhyDFSSn52VTuSRR79nHdZfhkveff9uKlEN619NEael/8xkLbZAinf8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3aS7; 
Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Rik van Riel , Yu-cheng Yu , Rik van Riel Subject: [RFC PATCH 5/9] x86/mm: Change cpa_flush() to call flush_kernel_range() directly Date: Mon, 5 May 2025 20:37:43 -0400 Message-ID: <20250506003811.92405-6-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Rik van Riel The function cpa_flush() calls __flush_tlb_one_kernel() and flush_tlb_all(). Replacing that with a call to flush_tlb_kernel_range() allows cpa_flush() to make use of INVLPGB or RAR without any additional changes. Initialize invlpgb_count_max to 1, since flush_tlb_kernel_range() can now be called before invlpgb_count_max has been initialized to the value read from CPUID. 
[riel: remove now unused __cpa_flush_tlb] Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/kernel/cpu/amd.c | 2 +- arch/x86/mm/pat/set_memory.c | 20 +++++++------------- 2 files changed, 8 insertions(+), 14 deletions(-) diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c index 13a48ec28f32..c85ecde786f3 100644 --- a/arch/x86/kernel/cpu/amd.c +++ b/arch/x86/kernel/cpu/amd.c @@ -31,7 +31,7 @@ =20 #include "cpu.h" =20 -u16 invlpgb_count_max __ro_after_init; +u16 invlpgb_count_max __ro_after_init =3D 1; =20 static inline int rdmsrq_amd_safe(unsigned msr, u64 *p) { diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c index 30ab4aced761..2454f5249329 100644 --- a/arch/x86/mm/pat/set_memory.c +++ b/arch/x86/mm/pat/set_memory.c @@ -399,15 +399,6 @@ static void cpa_flush_all(unsigned long cache) on_each_cpu(__cpa_flush_all, (void *) cache, 1); } =20 -static void __cpa_flush_tlb(void *data) -{ - struct cpa_data *cpa =3D data; - unsigned int i; - - for (i =3D 0; i < cpa->numpages; i++) - flush_tlb_one_kernel(fix_addr(__cpa_addr(cpa, i))); -} - static int collapse_large_pages(unsigned long addr, struct list_head *pgta= bles); =20 static void cpa_collapse_large_pages(struct cpa_data *cpa) @@ -444,6 +435,7 @@ static void cpa_collapse_large_pages(struct cpa_data *c= pa) =20 static void cpa_flush(struct cpa_data *cpa, int cache) { + unsigned long start, end; unsigned int i; =20 BUG_ON(irqs_disabled() && !early_boot_irqs_disabled); @@ -453,10 +445,12 @@ static void cpa_flush(struct cpa_data *cpa, int cache) goto collapse_large_pages; } =20 - if (cpa->force_flush_all || cpa->numpages > tlb_single_page_flush_ceiling) - flush_tlb_all(); - else - on_each_cpu(__cpa_flush_tlb, cpa, 1); + start =3D fix_addr(__cpa_addr(cpa, 0)); + end =3D fix_addr(__cpa_addr(cpa, cpa->numpages)); + if (cpa->force_flush_all) + end =3D TLB_FLUSH_ALL; + + flush_tlb_kernel_range(start, end); =20 if (!cache) goto collapse_large_pages; --=20 2.49.0 From 
nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E73C14A3C4B for ; Tue, 6 May 2025 00:48:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492506; cv=none; b=Gdy6DRt4Rl2Mwqz+x9bJplUfeJVRKJoU/KOG6XMYYvBScLDAvFo2oCN+RTV8dUYn5feRZklmqpzLyvL9RV97FIjKJzsrwys4SiEm9NKdi/DGRTvdzejjCProv4D0IDOeKlW7que1hMuV0JlE4QjeK6MZVtDnPqrfGdHXoXNfF0c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492506; c=relaxed/simple; bh=8Rv2NThq3GpIF740ywbYkJK48AvMiLVuweYORk9NC8U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=TFNI2Li46iOZvAH++jS5vzTkwbfDm2jNoMEd1hXGpR6nm35VV0amTPtQeqQilDHO9l8hy7mfpg8VPXLszENSUX1zN+3I37wIMWzTM3HReZXZ76cJXdkxg+VKsoY6Hurv7ExPGSw6RwPjgQjqpUBABa1QP+OYi0miumLotVWkf0k= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3gGA; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Rik van Riel , Yu-cheng Yu , Rik van Riel Subject: 
[RFC PATCH 6/9] x86/apic: Introduce Remote Action Request Operations Date: Mon, 5 May 2025 20:37:44 -0400 Message-ID: <20250506003811.92405-7-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Rik van Riel RAR TLB flushing is started by sending a command to the APIC. This patch adds Remote Action Request commands. [riel: move some things around to account for 6 years of changes] Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/include/asm/apicdef.h | 1 + arch/x86/include/asm/irq_vectors.h | 5 +++++ arch/x86/include/asm/smp.h | 15 +++++++++++++++ arch/x86/kernel/apic/ipi.c | 23 +++++++++++++++++++---- arch/x86/kernel/apic/local.h | 3 +++ arch/x86/kernel/smp.c | 3 +++ 6 files changed, 46 insertions(+), 4 deletions(-) diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h index 094106b6a538..b152d45af91a 100644 --- a/arch/x86/include/asm/apicdef.h +++ b/arch/x86/include/asm/apicdef.h @@ -92,6 +92,7 @@ #define APIC_DM_LOWEST 0x00100 #define APIC_DM_SMI 0x00200 #define APIC_DM_REMRD 0x00300 +#define APIC_DM_RAR 0x00300 #define APIC_DM_NMI 0x00400 #define APIC_DM_INIT 0x00500 #define APIC_DM_STARTUP 0x00600 diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_= vectors.h index 47051871b436..c417b0015304 100644 --- a/arch/x86/include/asm/irq_vectors.h +++ b/arch/x86/include/asm/irq_vectors.h @@ -103,6 +103,11 @@ */ #define POSTED_MSI_NOTIFICATION_VECTOR 0xeb =20 +/* + * RAR (remote action request) TLB flush + */ +#define RAR_VECTOR 0xe0 + #define NR_VECTORS 256 =20 #ifdef CONFIG_X86_LOCAL_APIC diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h index 
0c1c68039d6f..1ab9f5fcac8a 100644 --- a/arch/x86/include/asm/smp.h +++ b/arch/x86/include/asm/smp.h @@ -40,6 +40,9 @@ struct smp_ops { =20 void (*send_call_func_ipi)(const struct cpumask *mask); void (*send_call_func_single_ipi)(int cpu); + + void (*send_rar_ipi)(const struct cpumask *mask); + void (*send_rar_single_ipi)(int cpu); }; =20 /* Globals due to paravirt */ @@ -100,6 +103,16 @@ static inline void arch_send_call_function_ipi_mask(co= nst struct cpumask *mask) smp_ops.send_call_func_ipi(mask); } =20 +static inline void arch_send_rar_single_ipi(int cpu) +{ + smp_ops.send_rar_single_ipi(cpu); +} + +static inline void arch_send_rar_ipi_mask(const struct cpumask *mask) +{ + smp_ops.send_rar_ipi(mask); +} + void cpu_disable_common(void); void native_smp_prepare_boot_cpu(void); void smp_prepare_cpus_common(void); @@ -120,6 +133,8 @@ void __noreturn mwait_play_dead(unsigned int eax_hint); void native_smp_send_reschedule(int cpu); void native_send_call_func_ipi(const struct cpumask *mask); void native_send_call_func_single_ipi(int cpu); +void native_send_rar_ipi(const struct cpumask *mask); +void native_send_rar_single_ipi(int cpu); =20 asmlinkage __visible void smp_reboot_interrupt(void); __visible void smp_reschedule_interrupt(struct pt_regs *regs); diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c index 98a57cb4aa86..e5e9fc08f86c 100644 --- a/arch/x86/kernel/apic/ipi.c +++ b/arch/x86/kernel/apic/ipi.c @@ -79,7 +79,7 @@ void native_send_call_func_single_ipi(int cpu) __apic_send_IPI(cpu, CALL_FUNCTION_SINGLE_VECTOR); } =20 -void native_send_call_func_ipi(const struct cpumask *mask) +static void do_native_send_ipi(const struct cpumask *mask, int vector) { if (static_branch_likely(&apic_use_ipi_shorthand)) { unsigned int cpu =3D smp_processor_id(); @@ -88,14 +88,19 @@ void native_send_call_func_ipi(const struct cpumask *ma= sk) goto sendmask; =20 if (cpumask_test_cpu(cpu, mask)) - __apic_send_IPI_all(CALL_FUNCTION_VECTOR); + 
__apic_send_IPI_all(vector); else if (num_online_cpus() > 1) - __apic_send_IPI_allbutself(CALL_FUNCTION_VECTOR); + __apic_send_IPI_allbutself(vector); return; } =20 sendmask: - __apic_send_IPI_mask(mask, CALL_FUNCTION_VECTOR); + __apic_send_IPI_mask(mask, vector); +} + +void native_send_call_func_ipi(const struct cpumask *mask) +{ + do_native_send_ipi(mask, CALL_FUNCTION_VECTOR); } =20 void apic_send_nmi_to_offline_cpu(unsigned int cpu) @@ -106,6 +111,16 @@ void apic_send_nmi_to_offline_cpu(unsigned int cpu) return; apic->send_IPI(cpu, NMI_VECTOR); } + +void native_send_rar_single_ipi(int cpu) +{ + apic->send_IPI_mask(cpumask_of(cpu), RAR_VECTOR); +} + +void native_send_rar_ipi(const struct cpumask *mask) +{ + do_native_send_ipi(mask, RAR_VECTOR); +} #endif /* CONFIG_SMP */ =20 static inline int __prepare_ICR2(unsigned int mask) diff --git a/arch/x86/kernel/apic/local.h b/arch/x86/kernel/apic/local.h index bdcf609eb283..833669174267 100644 --- a/arch/x86/kernel/apic/local.h +++ b/arch/x86/kernel/apic/local.h @@ -38,6 +38,9 @@ static inline unsigned int __prepare_ICR(unsigned int sho= rtcut, int vector, case NMI_VECTOR: icr |=3D APIC_DM_NMI; break; + case RAR_VECTOR: + icr |=3D APIC_DM_RAR; + break; } return icr; } diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 18266cc3d98c..2c51ed6aaf03 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -297,5 +297,8 @@ struct smp_ops smp_ops =3D { =20 .send_call_func_ipi =3D native_send_call_func_ipi, .send_call_func_single_ipi =3D native_send_call_func_single_ipi, + + .send_rar_ipi =3D native_send_rar_ipi, + .send_rar_single_ipi =3D native_send_rar_single_ipi, }; EXPORT_SYMBOL_GPL(smp_ops); --=20 2.49.0 From nobody Tue Dec 16 07:41:39 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 688F74A4E62 for ; Tue, 6 May 2025 
00:48:42 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492525; cv=none; b=a28ByPSIDPv8vYS2eSevfw0jk/RlNb4MNLAsVoPDilXZi40krx98r7gUykaw/t/qbJeXswq6oVVrxm5s9gFyF6doNE7OOoT3FWAyjMty5b7uJrekcynaR7UIqZH117GhcQRp1ZNs9fUCD9+zT5qJ4tVnr6Onz6BlJfmrQdiA/yA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746492525; c=relaxed/simple; bh=6V+cb9MTePAxf7N7XVuHjske1yK2Lk8LFcjCNXlCz3I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=AEprnuJcMMBYyMn2fHqr69ClyvS1+dgDzFMViNk6OkJQO3vrBLYe+3jhGCQTdXiPM9ISixWIZS04C9GFjKUGVaEtc0hbeo9kSHTFat8Wqm/HlX7P8VsHOQRa/A+Ih2q0erJZTD/fP1Mu+CbjqiTUeTz2fM0EeGW5u/I9bHARp6w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uC6K6-000000000IF-3ncU; Mon, 05 May 2025 20:38:14 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, Yu-cheng Yu , Rik van Riel Subject: [RFC PATCH 7/9] x86/mm: Introduce Remote Action Request Date: Mon, 5 May 2025 20:37:45 -0400 Message-ID: <20250506003811.92405-8-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250506003811.92405-1-riel@surriel.com> References: <20250506003811.92405-1-riel@surriel.com> Precedence: bulk 
X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Yu-cheng Yu Remote Action Request (RAR) is a TLB flushing broadcast facility. To start a TLB flush, the initiator CPU creates a RAR payload and sends a command to the APIC. The receiving CPUs automatically flush TLBs as specified in the payload without the kernel's involvement. [ riel: - add pcid parameter to smp_call_rar_many so other mms can be flushed - ensure get_payload only allocates valid indices - make sure rar_cpu_init does not write reserved bits - fix overflow in range vs full flush decision ] Signed-off-by: Yu-cheng Yu Signed-off-by: Rik van Riel --- arch/x86/include/asm/rar.h | 69 +++++++++++ arch/x86/kernel/cpu/common.c | 4 + arch/x86/mm/Makefile | 1 + arch/x86/mm/rar.c | 226 +++++++++++++++++++++++++++++++++++ 4 files changed, 300 insertions(+) create mode 100644 arch/x86/include/asm/rar.h create mode 100644 arch/x86/mm/rar.c diff --git a/arch/x86/include/asm/rar.h b/arch/x86/include/asm/rar.h new file mode 100644 index 000000000000..b5ba856fcaa8 --- /dev/null +++ b/arch/x86/include/asm/rar.h @@ -0,0 +1,69 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _ASM_X86_RAR_H +#define _ASM_X86_RAR_H + +/* + * RAR payload types + */ +#define RAR_TYPE_INVPG 0 +#define RAR_TYPE_INVPG_NO_CR3 1 +#define RAR_TYPE_INVPCID 2 +#define RAR_TYPE_INVEPT 3 +#define RAR_TYPE_INVVPID 4 +#define RAR_TYPE_WRMSR 5 + +/* + * Subtypes for RAR_TYPE_INVLPG + */ +#define RAR_INVPG_ADDR 0 /* address specific */ +#define RAR_INVPG_ALL 2 /* all, include global */ +#define RAR_INVPG_ALL_NO_GLOBAL 3 /* all, exclude global */ + +/* + * Subtypes for RAR_TYPE_INVPCID + */ +#define RAR_INVPCID_ADDR 0 /* address specific */ +#define RAR_INVPCID_PCID 1 /* all of PCID */ +#define RAR_INVPCID_ALL 2 /* all, include global */ +#define RAR_INVPCID_ALL_NO_GLOBAL 3 /* all, 
exclude global */ + +/* + * Page size for RAR_TYPE_INVLPG + */ +#define RAR_INVLPG_PAGE_SIZE_4K 0 +#define RAR_INVLPG_PAGE_SIZE_2M 1 +#define RAR_INVLPG_PAGE_SIZE_1G 2 + +/* + * Max number of pages per payload + */ +#define RAR_INVLPG_MAX_PAGES 63 + +typedef struct { + uint64_t for_sw : 8; + uint64_t type : 8; + uint64_t must_be_zero_1 : 16; + uint64_t subtype : 3; + uint64_t page_size: 2; + uint64_t num_pages : 6; + uint64_t must_be_zero_2 : 21; + + uint64_t must_be_zero_3; + + /* + * Starting address + */ + uint64_t initiator_cr3; + uint64_t linear_address; + + /* + * Padding + */ + uint64_t padding[4]; +} rar_payload_t; + +void rar_cpu_init(void); +void smp_call_rar_many(const struct cpumask *mask, u16 pcid, + unsigned long start, unsigned long end); + +#endif /* _ASM_X86_RAR_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 5666620e7153..75b43db0b129 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -71,6 +71,7 @@ #include #include #include +#include =20 #include "cpu.h" =20 @@ -2395,6 +2396,9 @@ void cpu_init(void) if (is_uv_system()) uv_cpu_init(); =20 + if (cpu_feature_enabled(X86_FEATURE_RAR)) + rar_cpu_init(); + load_fixmap_gdt(cpu); } =20 diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index cebe5812d78d..d49d16412569 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -54,6 +54,7 @@ obj-$(CONFIG_ACPI_NUMA) +=3D srat.o obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) +=3D pkeys.o obj-$(CONFIG_RANDOMIZE_MEMORY) +=3D kaslr.o obj-$(CONFIG_MITIGATION_PAGE_TABLE_ISOLATION) +=3D pti.o +obj-$(CONFIG_BROADCAST_TLB_FLUSH) +=3D rar.o =20 obj-$(CONFIG_X86_MEM_ENCRYPT) +=3D mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) +=3D mem_encrypt_amd.o diff --git a/arch/x86/mm/rar.c b/arch/x86/mm/rar.c new file mode 100644 index 000000000000..77a334f1e212 --- /dev/null +++ b/arch/x86/mm/rar.c @@ -0,0 +1,226 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RAR Tlb shootdown + */ + +#include 
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+#include
+
+static DEFINE_PER_CPU(int, rar_lock);
+static DEFINE_PER_CPU(struct cpumask, rar_cpu_mask);
+
+#define RAR_ACTION_OK		0x00
+#define RAR_ACTION_START	0x01
+#define RAR_ACTION_ACKED	0x02
+#define RAR_ACTION_FAIL		0x80
+
+#define RAR_MAX_PAYLOADS	32UL
+
+static unsigned long rar_in_use = ~(RAR_MAX_PAYLOADS - 1);
+static rar_payload_t rar_payload[RAR_MAX_PAYLOADS] __page_aligned_bss;
+static DEFINE_PER_CPU_ALIGNED(u64[(RAR_MAX_PAYLOADS + 8) / 8], rar_action);
+
+static __always_inline void lock(int *lock)
+{
+	smp_cond_load_acquire(lock, !(VAL & 1));
+	*lock |= 1;
+
+	/*
+	 * prevent CPU from reordering the above assignment
+	 * to ->flags with any subsequent assignments to other
+	 * fields of the specified call_single_data structure:
+	 */
+	smp_wmb();
+}
+
+static __always_inline void unlock(int *lock)
+{
+	WARN_ON(!(*lock & 1));
+
+	/*
+	 * ensure we're all done before releasing data:
+	 */
+	smp_store_release(lock, 0);
+}
+
+static unsigned long get_payload(void)
+{
+	while (1) {
+		unsigned long bit;
+
+		/*
+		 * Find a free bit and confirm it with
+		 * test_and_set_bit() below.
+		 */
+		bit = ffz(READ_ONCE(rar_in_use));
+
+		if (bit >= RAR_MAX_PAYLOADS)
+			continue;
+
+		if (!test_and_set_bit((long)bit, &rar_in_use))
+			return bit;
+	}
+}
+
+static void free_payload(unsigned long idx)
+{
+	clear_bit(idx, &rar_in_use);
+}
+
+static void set_payload(unsigned long idx, u16 pcid, unsigned long start,
+			uint32_t pages)
+{
+	rar_payload_t *p = &rar_payload[idx];
+
+	p->must_be_zero_1 = 0;
+	p->must_be_zero_2 = 0;
+	p->must_be_zero_3 = 0;
+	p->page_size = RAR_INVLPG_PAGE_SIZE_4K;
+	p->type = RAR_TYPE_INVPCID;
+	p->num_pages = pages;
+	p->initiator_cr3 = pcid;
+	p->linear_address = start;
+
+	if (pcid) {
+		/* RAR invalidation of the mapping of a specific process.
*/
+		if (pages >= RAR_INVLPG_MAX_PAGES)
+			p->subtype = RAR_INVPCID_PCID;
+		else
+			p->subtype = RAR_INVPCID_ADDR;
+	} else {
+		/*
+		 * Unfortunately RAR_INVPCID_ADDR excludes global translations.
+		 * Always do a full flush for kernel invalidations.
+		 */
+		p->subtype = RAR_INVPCID_ALL;
+	}
+
+	smp_wmb();
+}
+
+static void set_action_entry(unsigned long idx, int target_cpu)
+{
+	u8 *bitmap = (u8 *)per_cpu(rar_action, target_cpu);
+
+	WRITE_ONCE(bitmap[idx], RAR_ACTION_START);
+}
+
+static void wait_for_done(unsigned long idx, int target_cpu)
+{
+	u8 status;
+	u8 *bitmap = (u8 *)per_cpu(rar_action, target_cpu);
+
+	status = READ_ONCE(bitmap[idx]);
+
+	while ((status != RAR_ACTION_OK) && (status != RAR_ACTION_FAIL)) {
+		cpu_relax();
+		status = READ_ONCE(bitmap[idx]);
+	}
+
+	WARN_ON_ONCE(bitmap[idx] == RAR_ACTION_FAIL);
+}
+
+void rar_cpu_init(void)
+{
+	u64 r;
+	u8 *bitmap;
+	int this_cpu = smp_processor_id();
+
+	per_cpu(rar_lock, this_cpu) = 0;
+	cpumask_clear(&per_cpu(rar_cpu_mask, this_cpu));
+
+	rdmsrl(MSR_IA32_RAR_INFO, r);
+	pr_info_once("RAR: support %lld payloads\n", r >> 32);
+
+	bitmap = (u8 *)per_cpu(rar_action, this_cpu);
+	memset(bitmap, 0, RAR_MAX_PAYLOADS);
+	wrmsrl(MSR_IA32_RAR_ACT_VEC, (u64)virt_to_phys(bitmap));
+	wrmsrl(MSR_IA32_RAR_PAYLOAD_BASE, (u64)virt_to_phys(rar_payload));
+
+	r = RAR_CTRL_ENABLE | RAR_CTRL_IGNORE_IF;
+	// reserved bits!!! r |= (RAR_VECTOR & 0xff);
+	wrmsrl(MSR_IA32_RAR_CTRL, r);
+}
+
+/*
+ * This is a modified version of smp_call_function_many() of kernel/smp.c,
+ * without a function pointer, because the RAR handler is the ucode.
+ */
+void smp_call_rar_many(const struct cpumask *mask, u16 pcid,
+		       unsigned long start, unsigned long end)
+{
+	unsigned long pages = (end - start + PAGE_SIZE) / PAGE_SIZE;
+	int cpu, next_cpu, this_cpu = smp_processor_id();
+	cpumask_t *dest_mask;
+	unsigned long idx;
+
+	if (pages > RAR_INVLPG_MAX_PAGES || end == TLB_FLUSH_ALL)
+		pages = RAR_INVLPG_MAX_PAGES;
+
+	/*
+	 * Can deadlock when called with interrupts disabled.
+	 * We allow cpu's that are not yet online though, as no one else can
+	 * send smp call function interrupt to this cpu and as such deadlocks
+	 * can't happen.
+	 */
+	WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled()
+		     && !oops_in_progress && !early_boot_irqs_disabled);
+
+	/* Try to fastpath. So, what's a CPU they want? Ignoring this one. */
+	cpu = cpumask_first_and(mask, cpu_online_mask);
+	if (cpu == this_cpu)
+		cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
+
+	/* No online cpus? We're done. */
+	if (cpu >= nr_cpu_ids)
+		return;
+
+	/* Do we have another CPU which isn't us? */
+	next_cpu = cpumask_next_and(cpu, mask, cpu_online_mask);
+	if (next_cpu == this_cpu)
+		next_cpu = cpumask_next_and(next_cpu, mask, cpu_online_mask);
+
+	/* Fastpath: do that cpu by itself.
*/
+	if (next_cpu >= nr_cpu_ids) {
+		lock(this_cpu_ptr(&rar_lock));
+		idx = get_payload();
+		set_payload(idx, pcid, start, pages);
+		set_action_entry(idx, cpu);
+		arch_send_rar_single_ipi(cpu);
+		wait_for_done(idx, cpu);
+		free_payload(idx);
+		unlock(this_cpu_ptr(&rar_lock));
+		return;
+	}
+
+	dest_mask = this_cpu_ptr(&rar_cpu_mask);
+	cpumask_and(dest_mask, mask, cpu_online_mask);
+	cpumask_clear_cpu(this_cpu, dest_mask);
+
+	/* Some callers race with other cpus changing the passed mask */
+	if (unlikely(!cpumask_weight(dest_mask)))
+		return;
+
+	lock(this_cpu_ptr(&rar_lock));
+	idx = get_payload();
+	set_payload(idx, pcid, start, pages);
+
+	for_each_cpu(cpu, dest_mask)
+		set_action_entry(idx, cpu);
+
+	/* Send a message to all CPUs in the map */
+	arch_send_rar_ipi_mask(dest_mask);
+
+	for_each_cpu(cpu, dest_mask)
+		wait_for_done(idx, cpu);
+
+	free_payload(idx);
+	unlock(this_cpu_ptr(&rar_lock));
+}
+EXPORT_SYMBOL(smp_call_rar_many);
-- 
2.49.0

From nobody Tue Dec 16 07:41:39 2025
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	Rik van Riel , Rik van Riel
Subject: [RFC PATCH 8/9] x86/mm: use RAR for kernel TLB flushes
Date: Mon, 5 May 2025 20:37:46 -0400
Message-ID: <20250506003811.92405-9-riel@surriel.com>
In-Reply-To: <20250506003811.92405-1-riel@surriel.com>
References: <20250506003811.92405-1-riel@surriel.com>

From: Rik van Riel

Use Intel RAR for kernel TLB flushes, when enabled.

Pass in PCID 0 to smp_call_rar_many() to flush kernel memory.

Unfortunately RAR_INVPCID_ADDR excludes global PTE mappings, so only
full flushes with RAR_INVPCID_ALL will flush kernel mappings.

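As a rough illustration of that rule, here is a minimal userspace sketch of
the subtype choice made by set_payload() in patch 7/9. The constants are
copied from rar.h in that patch; rar_pick_subtype() and
rar_pick_subtype_selftest() are hypothetical names used only for this
example and are not part of the series.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from arch/x86/include/asm/rar.h (patch 7/9). */
#define RAR_INVPCID_ADDR	0	/* address specific */
#define RAR_INVPCID_PCID	1	/* all of PCID */
#define RAR_INVPCID_ALL		2	/* all, include global */
#define RAR_INVLPG_MAX_PAGES	63

/* Hypothetical helper: mirrors the subtype decision in set_payload(). */
static unsigned int rar_pick_subtype(uint16_t pcid, unsigned long pages)
{
	/* PCID 0 means kernel mappings: global entries need a full flush. */
	if (!pcid)
		return RAR_INVPCID_ALL;

	/* Range too large for one payload: flush the whole PCID instead. */
	if (pages >= RAR_INVLPG_MAX_PAGES)
		return RAR_INVPCID_PCID;

	/* Small user range: address-specific invalidation. */
	return RAR_INVPCID_ADDR;
}

/* Hypothetical self-check, doubling as a usage example. */
static void rar_pick_subtype_selftest(void)
{
	assert(rar_pick_subtype(0, 1) == RAR_INVPCID_ALL);
	assert(rar_pick_subtype(0, 63) == RAR_INVPCID_ALL);
	assert(rar_pick_subtype(7, 62) == RAR_INVPCID_ADDR);
	assert(rar_pick_subtype(7, 63) == RAR_INVPCID_PCID);
}
```

With PCID 0 the page count is irrelevant: even a one-page kernel range ends
up as a full flush, which is why rar_kernel_range_flush() gains nothing from
passing a narrow range to smp_call_rar_many().
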
Signed-off-by: Rik van Riel
---
 arch/x86/mm/tlb.c | 38 ++++++++++++++++++++++++++++++++++++++
 1 file changed, 38 insertions(+)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 7c61bf11d472..a4f3941281b6 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -21,6 +21,7 @@
 #include
 #include
 #include
+#include
 #include
 
 #include "mm_internal.h"
@@ -1451,6 +1452,18 @@ static void do_flush_tlb_all(void *info)
 	__flush_tlb_all();
 }
 
+static void rar_full_flush(const cpumask_t *cpumask)
+{
+	guard(preempt)();
+	smp_call_rar_many(cpumask, 0, 0, TLB_FLUSH_ALL);
+	invpcid_flush_all();
+}
+
+static void rar_flush_all(void)
+{
+	rar_full_flush(cpu_online_mask);
+}
+
 void flush_tlb_all(void)
 {
 	count_vm_tlb_event(NR_TLB_REMOTE_FLUSH);
@@ -1458,6 +1471,8 @@ void flush_tlb_all(void)
 	/* First try (faster) hardware-assisted TLB invalidation. */
 	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		invlpgb_flush_all();
+	else if (cpu_feature_enabled(X86_FEATURE_RAR))
+		rar_flush_all();
 	else
 		/* Fall back to the IPI-based invalidation. */
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
@@ -1487,15 +1502,36 @@ static void do_kernel_range_flush(void *info)
 	struct flush_tlb_info *f = info;
 	unsigned long addr;
 
+	/*
+	 * With PTI kernel TLB entries in all PCIDs need to be flushed.
+	 * With RAR the PCID space becomes so large, we might as well flush it all.
+	 *
+	 * Either of the two by itself works with targeted flushes.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_RAR) &&
+	    cpu_feature_enabled(X86_FEATURE_PTI)) {
+		invpcid_flush_all();
+		return;
+	}
+
 	/* flush range by one by one 'invlpg' */
 	for (addr = f->start; addr < f->end; addr += PAGE_SIZE)
 		flush_tlb_one_kernel(addr);
 }
 
+static void rar_kernel_range_flush(struct flush_tlb_info *info)
+{
+	guard(preempt)();
+	smp_call_rar_many(cpu_online_mask, 0, info->start, info->end);
+	do_kernel_range_flush(info);
+}
+
 static void kernel_tlb_flush_all(struct flush_tlb_info *info)
 {
 	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		invlpgb_flush_all();
+	else if (cpu_feature_enabled(X86_FEATURE_RAR))
+		rar_flush_all();
 	else
 		on_each_cpu(do_flush_tlb_all, NULL, 1);
 }
@@ -1504,6 +1540,8 @@ static void kernel_tlb_flush_range(struct flush_tlb_info *info)
 {
 	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
 		invlpgb_kernel_range_flush(info);
+	else if (cpu_feature_enabled(X86_FEATURE_RAR))
+		rar_kernel_range_flush(info);
 	else
 		on_each_cpu(do_kernel_range_flush, info, 1);
 }
-- 
2.49.0

From nobody Tue Dec 16 07:41:39 2025
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com,
	dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org,
	tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com,
	Rik van Riel , Rik van Riel
Subject: [RFC PATCH 9/9] x86/mm: userspace & pageout flushing using Intel RAR
Date: Mon, 5 May 2025 20:37:47 -0400
Message-ID: <20250506003811.92405-10-riel@surriel.com>
In-Reply-To: <20250506003811.92405-1-riel@surriel.com>
References: <20250506003811.92405-1-riel@surriel.com>

From: Rik van Riel

Use Intel RAR to flush userspace mappings.

Because RAR flushes are targeted using a cpu bitmap, the rules are
a little bit different than for true broadcast TLB invalidation.

For true broadcast TLB invalidation, like done with AMD INVLPGB,
a global ASID always has up to date TLB entries on every CPU.
The context switch code never has to flush the TLB when switching to
a global ASID on any CPU with INVLPGB.

For RAR, the TLB mappings for a global ASID are kept up to date
only on CPUs within the mm_cpumask, which lazily follows the
threads around the system. The context switch code does not
need to flush the TLB if the CPU is in the mm_cpumask, and
the PCID used stays the same.

However, a CPU that falls outside of the mm_cpumask can have
out of date TLB mappings for this task. When switching to
that task on a CPU not in the mm_cpumask, the TLB does need
to be flushed.

Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/tlbflush.h |   9 ++-
 arch/x86/mm/tlb.c               | 119 +++++++++++++++++++++++++-------
 2 files changed, 99 insertions(+), 29 deletions(-)

diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h
index e9b81876ebe4..1940d51f95a9 100644
--- a/arch/x86/include/asm/tlbflush.h
+++ b/arch/x86/include/asm/tlbflush.h
@@ -250,7 +250,8 @@ static inline u16 mm_global_asid(struct mm_struct *mm)
 {
 	u16 asid;
 
-	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR))
 		return 0;
 
 	asid = smp_load_acquire(&mm->context.global_asid);
@@ -263,7 +264,8 @@ static inline u16 mm_global_asid(struct mm_struct *mm)
 
 static inline void mm_init_global_asid(struct mm_struct *mm)
 {
-	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR)) {
 		mm->context.global_asid = 0;
 		mm->context.asid_transition = false;
 	}
@@ -287,7 +289,8 @@ static inline void mm_clear_asid_transition(struct mm_struct *mm)
 
 static inline bool mm_in_asid_transition(struct mm_struct *mm)
 {
-	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR))
 		return false;
 
 	return mm && READ_ONCE(mm->context.asid_transition);
diff --git
a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index a4f3941281b6..724359be3f97 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -235,9 +235,11 @@ static struct new_asid choose_new_asid(struct mm_struct *next, u64 next_tlb_gen)
 
 	/*
 	 * TLB consistency for global ASIDs is maintained with hardware assisted
-	 * remote TLB flushing. Global ASIDs are always up to date.
+	 * remote TLB flushing. Global ASIDs are always up to date with INVLPGB,
+	 * and up to date for CPUs in the mm_cpumask with RAR.
 	 */
-	if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) {
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB) ||
+	    cpu_feature_enabled(X86_FEATURE_RAR)) {
 		u16 global_asid = mm_global_asid(next);
 
 		if (global_asid) {
@@ -300,7 +302,14 @@ static void reset_global_asid_space(void)
 {
 	lockdep_assert_held(&global_asid_lock);
 
-	invlpgb_flush_all_nonglobals();
+	/*
+	 * The global flush ensures that a freshly allocated global ASID
+	 * has no entries in any TLB, and can be used immediately.
+	 * With Intel RAR, the TLB may still need to be flushed at context
+	 * switch time when dealing with a CPU that was not in the mm_cpumask
+	 * for the process, and may have missed flushes along the way.
+	 */
+	flush_tlb_all();
 
 	/*
 	 * The TLB flush above makes it safe to re-use the previously
@@ -377,7 +386,7 @@ static void use_global_asid(struct mm_struct *mm)
 {
 	u16 asid;
 
-	guard(raw_spinlock_irqsave)(&global_asid_lock);
+	guard(raw_spinlock)(&global_asid_lock);
 
 	/* This process is already using broadcast TLB invalidation.
*/
 	if (mm_global_asid(mm))
@@ -403,13 +412,14 @@ static void use_global_asid(struct mm_struct *mm)
 
 void mm_free_global_asid(struct mm_struct *mm)
 {
-	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR))
 		return;
 
 	if (!mm_global_asid(mm))
 		return;
 
-	guard(raw_spinlock_irqsave)(&global_asid_lock);
+	guard(raw_spinlock)(&global_asid_lock);
 
 	/* The global ASID can be re-used only after flush at wrap-around. */
 #ifdef CONFIG_BROADCAST_TLB_FLUSH
@@ -427,7 +437,8 @@ static bool mm_needs_global_asid(struct mm_struct *mm, u16 asid)
 {
 	u16 global_asid = mm_global_asid(mm);
 
-	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR))
 		return false;
 
 	/* Process is transitioning to a global ASID */
@@ -445,7 +456,8 @@ static bool mm_needs_global_asid(struct mm_struct *mm, u16 asid)
  */
 static void consider_global_asid(struct mm_struct *mm)
 {
-	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB))
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+	    !cpu_feature_enabled(X86_FEATURE_RAR))
 		return;
 
 	/* Check every once in a while. */
@@ -499,7 +511,7 @@ static void finish_asid_transition(struct flush_tlb_info *info)
 	mm_clear_asid_transition(mm);
 }
 
-static void broadcast_tlb_flush(struct flush_tlb_info *info)
+static void invlpgb_tlb_flush(struct flush_tlb_info *info)
 {
 	bool pmd = info->stride_shift == PMD_SHIFT;
 	unsigned long asid = mm_global_asid(info->mm);
@@ -865,13 +877,6 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 			goto reload_tlb;
 		}
 
-		/*
-		 * Broadcast TLB invalidation keeps this ASID up to date
-		 * all the time.
-		 */
-		if (is_global_asid(prev_asid))
-			return;
-
 		/*
 		 * If the CPU is not in lazy TLB mode, we are just switching
 		 * from one thread in a process to another thread in the same
@@ -880,6 +885,15 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		if (!was_lazy)
 			return;
 
+		/*
+		 * Broadcast TLB invalidation keeps this ASID up to date
+		 * all the time with AMD INVLPGB. Intel RAR may need a TLB
+		 * flush if the CPU was in lazy TLB mode.
+		 */
+		if (cpu_feature_enabled(X86_FEATURE_INVLPGB) &&
+		    is_global_asid(prev_asid))
+			return;
+
 		/*
 		 * Read the tlb_gen to check whether a flush is needed.
 		 * If the TLB is up to date, just use it.
@@ -912,20 +926,27 @@ void switch_mm_irqs_off(struct mm_struct *unused, struct mm_struct *next,
 		this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING);
 		barrier();
 
-		/* Start receiving IPIs and then read tlb_gen (and LAM below) */
-		if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next)))
-			cpumask_set_cpu(cpu, mm_cpumask(next));
+		/* A TLB flush started during a context switch is harmless. */
 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
 
 		ns = choose_new_asid(next, next_tlb_gen);
+
+		/* Start receiving IPIs and RAR invalidations */
+		if (next != &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next))) {
+			cpumask_set_cpu(cpu, mm_cpumask(next));
+			/* CPUs outside mm_cpumask may be out of date.
*/
+			if (cpu_feature_enabled(X86_FEATURE_RAR))
+				ns.need_flush = true;
+		}
 	}
 
 reload_tlb:
 	new_lam = mm_lam_cr3_mask(next);
 	if (ns.need_flush) {
-		VM_WARN_ON_ONCE(is_global_asid(ns.asid));
-		this_cpu_write(cpu_tlbstate.ctxs[ns.asid].ctx_id, next->context.ctx_id);
-		this_cpu_write(cpu_tlbstate.ctxs[ns.asid].tlb_gen, next_tlb_gen);
+		if (is_dyn_asid(ns.asid)) {
+			this_cpu_write(cpu_tlbstate.ctxs[ns.asid].ctx_id, next->context.ctx_id);
+			this_cpu_write(cpu_tlbstate.ctxs[ns.asid].tlb_gen, next_tlb_gen);
+		}
 		load_new_mm_cr3(next->pgd, ns.asid, new_lam, true);
 
 		trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
@@ -1142,8 +1163,12 @@ static void flush_tlb_func(void *info)
 		loaded_mm_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid);
 	}
 
-	/* Broadcast ASIDs are always kept up to date with INVLPGB. */
-	if (is_global_asid(loaded_mm_asid))
+	/*
+	 * Broadcast ASIDs are always kept up to date with INVLPGB; with
+	 * Intel RAR IPI based flushes are used periodically to trim the
+	 * mm_cpumask. Make sure those flushes are processed here.
+	 */
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && is_global_asid(loaded_mm_asid))
 		return;
 
 	VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) !=
@@ -1363,6 +1388,33 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tlb_info, flush_tlb_info);
 static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx);
 #endif
 
+static void rar_tlb_flush(struct flush_tlb_info *info)
+{
+	unsigned long asid = mm_global_asid(info->mm);
+	u16 pcid = kern_pcid(asid);
+
+	/* Flush the remote CPUs. */
+	smp_call_rar_many(mm_cpumask(info->mm), pcid, info->start, info->end);
+	if (cpu_feature_enabled(X86_FEATURE_PTI))
+		smp_call_rar_many(mm_cpumask(info->mm), user_pcid(asid), info->start, info->end);
+
+	/* Flush the local TLB, if needed.
*/
+	if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(info->mm))) {
+		lockdep_assert_irqs_enabled();
+		local_irq_disable();
+		flush_tlb_func(info);
+		local_irq_enable();
+	}
+}
+
+static void broadcast_tlb_flush(struct flush_tlb_info *info)
+{
+	if (cpu_feature_enabled(X86_FEATURE_INVLPGB))
+		invlpgb_tlb_flush(info);
+	else /* Intel RAR */
+		rar_tlb_flush(info);
+}
+
 static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm,
 			unsigned long start, unsigned long end,
 			unsigned int stride_shift, bool freed_tables,
@@ -1423,15 +1475,22 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
 	info = get_flush_tlb_info(mm, start, end, stride_shift, freed_tables,
 				  new_tlb_gen);
 
+	/*
+	 * IPIs and RAR can be targeted to a cpumask. Periodically trim that
+	 * mm_cpumask by sending TLB flush IPIs, even when most TLB flushes
+	 * are done with RAR.
+	 */
+	if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) || !mm_global_asid(mm))
+		info->trim_cpumask = should_trim_cpumask(mm);
+
 	/*
 	 * flush_tlb_multi() is not optimized for the common case in which only
 	 * a local TLB flush is needed. Optimize this use-case by calling
 	 * flush_tlb_func_local() directly in this case.
 	 */
-	if (mm_global_asid(mm)) {
+	if (mm_global_asid(mm) && !info->trim_cpumask) {
 		broadcast_tlb_flush(info);
 	} else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) {
-		info->trim_cpumask = should_trim_cpumask(mm);
 		flush_tlb_multi(mm_cpumask(mm), info);
 		consider_global_asid(mm);
 	} else if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
@@ -1742,6 +1801,14 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
 	if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && batch->unmapped_pages) {
 		invlpgb_flush_all_nonglobals();
 		batch->unmapped_pages = false;
+	} else if (cpu_feature_enabled(X86_FEATURE_RAR) && cpumask_any(&batch->cpumask) < nr_cpu_ids) {
+		rar_full_flush(&batch->cpumask);
+		if (cpumask_test_cpu(cpu, &batch->cpumask)) {
+			lockdep_assert_irqs_enabled();
+			local_irq_disable();
+			invpcid_flush_all_nonglobals();
+			local_irq_enable();
+		}
 	} else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) {
 		flush_tlb_multi(&batch->cpumask, info);
 	} else if (cpumask_test_cpu(cpu, &batch->cpumask)) {
-- 
2.49.0
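
The context-switch rule stated in the 9/9 commit message can be summarized
in a small userspace model. This is only a sketch of the rule as described
there, not code from the series: enum bcast_kind,
global_asid_switch_needs_flush() and its selftest are hypothetical names,
and the model deliberately ignores the lazy TLB case that
switch_mm_irqs_off() also handles.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two hardware-assisted flush mechanisms. */
enum bcast_kind { BCAST_INVLPGB, BCAST_RAR };

/*
 * Hypothetical model: must the TLB be flushed when switching to an mm
 * that uses a global ASID?
 */
static bool global_asid_switch_needs_flush(enum bcast_kind kind,
					   bool cpu_in_mm_cpumask,
					   bool pcid_unchanged)
{
	/* INVLPGB reaches every CPU, so a global ASID is always current. */
	if (kind == BCAST_INVLPGB)
		return false;

	/* RAR only reaches CPUs in mm_cpumask; anything else may be stale. */
	return !(cpu_in_mm_cpumask && pcid_unchanged);
}

/* Hypothetical self-check, doubling as a usage example. */
static void global_asid_switch_selftest(void)
{
	assert(!global_asid_switch_needs_flush(BCAST_INVLPGB, false, false));
	assert(!global_asid_switch_needs_flush(BCAST_RAR, true, true));
	assert(global_asid_switch_needs_flush(BCAST_RAR, false, true));
	assert(global_asid_switch_needs_flush(BCAST_RAR, true, false));
}
```

In the patch itself the RAR case corresponds to setting ns.need_flush when
the CPU is newly added to mm_cpumask(next) in switch_mm_irqs_off().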