From nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E0551C3306 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703059; cv=none; b=tghp/gvl3LPdJBLnT/rVRIG8hZJ7dMiIUBZtIFMFMVSE3zAMB5PJCsm2fPXPmaQ/P+a1Zqovkfpy1Z/5B+xe0EBsWpoIpgY4ceH20imcqRwhySxCzfKvLrvQ00KqUu2QenxTXVXXCvpMEABJ+UWaobLf416S5r65enpRT0RcNUM=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703059; c=relaxed/simple; bh=tPmLpGvLJtw7+JasLKgDCdEkuGhtqgstYWemAQ/fUSY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=fjCVxcVtV/AQVcUHeWuLcZZZo7Mq21iaN1HaFtKsPR65MDfyUytG2wBG+EvKVs3C5guuwfQ5s4SNl6Opj9PNa8oLfwWB1RS0rZg3kDBdSbyV+8oONv4IuXDZxpV4F4ykW2iYN9yWVaq7dYZuMtUwOo+9xHk7SD3nTonsnIO9d6Y=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOc-000000000aB-3wmG; Mon, 19 May 2025 21:03:54 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Yu-cheng Yu , Rik van
Riel
Subject: [RFC v2 1/9] x86/mm: Introduce MSR_IA32_CORE_CAPABILITIES
Date: Mon, 19 May 2025 21:02:26 -0400
Message-ID: <20250520010350.1740223-2-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Yu-cheng Yu

MSR_IA32_CORE_CAPABILITIES indicates the existence of other MSRs.
Bit[1] indicates Remote Action Request (RAR) TLB registers.

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/msr-index.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index b7dded3c8113..c848dd4bfceb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -220,6 +220,12 @@
 						 * their affected status.
 						 */
 
+#define MSR_IA32_CORE_CAPABILITIES	0x000000cf
+#define CORE_CAP_RAR			BIT(1)	/*
+						 * Remote Action Request. Used to directly
+						 * flush the TLB on remote CPUs.
+						 */
+
 #define MSR_IA32_FLUSH_CMD		0x0000010b
 #define L1D_FLUSH			BIT(0)	/*
 						 * Writeback and invalidate the
-- 
2.49.0

From nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E34B1D5ADE for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=H5WBibpDPpCXoFW+etIL5WntJUmOjHcDbt52A51XN3xKpcrSi2VVGmn5JW9gXs4kbOA5Z8LKj6LbkTqDiVjxlbwjSJhnUKpGis1EuJrICNFFAXmsDw8rHsjnyBOjRQjlaXo99wapSPIvlqzN4BYDBUHh5Vt87nyc+qCsADzTAes=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=GYOrWHOgXKndKAS+zujXkprS9J2OU3xFf/JcbdsGsUA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=qikK9UWqQdYuLMlYPzLEw5G0NNsP3s5SMnq7DY2YCHZY8OJAmRSOCeTQfE3SirQGKnRm1ztL4bQy6KfBIUTLfuGaPuZ9MQpQ9q9wKrSv5rjK3JFP5brlDyHkbltjuZSBsCx/xKHyrCKCcXEV0dpUNfW0WM288omKn3Bn8NFNuZo=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOc-000000000aB-42pv; Mon, 19 May 2025 21:03:54 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org,
peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Yu-cheng Yu , Rik van Riel
Subject: [RFC v2 2/9] x86/mm: Introduce Remote Action Request MSRs
Date: Mon, 19 May 2025 21:02:27 -0400
Message-ID: <20250520010350.1740223-3-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Yu-cheng Yu

Remote Action Request (RAR) is a TLB flushing broadcast facility.
This patch introduces RAR MSRs. RAR is introduced in later patches.

There are five RAR MSRs:

  MSR_IA32_CORE_CAPABILITIES
  MSR_IA32_RAR_CTRL
  MSR_IA32_RAR_ACT_VEC
  MSR_IA32_RAR_PAYLOAD_BASE
  MSR_IA32_RAR_INFO

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/msr-index.h | 11 +++++++++++
 1 file changed, 11 insertions(+)

diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index c848dd4bfceb..adff8f0dc7bb 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -122,6 +122,17 @@
 #define SNB_C3_AUTO_UNDEMOTE		(1UL << 27)
 #define SNB_C1_AUTO_UNDEMOTE		(1UL << 28)
 
+/*
+ * Remote Action Requests (RAR) MSRs
+ */
+#define MSR_IA32_RAR_CTRL		0x000000ed
+#define MSR_IA32_RAR_ACT_VEC		0x000000ee
+#define MSR_IA32_RAR_PAYLOAD_BASE	0x000000ef
+#define MSR_IA32_RAR_INFO		0x000000f0
+
+#define RAR_CTRL_ENABLE			BIT(31)
+#define RAR_CTRL_IGNORE_IF		BIT(30)
+
 #define MSR_MTRRcap			0x000000fe
 
 #define MSR_IA32_ARCH_CAPABILITIES	0x0000010a
-- 
2.49.0

From nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested)
by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E5681DF256 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703057; cv=none; b=NAba6nZOSV197uVFPb1GT0igHzBUN4DNEZYHyKnn9oZusr1Dei9Lk3VaO3yTsjWyF0oxsovFiDqN3PYMn51jW88lBctvtjh49HtMU3NgABPXQ0Lt2+VI2KhOF0TZquezmqRWbJf2td99uOc5Z3nv+eE2uXd8NqBxakG38bewDN8=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703057; c=relaxed/simple; bh=71WvzsdPoyrjXq8l+mrusg+sx3T4Ajw8dHlsaDfvdQE=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=litI1QRFWIILI7YnaxmKrkVucXPjuxHWKvx4/xTPDu/SJGHSP/nV10uzcbXh0zhO7A6ixWf/NPM7KlYZ5mZWXUgEUVw40ye3F3dgrEzhL8+F2u7CwueNoxlZM5KPH3r7/myXzoYY7vPSlkcAN1Q6tK1QOvncUvNzRLcsDhZtRR4=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOc-000000000aB-48rl; Mon, 19 May 2025 21:03:54 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Rik van Riel
Subject: [RFC v2 3/9] x86/mm: enable BROADCAST_TLB_FLUSH on Intel, too
Date: Mon, 19 May 2025 21:02:28 -0400
Message-ID: <20250520010350.1740223-4-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Rik van Riel

Much of the code for Intel RAR and AMD INVLPGB is shared.

Place both under the same config option.

Signed-off-by: Rik van Riel
---
 arch/x86/Kconfig.cpu | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig.cpu b/arch/x86/Kconfig.cpu
index f928cf6e3252..f9cdd145abba 100644
--- a/arch/x86/Kconfig.cpu
+++ b/arch/x86/Kconfig.cpu
@@ -360,7 +360,7 @@ menuconfig PROCESSOR_SELECT
 
 config BROADCAST_TLB_FLUSH
 	def_bool y
-	depends on CPU_SUP_AMD && 64BIT
+	depends on (CPU_SUP_AMD || CPU_SUP_INTEL) && 64BIT
 
 config CPU_SUP_INTEL
 	default y
-- 
2.49.0

From nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E3ED1DC9B1 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=j4KiS5LIzdB9Aldbg+1LpHQQQmXPmM0H7f/1W/uKrxraDZ4evLWM9aKlJ9cMNAPUP3oCHieKv/IltiRj36R+TXAU7a/RzvJe2W5Ehsm3HaMngxTR7x0SM4Yb4cAWg3MkunnQhZzN75yu3p2lVlcc5VLXLl8RQDQ+2VFCcoOb1N4=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=X8dh9hO3CCUq3DhRKU19kb/HWl+KRx3RIQxdqsZIzzA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version;
b=XjzegOGQaofVkhyZSfdka6Yq9OrbPqXZLPDGJRCxTDkodO6/QO6iSwVp6QbXirqnyqfzxUILcBWgMciFJoGcO6mizNSWQZIESXF5HoOMh00goX3ENVIoZ34eFRfWogELpUbWgVfRMnjTrPdauPWNE9KFXbQZvRflAE4AiRPBnIw=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-02i9; Mon, 19 May 2025 21:03:55 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Yu-cheng Yu , Rik van Riel
Subject: [RFC v2 4/9] x86/mm: Introduce X86_FEATURE_RAR
Date: Mon, 19 May 2025 21:02:29 -0400
Message-ID: <20250520010350.1740223-5-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Rik van Riel

Introduce X86_FEATURE_RAR and enumeration of the feature.
[riel: moved initialization to intel.c and disabling to Kconfig.cpufeatures]

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/Kconfig.cpufeatures       |  4 ++++
 arch/x86/include/asm/cpufeatures.h |  2 +-
 arch/x86/kernel/cpu/common.c       | 13 +++++++++++++
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/Kconfig.cpufeatures b/arch/x86/Kconfig.cpufeatures
index 250c10627ab3..7d459b5f47f7 100644
--- a/arch/x86/Kconfig.cpufeatures
+++ b/arch/x86/Kconfig.cpufeatures
@@ -195,3 +195,7 @@ config X86_DISABLED_FEATURE_SEV_SNP
 config X86_DISABLED_FEATURE_INVLPGB
 	def_bool y
 	depends on !BROADCAST_TLB_FLUSH
+
+config X86_DISABLED_FEATURE_RAR
+	def_bool y
+	depends on !BROADCAST_TLB_FLUSH
diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 5b50e0e35129..0729c2d54109 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -76,7 +76,7 @@
 #define X86_FEATURE_K8			( 3*32+ 4) /* Opteron, Athlon64 */
 #define X86_FEATURE_ZEN5		( 3*32+ 5) /* CPU based on Zen5 microarchitecture */
 #define X86_FEATURE_ZEN6		( 3*32+ 6) /* CPU based on Zen6 microarchitecture */
-/* Free					( 3*32+ 7) */
+#define X86_FEATURE_RAR			( 3*32+ 7) /* Intel Remote Action Request */
 #define X86_FEATURE_CONSTANT_TSC	( 3*32+ 8) /* "constant_tsc" TSC ticks at a constant rate */
 #define X86_FEATURE_UP			( 3*32+ 9) /* "up" SMP kernel running on UP */
 #define X86_FEATURE_ART			( 3*32+10) /* "art" Always running timer (ART) */
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 8feb8fd2957a..dd662c42f510 100644
--- a/arch/x86/kernel/cpu/common.c
+++ b/arch/x86/kernel/cpu/common.c
@@ -1545,6 +1545,18 @@ static void __init cpu_set_bug_bits(struct cpuinfo_x86 *c)
 	setup_force_cpu_bug(X86_BUG_L1TF);
 }
 
+static void __init detect_rar(struct cpuinfo_x86 *c)
+{
+	u64 msr;
+
+	if (cpu_has(c, X86_FEATURE_CORE_CAPABILITIES)) {
+		rdmsrl(MSR_IA32_CORE_CAPABILITIES, msr);
+
+		if (msr & CORE_CAP_RAR)
+
			setup_force_cpu_cap(X86_FEATURE_RAR);
+	}
+}
+
 /*
  * The NOPL instruction is supposed to exist on all CPUs of family >= 6;
  * unfortunately, that's not true in practice because of early VIA
@@ -1771,6 +1783,7 @@ static void __init early_identify_cpu(struct cpuinfo_x86 *c)
 	setup_clear_cpu_cap(X86_FEATURE_LA57);
 
 	detect_nopl();
+	detect_rar(c);
 }
 
 void __init init_cpu_devs(void)
-- 
2.49.0

From nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E668254865 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=dUJ+8H8wurejf89GKjB/ia9v/uKN7WKyJdnFWReE4CgU0MMYjrSJ226DPbinl8LIRwV/QiHCbxdlWJQhgXy0wLcqDeSlx0mGAYOLODDfug2Le26EaiVzg/m3Hzhqnvrr2CVwgQlneNGbs+J8voR5PjFucnv94hsMgOLUf8MVz1c=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=BNRMadEtMm+CLc0PhVylpd+limEhSjVbfj0hHfE1AQo=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=rmiK68ov1XaGmQCucr54+EyxJBRMK32OaX1l6v8WXcxg9D6IsZsd8D3PAFvecvxtHat5CZXx5fZQOVL83GYmnIxhVCBewo2AwJTbMST1OIHA6FZ++pBAI7nTEJBD+ZPTYj2NzLpywbXDE1qPc7+Rc3TKdkg9eJySpKyDZVkH7Ys=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls
TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-08xY; Mon, 19 May 2025 21:03:55 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Yu-cheng Yu , Rik van Riel
Subject: [RFC v2 5/9] x86/mm: Change cpa_flush() to call flush_kernel_range() directly
Date: Mon, 19 May 2025 21:02:30 -0400
Message-ID: <20250520010350.1740223-6-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Rik van Riel

The function cpa_flush() calls flush_tlb_one_kernel() and
flush_tlb_all().

Replacing those calls with a single call to flush_tlb_kernel_range()
allows cpa_flush() to make use of INVLPGB or RAR without any
additional changes.

Initialize invlpgb_count_max to 1, since flush_tlb_kernel_range()
can now be called before invlpgb_count_max has been initialized
to the value read from CPUID.
[riel: remove now unused __cpa_flush_tlb]

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/kernel/cpu/amd.c    |  2 +-
 arch/x86/mm/pat/set_memory.c | 20 +++++++-------------
 2 files changed, 8 insertions(+), 14 deletions(-)

diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 93da466dfe2c..b2ad8d13211a 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -31,7 +31,7 @@
 
 #include "cpu.h"
 
-u16 invlpgb_count_max __ro_after_init;
+u16 invlpgb_count_max __ro_after_init = 1;
 
 static inline int rdmsrq_amd_safe(unsigned msr, u64 *p)
 {
diff --git a/arch/x86/mm/pat/set_memory.c b/arch/x86/mm/pat/set_memory.c
index 30ab4aced761..2454f5249329 100644
--- a/arch/x86/mm/pat/set_memory.c
+++ b/arch/x86/mm/pat/set_memory.c
@@ -399,15 +399,6 @@ static void cpa_flush_all(unsigned long cache)
 	on_each_cpu(__cpa_flush_all, (void *) cache, 1);
 }
 
-static void __cpa_flush_tlb(void *data)
-{
-	struct cpa_data *cpa = data;
-	unsigned int i;
-
-	for (i = 0; i < cpa->numpages; i++)
-		flush_tlb_one_kernel(fix_addr(__cpa_addr(cpa, i)));
-}
-
 static int collapse_large_pages(unsigned long addr, struct list_head *pgtables);
 
 static void cpa_collapse_large_pages(struct cpa_data *cpa)
@@ -444,6 +435,7 @@ static void cpa_collapse_large_pages(struct cpa_data *cpa)
 
 static void cpa_flush(struct cpa_data *cpa, int cache)
 {
+	unsigned long start, end;
 	unsigned int i;
 
 	BUG_ON(irqs_disabled() && !early_boot_irqs_disabled);
@@ -453,10 +445,12 @@ static void cpa_flush(struct cpa_data *cpa, int cache)
 		goto collapse_large_pages;
 	}
 
-	if (cpa->force_flush_all || cpa->numpages > tlb_single_page_flush_ceiling)
-		flush_tlb_all();
-	else
-		on_each_cpu(__cpa_flush_tlb, cpa, 1);
+	start = fix_addr(__cpa_addr(cpa, 0));
+	end = fix_addr(__cpa_addr(cpa, cpa->numpages));
+	if (cpa->force_flush_all)
+		end = TLB_FLUSH_ALL;
+
+	flush_tlb_kernel_range(start, end);
 
 	if (!cache)
 		goto collapse_large_pages;
-- 
2.49.0

From
nobody Mon Dec 15 21:27:40 2025
Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8E5E6253F07 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=kCS2bTtZFxnZVYM6Cpf1yF6y1a6FDSvJbf3U3AsJBPv4t80DthBYrbdS92aF8yl+lrkUW0nGj7KXyCnUZ0ZY7OaPLzRR0VZoOAjkPa+4KTSHC5r9GOVpYRl1RVNYfWkZtuJyBpkS4B+e36QwsBaf+uL/mPmrMx0jdJIXaSJafOc=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=8Rv2NThq3GpIF740ywbYkJK48AvMiLVuweYORk9NC8U=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=Odrpzjb/RHU429Scq158oz35lZyFLw0Qlarwlo7PIdJ0d8YF5W0OxhlG+mPe49l46Hj2G+xEe4yE3HEPhlCg32El/qbYTevOrJK1Q4eK+ZZYcAFTfb33f6yNYVY645U2tpxbPLyCHNqGTt5+niYB0NHaGarQO0AoYC7PMfivK8g=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-0G7W; Mon, 19 May 2025 21:03:55 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Yu-cheng Yu ,
Rik van Riel
Subject: [RFC v2 6/9] x86/apic: Introduce Remote Action Request Operations
Date: Mon, 19 May 2025 21:02:31 -0400
Message-ID: <20250520010350.1740223-7-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Rik van Riel

RAR TLB flushing is started by sending a command to the APIC.
This patch adds Remote Action Request commands.

[riel: move some things around to account for 6 years of changes]

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/apicdef.h     |  1 +
 arch/x86/include/asm/irq_vectors.h |  5 +++++
 arch/x86/include/asm/smp.h         | 15 +++++++++++++++
 arch/x86/kernel/apic/ipi.c         | 23 +++++++++++++++++++----
 arch/x86/kernel/apic/local.h       |  3 +++
 arch/x86/kernel/smp.c              |  3 +++
 6 files changed, 46 insertions(+), 4 deletions(-)

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 094106b6a538..b152d45af91a 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -92,6 +92,7 @@
 #define		APIC_DM_LOWEST		0x00100
 #define		APIC_DM_SMI		0x00200
 #define		APIC_DM_REMRD		0x00300
+#define		APIC_DM_RAR		0x00300
 #define		APIC_DM_NMI		0x00400
 #define		APIC_DM_INIT		0x00500
 #define		APIC_DM_STARTUP		0x00600
diff --git a/arch/x86/include/asm/irq_vectors.h b/arch/x86/include/asm/irq_vectors.h
index 47051871b436..c417b0015304 100644
--- a/arch/x86/include/asm/irq_vectors.h
+++ b/arch/x86/include/asm/irq_vectors.h
@@ -103,6 +103,11 @@
  */
 #define POSTED_MSI_NOTIFICATION_VECTOR	0xeb
 
+/*
+ * RAR (remote action request) TLB flush
+ */
+#define RAR_VECTOR			0xe0
+
 #define NR_VECTORS			 256
 
 #ifdef CONFIG_X86_LOCAL_APIC
diff --git a/arch/x86/include/asm/smp.h
b/arch/x86/include/asm/smp.h
index 0c1c68039d6f..1ab9f5fcac8a 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -40,6 +40,9 @@ struct smp_ops {
 
 	void (*send_call_func_ipi)(const struct cpumask *mask);
 	void (*send_call_func_single_ipi)(int cpu);
+
+	void (*send_rar_ipi)(const struct cpumask *mask);
+	void (*send_rar_single_ipi)(int cpu);
 };
 
 /* Globals due to paravirt */
@@ -100,6 +103,16 @@ static inline void arch_send_call_function_ipi_mask(const struct cpumask *mask)
 	smp_ops.send_call_func_ipi(mask);
 }
 
+static inline void arch_send_rar_single_ipi(int cpu)
+{
+	smp_ops.send_rar_single_ipi(cpu);
+}
+
+static inline void arch_send_rar_ipi_mask(const struct cpumask *mask)
+{
+	smp_ops.send_rar_ipi(mask);
+}
+
 void cpu_disable_common(void);
 void native_smp_prepare_boot_cpu(void);
 void smp_prepare_cpus_common(void);
@@ -120,6 +133,8 @@ void __noreturn mwait_play_dead(unsigned int eax_hint);
 void native_smp_send_reschedule(int cpu);
 void native_send_call_func_ipi(const struct cpumask *mask);
 void native_send_call_func_single_ipi(int cpu);
+void native_send_rar_ipi(const struct cpumask *mask);
+void native_send_rar_single_ipi(int cpu);
 
 asmlinkage __visible void smp_reboot_interrupt(void);
 __visible void smp_reschedule_interrupt(struct pt_regs *regs);
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index 98a57cb4aa86..e5e9fc08f86c 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -79,7 +79,7 @@ void native_send_call_func_single_ipi(int cpu)
 	__apic_send_IPI(cpu, CALL_FUNCTION_SINGLE_VECTOR);
 }
 
-void native_send_call_func_ipi(const struct cpumask *mask)
+static void do_native_send_ipi(const struct cpumask *mask, int vector)
 {
 	if (static_branch_likely(&apic_use_ipi_shorthand)) {
 		unsigned int cpu = smp_processor_id();
@@ -88,14 +88,19 @@ void native_send_call_func_ipi(const struct cpumask *mask)
 			goto sendmask;
 
 		if (cpumask_test_cpu(cpu, mask))
-
__apic_send_IPI_all(CALL_FUNCTION_VECTOR); + __apic_send_IPI_all(vector); else if (num_online_cpus() > 1) - __apic_send_IPI_allbutself(CALL_FUNCTION_VECTOR); + __apic_send_IPI_allbutself(vector); return; } =20 sendmask: - __apic_send_IPI_mask(mask, CALL_FUNCTION_VECTOR); + __apic_send_IPI_mask(mask, vector); +} + +void native_send_call_func_ipi(const struct cpumask *mask) +{ + do_native_send_ipi(mask, CALL_FUNCTION_VECTOR); } =20 void apic_send_nmi_to_offline_cpu(unsigned int cpu) @@ -106,6 +111,16 @@ void apic_send_nmi_to_offline_cpu(unsigned int cpu) return; apic->send_IPI(cpu, NMI_VECTOR); } + +void native_send_rar_single_ipi(int cpu) +{ + apic->send_IPI_mask(cpumask_of(cpu), RAR_VECTOR); +} + +void native_send_rar_ipi(const struct cpumask *mask) +{ + do_native_send_ipi(mask, RAR_VECTOR); +} #endif /* CONFIG_SMP */ =20 static inline int __prepare_ICR2(unsigned int mask) diff --git a/arch/x86/kernel/apic/local.h b/arch/x86/kernel/apic/local.h index bdcf609eb283..833669174267 100644 --- a/arch/x86/kernel/apic/local.h +++ b/arch/x86/kernel/apic/local.h @@ -38,6 +38,9 @@ static inline unsigned int __prepare_ICR(unsigned int sho= rtcut, int vector, case NMI_VECTOR: icr |=3D APIC_DM_NMI; break; + case RAR_VECTOR: + icr |=3D APIC_DM_RAR; + break; } return icr; } diff --git a/arch/x86/kernel/smp.c b/arch/x86/kernel/smp.c index 18266cc3d98c..2c51ed6aaf03 100644 --- a/arch/x86/kernel/smp.c +++ b/arch/x86/kernel/smp.c @@ -297,5 +297,8 @@ struct smp_ops smp_ops =3D { =20 .send_call_func_ipi =3D native_send_call_func_ipi, .send_call_func_single_ipi =3D native_send_call_func_single_ipi, + + .send_rar_ipi =3D native_send_rar_ipi, + .send_rar_single_ipi =3D native_send_rar_single_ipi, }; EXPORT_SYMBOL_GPL(smp_ops); --=20 2.49.0 From nobody Mon Dec 15 21:27:40 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with 
ESMTPS id 8E39B1D9A54 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=VFq7aJk9P6RhCCNowPCR6vtoHa9EYSGm0pZBUNU5qt2UNSTwLqSvMMybuBmgtOfk3MFb1gWmsSlzZgED5wFeGKCdJ54TSsQyYKfdRc+DiIDdXdV+Bxeh3xp1W4l8AvkA8sJZZo7DqCL48DZ5XThdifPjPZSJQJGizlSTgCo3pqg=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=5Rvkn2hH269ajlSSkiHddJB8MsDkB8k7A1C8M1wbv3I=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=lkZXpoH0tOmkgWcil9HjiflMJFqRkNR6pOEaI6UnWMB37QxgutoQLhpNy2U3QTwtC6AFVVDxQLsE+UctltTqmyhikvAcZGMEggjaijYodwMxYNyLLwzD/xZPqXE55JsXlOA7ge7AVnOkCBur8vnWeaBeQ8+bTInJPP2GZZwltmw=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147
Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com
Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com
Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-0Mtu; Mon, 19 May 2025 21:03:55 -0400
From: Rik van Riel
To: linux-kernel@vger.kernel.org
Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Yu-cheng Yu , Rik van Riel
Subject: [RFC v2 7/9] x86/mm: Introduce Remote Action Request
Date: Mon, 19 May 2025 21:02:32 -0400
Message-ID: <20250520010350.1740223-8-riel@surriel.com>
X-Mailer: git-send-email 2.49.0
In-Reply-To: <20250520010350.1740223-1-riel@surriel.com>
References: <20250520010350.1740223-1-riel@surriel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Transfer-Encoding: quoted-printable
Sender: riel@surriel.com
Content-Type: text/plain; charset="utf-8"

From: Yu-cheng Yu

Remote Action Request (RAR) is a TLB flushing broadcast facility.
To start a TLB flush, the initiator CPU creates a RAR payload and
sends a command to the APIC. The receiving CPUs automatically flush
TLBs as specified in the payload without the kernel's involvement.

[ riel: add pcid parameter to smp_call_rar_many so other mms can be flushed ]

Signed-off-by: Yu-cheng Yu
Signed-off-by: Rik van Riel
---
 arch/x86/include/asm/rar.h   |  69 +++++++++++++
 arch/x86/kernel/cpu/common.c |   4 +
 arch/x86/mm/Makefile         |   1 +
 arch/x86/mm/rar.c            | 195 +++++++++++++++++++++++++++++++++++
 4 files changed, 269 insertions(+)
 create mode 100644 arch/x86/include/asm/rar.h
 create mode 100644 arch/x86/mm/rar.c

diff --git a/arch/x86/include/asm/rar.h b/arch/x86/include/asm/rar.h
new file mode 100644
index 000000000000..78c039e40e81
--- /dev/null
+++ b/arch/x86/include/asm/rar.h
@@ -0,0 +1,69 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_RAR_H
+#define _ASM_X86_RAR_H
+
+/*
+ * RAR payload types
+ */
+#define RAR_TYPE_INVPG		0
+#define RAR_TYPE_INVPG_NO_CR3	1
+#define RAR_TYPE_INVPCID	2
+#define RAR_TYPE_INVEPT		3
+#define RAR_TYPE_INVVPID	4
+#define RAR_TYPE_WRMSR		5
+
+/*
+ * Subtypes for RAR_TYPE_INVLPG
+ */
+#define RAR_INVPG_ADDR		0 /* address specific */
+#define RAR_INVPG_ALL		2 /* all, include global */
+#define RAR_INVPG_ALL_NO_GLOBAL	3 /* all, exclude global */
+
+/*
+ * Subtypes for RAR_TYPE_INVPCID
+ */
+#define RAR_INVPCID_ADDR		0 /* address specific */
+#define RAR_INVPCID_PCID		1 /* all of PCID */
+#define RAR_INVPCID_ALL			2 /* all, include global */
+#define RAR_INVPCID_ALL_NO_GLOBAL	3 /* all, exclude global */
+
+/*
+ * Page size for RAR_TYPE_INVLPG
+ */
+#define
RAR_INVLPG_PAGE_SIZE_4K 0 +#define RAR_INVLPG_PAGE_SIZE_2M 1 +#define RAR_INVLPG_PAGE_SIZE_1G 2 + +/* + * Max number of pages per payload + */ +#define RAR_INVLPG_MAX_PAGES 63 + +struct rar_payload { + u64 for_sw : 8; + u64 type : 8; + u64 must_be_zero_1 : 16; + u64 subtype : 3; + u64 page_size : 2; + u64 num_pages : 6; + u64 must_be_zero_2 : 21; + + u64 must_be_zero_3; + + /* + * Starting address + */ + u64 initiator_cr3; + u64 linear_address; + + /* + * Padding + */ + u64 padding[4]; +}; + +void rar_cpu_init(void); +void smp_call_rar_many(const struct cpumask *mask, u16 pcid, + unsigned long start, unsigned long end); + +#endif /* _ASM_X86_RAR_H */ diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index dd662c42f510..b1e1b9afb2ac 100644 --- a/arch/x86/kernel/cpu/common.c +++ b/arch/x86/kernel/cpu/common.c @@ -71,6 +71,7 @@ #include #include #include +#include =20 #include "cpu.h" =20 @@ -2438,6 +2439,9 @@ void cpu_init(void) if (is_uv_system()) uv_cpu_init(); =20 + if (cpu_feature_enabled(X86_FEATURE_RAR)) + rar_cpu_init(); + load_fixmap_gdt(cpu); } =20 diff --git a/arch/x86/mm/Makefile b/arch/x86/mm/Makefile index 5b9908f13dcf..f36fc99e8b10 100644 --- a/arch/x86/mm/Makefile +++ b/arch/x86/mm/Makefile @@ -52,6 +52,7 @@ obj-$(CONFIG_ACPI_NUMA) +=3D srat.o obj-$(CONFIG_X86_INTEL_MEMORY_PROTECTION_KEYS) +=3D pkeys.o obj-$(CONFIG_RANDOMIZE_MEMORY) +=3D kaslr.o obj-$(CONFIG_MITIGATION_PAGE_TABLE_ISOLATION) +=3D pti.o +obj-$(CONFIG_BROADCAST_TLB_FLUSH) +=3D rar.o =20 obj-$(CONFIG_X86_MEM_ENCRYPT) +=3D mem_encrypt.o obj-$(CONFIG_AMD_MEM_ENCRYPT) +=3D mem_encrypt_amd.o diff --git a/arch/x86/mm/rar.c b/arch/x86/mm/rar.c new file mode 100644 index 000000000000..16dc9b889cbd --- /dev/null +++ b/arch/x86/mm/rar.c @@ -0,0 +1,195 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * RAR TLB shootdown + */ +#include +#include +#include +#include +#include +#include +#include + +static DEFINE_PER_CPU(struct cpumask, rar_cpu_mask); + +#define RAR_ACTION_OK 
0x00 +#define RAR_ACTION_START 0x01 +#define RAR_ACTION_ACKED 0x02 +#define RAR_ACTION_FAIL 0x80 + +#define RAR_MAX_PAYLOADS 32UL + +static unsigned long rar_in_use =3D ~((1UL << RAR_MAX_PAYLOADS) - 1); +static struct rar_payload rar_payload[RAR_MAX_PAYLOADS] __page_aligned_bss; +static DEFINE_PER_CPU_ALIGNED(u8[RAR_MAX_PAYLOADS], rar_action); + +static unsigned long get_payload(void) +{ + while (1) { + unsigned long bit; + + /* + * Find a free bit and confirm it with + * test_and_set_bit() below. + */ + bit =3D ffz(READ_ONCE(rar_in_use)); + + if (bit >=3D RAR_MAX_PAYLOADS) + continue; + + if (!test_and_set_bit((long)bit, &rar_in_use)) + return bit; + } +} + +static void free_payload(unsigned long idx) +{ + clear_bit(idx, &rar_in_use); +} + +static void set_payload(unsigned long idx, u16 pcid, unsigned long start, + uint32_t pages) +{ + struct rar_payload *p =3D &rar_payload[idx]; + + p->must_be_zero_1 =3D 0; + p->must_be_zero_2 =3D 0; + p->must_be_zero_3 =3D 0; + p->page_size =3D RAR_INVLPG_PAGE_SIZE_4K; + p->type =3D RAR_TYPE_INVPCID; + p->num_pages =3D pages; + p->initiator_cr3 =3D pcid; + p->linear_address =3D start; + + if (pcid) { + /* RAR invalidation of the mapping of a specific process. */ + if (pages >=3D RAR_INVLPG_MAX_PAGES) + p->subtype =3D RAR_INVPCID_PCID; + else + p->subtype =3D RAR_INVPCID_ADDR; + } else { + /* + * Unfortunately RAR_INVPCID_ADDR excludes global translations. + * Always do a full flush for kernel invalidations.
+ */ + p->subtype =3D RAR_INVPCID_ALL; + } + + smp_wmb(); +} + +static void set_action_entry(unsigned long idx, int target_cpu) +{ + u8 *bitmap =3D per_cpu(rar_action, target_cpu); + + WRITE_ONCE(bitmap[idx], RAR_ACTION_START); +} + +static void wait_for_done(unsigned long idx, int target_cpu) +{ + u8 status; + u8 *rar_actions =3D per_cpu(rar_action, target_cpu); + + status =3D READ_ONCE(rar_actions[idx]); + + while ((status !=3D RAR_ACTION_OK) && (status !=3D RAR_ACTION_FAIL)) { + cpu_relax(); + status =3D READ_ONCE(rar_actions[idx]); + } + + WARN_ON_ONCE(rar_actions[idx] =3D=3D RAR_ACTION_FAIL); +} + +void rar_cpu_init(void) +{ + u64 r; + u8 *bitmap; + int this_cpu =3D smp_processor_id(); + + cpumask_clear(&per_cpu(rar_cpu_mask, this_cpu)); + + rdmsrl(MSR_IA32_RAR_INFO, r); + pr_info_once("RAR: support %lld payloads\n", r >> 32); + + bitmap =3D (u8 *)per_cpu(rar_action, this_cpu); + memset(bitmap, 0, RAR_MAX_PAYLOADS); + wrmsrl(MSR_IA32_RAR_ACT_VEC, (u64)virt_to_phys(bitmap)); + wrmsrl(MSR_IA32_RAR_PAYLOAD_BASE, (u64)virt_to_phys(rar_payload)); + + r =3D RAR_CTRL_ENABLE | RAR_CTRL_IGNORE_IF; + // reserved bits!!! r |=3D (RAR_VECTOR & 0xff); + wrmsrl(MSR_IA32_RAR_CTRL, r); +} + +/* + * This is a modified version of smp_call_function_many() of kernel/smp.c, + * without a function pointer, because the RAR handler is the ucode. + */ +void smp_call_rar_many(const struct cpumask *mask, u16 pcid, + unsigned long start, unsigned long end) +{ + unsigned long pages =3D (end - start + PAGE_SIZE) / PAGE_SIZE; + int cpu, next_cpu, this_cpu =3D smp_processor_id(); + cpumask_t *dest_mask; + unsigned long idx; + + if (pages > RAR_INVLPG_MAX_PAGES || end =3D=3D TLB_FLUSH_ALL) + pages =3D RAR_INVLPG_MAX_PAGES; + + /* + * Can deadlock when called with interrupts disabled. + * We allow cpu's that are not yet online though, as no one else can + * send smp call function interrupt to this cpu and as such deadlocks + * can't happen. 
+ */ + WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() + && !oops_in_progress && !early_boot_irqs_disabled); + + /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */ + cpu =3D cpumask_first_and(mask, cpu_online_mask); + if (cpu =3D=3D this_cpu) + cpu =3D cpumask_next_and(cpu, mask, cpu_online_mask); + + /* No online cpus? We're done. */ + if (cpu >=3D nr_cpu_ids) + return; + + /* Do we have another CPU which isn't us? */ + next_cpu =3D cpumask_next_and(cpu, mask, cpu_online_mask); + if (next_cpu =3D=3D this_cpu) + next_cpu =3D cpumask_next_and(next_cpu, mask, cpu_online_mask); + + /* Fastpath: do that cpu by itself. */ + if (next_cpu >=3D nr_cpu_ids) { + idx =3D get_payload(); + set_payload(idx, pcid, start, pages); + set_action_entry(idx, cpu); + arch_send_rar_single_ipi(cpu); + wait_for_done(idx, cpu); + free_payload(idx); + return; + } + + dest_mask =3D this_cpu_ptr(&rar_cpu_mask); + cpumask_and(dest_mask, mask, cpu_online_mask); + cpumask_clear_cpu(this_cpu, dest_mask); + + /* Some callers race with other cpus changing the passed mask */ + if (unlikely(!cpumask_weight(dest_mask))) + return; + + idx =3D get_payload(); + set_payload(idx, pcid, start, pages); + + for_each_cpu(cpu, dest_mask) + set_action_entry(idx, cpu); + + /* Send a message to all CPUs in the map */ + arch_send_rar_ipi_mask(dest_mask); + + for_each_cpu(cpu, dest_mask) + wait_for_done(idx, cpu); + + free_payload(idx); +} +EXPORT_SYMBOL(smp_call_rar_many); --=20 2.49.0 From nobody Mon Dec 15 21:27:40 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 90AA1254871 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703059; cv=none; 
b=H4qsX8yDOnbjQsUaO6noLbimIU14UV62gwI2jTtcSNNHx8JrLFgsS6yPC4UrS07rOt0Jk7xmo9T51mhESYEcq8MqiWZ7g2ETdEgy5dNhJQ3PWvyG9tbPLGUTBKNEBrNSqlr+cD6nVizu1x29lyjcCZw6EqaTmhGmv+1HQIsUAC0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703059; c=relaxed/simple; bh=+4lCb+L3hsek/DK0bB2KYv8u8TmmG0TuORKcg/bHmiI=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=D9wFeq2LR6YwtB6FwpZ8V4v+Hy2cxGG0pb8ypb6Prjz0NY0pvAkZmLsRwaFffmzUj5+PX672KKn3sz1z4wCMpzEf8X9lVRM14BvPKmWUStaI//IyDzmUHGHfLqKqUTPSIqVkMGr6F+82u78yGufeXsAVef5fScGyFFzaVAGwP3o= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-0U7Z; Mon, 19 May 2025 21:03:55 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Rik van Riel Subject: [RFC v2 8/9] x86/mm: use RAR for kernel TLB flushes Date: Mon, 19 May 2025 21:02:33 -0400 Message-ID: <20250520010350.1740223-9-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250520010350.1740223-1-riel@surriel.com> References: <20250520010350.1740223-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com 
Content-Type: text/plain; charset="utf-8" From: Rik van Riel Use Intel RAR for kernel TLB flushes, when enabled. Pass in PCID 0 to smp_call_rar_many() to flush the specified addresses, regardless of which PCID they might be cached under in any destination CPU. Signed-off-by: Rik van Riel --- arch/x86/mm/rar.c | 4 ++-- arch/x86/mm/tlb.c | 38 ++++++++++++++++++++++++++++++++++++++ 2 files changed, 40 insertions(+), 2 deletions(-) diff --git a/arch/x86/mm/rar.c b/arch/x86/mm/rar.c index 16dc9b889cbd..9a18c926ea7b 100644 --- a/arch/x86/mm/rar.c +++ b/arch/x86/mm/rar.c @@ -142,8 +142,8 @@ void smp_call_rar_many(const struct cpumask *mask, u16 = pcid, * send smp call function interrupt to this cpu and as such deadlocks * can't happen. */ - WARN_ON_ONCE(cpu_online(this_cpu) && irqs_disabled() - && !oops_in_progress && !early_boot_irqs_disabled); + if (cpu_online(this_cpu) && !oops_in_progress && !early_boot_irqs_disable= d) + lockdep_assert_irqs_enabled(); =20 /* Try to fastpath. So, what's a CPU they want? Ignoring this one. */ cpu =3D cpumask_first_and(mask, cpu_online_mask); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index f5761e8be77f..35489df811dc 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -21,6 +21,7 @@ #include #include #include +#include #include =20 #include "mm_internal.h" @@ -1446,6 +1447,18 @@ static void do_flush_tlb_all(void *info) __flush_tlb_all(); } =20 +static void rar_full_flush(const cpumask_t *cpumask) +{ + guard(preempt)(); + smp_call_rar_many(cpumask, 0, 0, TLB_FLUSH_ALL); + invpcid_flush_all(); +} + +static void rar_flush_all(void) +{ + rar_full_flush(cpu_online_mask); +} + void flush_tlb_all(void) { count_vm_tlb_event(NR_TLB_REMOTE_FLUSH); @@ -1453,6 +1466,8 @@ void flush_tlb_all(void) /* First try (faster) hardware-assisted TLB invalidation. 
*/ if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) invlpgb_flush_all(); + else if (cpu_feature_enabled(X86_FEATURE_RAR)) + rar_flush_all(); else /* Fall back to the IPI-based invalidation. */ on_each_cpu(do_flush_tlb_all, NULL, 1); @@ -1482,15 +1497,36 @@ static void do_kernel_range_flush(void *info) struct flush_tlb_info *f =3D info; unsigned long addr; =20 + /* + * With PTI kernel TLB entries in all PCIDs need to be flushed. + * With RAR the PCID space becomes so large, we might as well flush it al= l. + * + * Either of the two by itself works with targeted flushes. + */ + if (cpu_feature_enabled(X86_FEATURE_RAR) && + cpu_feature_enabled(X86_FEATURE_PTI)) { + invpcid_flush_all(); + return; + } + /* flush range by one by one 'invlpg' */ for (addr =3D f->start; addr < f->end; addr +=3D PAGE_SIZE) flush_tlb_one_kernel(addr); } =20 +static void rar_kernel_range_flush(struct flush_tlb_info *info) +{ + guard(preempt)(); + smp_call_rar_many(cpu_online_mask, 0, info->start, info->end); + do_kernel_range_flush(info); +} + static void kernel_tlb_flush_all(struct flush_tlb_info *info) { if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) invlpgb_flush_all(); + else if (cpu_feature_enabled(X86_FEATURE_RAR)) + rar_flush_all(); else on_each_cpu(do_flush_tlb_all, NULL, 1); } @@ -1499,6 +1535,8 @@ static void kernel_tlb_flush_range(struct flush_tlb_i= nfo *info) { if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) invlpgb_kernel_range_flush(info); + else if (cpu_feature_enabled(X86_FEATURE_RAR)) + rar_kernel_range_flush(info); else on_each_cpu(do_kernel_range_flush, info, 1); } --=20 2.49.0 From nobody Mon Dec 15 21:27:40 2025 Received: from shelob.surriel.com (shelob.surriel.com [96.67.55.147]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8DFF9137930 for ; Tue, 20 May 2025 01:04:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=96.67.55.147 
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; cv=none; b=u3CSqv3bnYeekETRMyDm4ALDqptLyOERI8e9NVt+pDP/LCHzo0FYIlwdKlbApKzybXyc8mgqbYDuVzcB51l+3fLb899qaA0Muksn9g7bHnZ7oCh2eD9et3kIpwiCADIOzAgOXyx7QT3kYVqKOPHGQ8H0GbhVvghAvQ7C2ttjnuM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1747703058; c=relaxed/simple; bh=qCtsDb32dfG2YNexRODUsmUl/RujP6ZxdhcdU6uJrHA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=ZSexWZXrS8JW0FTuwK7md+kS51dNmbmLDMWlvnEzktLGd6UX7vCTbfNGdb/1hz0tTdEtDTlf6AdlDvF4ADrR5Ezwl7W/hxEHKIABa0UldQGNPa+jJKZ+BsrX1gsYtCurZEh3Buec3C4RQRHRZ02Agf9ccL5KfTMRGXkri1ke6NI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com; spf=pass smtp.mailfrom=shelob.surriel.com; arc=none smtp.client-ip=96.67.55.147 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=surriel.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=shelob.surriel.com Received: from fangorn.home.surriel.com ([10.0.13.7]) by shelob.surriel.com with esmtpsa (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.97.1) (envelope-from ) id 1uHBOd-000000000aB-0adx; Mon, 19 May 2025 21:03:55 -0400 From: Rik van Riel To: linux-kernel@vger.kernel.org Cc: linux-mm@kvack.org, x86@kernel.org, kernel-team@meta.com, dave.hansen@linux.intel.com, luto@kernel.org, peterz@infradead.org, tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, nadav.amit@gmail.com, Rik van Riel , Rik van Riel Subject: [RFC v2 9/9] x86/mm: userspace & pageout flushing using Intel RAR Date: Mon, 19 May 2025 21:02:34 -0400 Message-ID: <20250520010350.1740223-10-riel@surriel.com> X-Mailer: git-send-email 2.49.0 In-Reply-To: <20250520010350.1740223-1-riel@surriel.com> References: <20250520010350.1740223-1-riel@surriel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: 
List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Sender: riel@surriel.com Content-Type: text/plain; charset="utf-8" From: Rik van Riel Use Intel RAR to flush userspace mappings. Because RAR flushes are targeted using a cpu bitmap, the rules are a little bit different than for true broadcast TLB invalidation. For true broadcast TLB invalidation, like done with AMD INVLPGB, a global ASID always has up to date TLB entries on every CPU. The context switch code never has to flush the TLB when switching to a global ASID on any CPU with INVLPGB. For RAR, the TLB mappings for a global ASID are kept up to date only on CPUs within the mm_cpumask, which lazily follows the threads around the system. The context switch code does not need to flush the TLB if the CPU is in the mm_cpumask, and the PCID used stays the same. However, a CPU that falls outside of the mm_cpumask can have out of date TLB mappings for this task. When switching to that task on a CPU not in the mm_cpumask, the TLB does need to be flushed. 
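As a rough illustration of the rule described above, the context-switch decision for a global ASID could be modeled in user space as follows. This is only a sketch, not kernel code: the enum and the function name are hypothetical, and it ignores the separate path for an mm that is still transitioning to a global ASID.

```c
#include <stdbool.h>

/* Hypothetical model of the two broadcast flush mechanisms. */
enum bcast_kind {
	BCAST_INVLPGB,	/* true broadcast: every CPU stays up to date */
	BCAST_RAR,	/* targeted: only CPUs in mm_cpumask stay up to date */
};

/*
 * Returns true if the TLB must be flushed when switching to an mm
 * that uses a global ASID on this CPU.
 */
static bool global_asid_needs_flush(enum bcast_kind kind, bool cpu_in_mm_cpumask)
{
	if (kind == BCAST_INVLPGB)
		return false;

	/* RAR: a CPU outside the mm_cpumask may have missed flushes. */
	return !cpu_in_mm_cpumask;
}
```

With RAR, only a CPU that was outside the mm_cpumask pays for a flush; a CPU that was already in the mask keeps its TLB contents across the switch.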
Signed-off-by: Rik van Riel --- arch/x86/include/asm/tlbflush.h | 9 ++- arch/x86/mm/tlb.c | 121 ++++++++++++++++++++++++++------ 2 files changed, 104 insertions(+), 26 deletions(-) diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflus= h.h index cc9935bbbd45..bdde3ce6c9b1 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -276,7 +276,8 @@ static inline u16 mm_global_asid(struct mm_struct *mm) { u16 asid; =20 - if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) && + !cpu_feature_enabled(X86_FEATURE_RAR)) return 0; =20 asid =3D smp_load_acquire(&mm->context.global_asid); @@ -289,7 +290,8 @@ static inline u16 mm_global_asid(struct mm_struct *mm) =20 static inline void mm_init_global_asid(struct mm_struct *mm) { - if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { + if (cpu_feature_enabled(X86_FEATURE_INVLPGB) || + cpu_feature_enabled(X86_FEATURE_RAR)) { mm->context.global_asid =3D 0; mm->context.asid_transition =3D false; } @@ -313,7 +315,8 @@ static inline void mm_clear_asid_transition(struct mm_s= truct *mm) =20 static inline bool mm_in_asid_transition(struct mm_struct *mm) { - if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) && + !cpu_feature_enabled(X86_FEATURE_RAR)) return false; =20 return mm && READ_ONCE(mm->context.asid_transition); diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index 35489df811dc..51658bdaa0b3 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -203,7 +203,8 @@ struct new_asid { unsigned int need_flush : 1; }; =20 -static struct new_asid choose_new_asid(struct mm_struct *next, u64 next_tl= b_gen) +static struct new_asid choose_new_asid(struct mm_struct *next, u64 next_tl= b_gen, + bool new_cpu) { struct new_asid ns; u16 asid; @@ -216,14 +217,22 @@ static struct new_asid choose_new_asid(struct mm_stru= ct *next, u64 next_tlb_gen) =20 /* * TLB consistency for global ASIDs is maintained with 
hardware assisted - * remote TLB flushing. Global ASIDs are always up to date. + * remote TLB flushing. Global ASIDs are always up to date with INVLPGB, + * and up to date for CPUs in the mm_cpumask with RAR. */ - if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) { + if (cpu_feature_enabled(X86_FEATURE_INVLPGB) || + cpu_feature_enabled(X86_FEATURE_RAR)) { u16 global_asid =3D mm_global_asid(next); =20 if (global_asid) { ns.asid =3D global_asid; ns.need_flush =3D 0; + /* + * If the CPU fell out of the cpumask, it can be + * out of date with RAR, and should be flushed. + */ + if (cpu_feature_enabled(X86_FEATURE_RAR)) + ns.need_flush =3D new_cpu; return ns; } } @@ -281,7 +290,14 @@ static void reset_global_asid_space(void) { lockdep_assert_held(&global_asid_lock); =20 - invlpgb_flush_all_nonglobals(); + /* + * The global flush ensures that a freshly allocated global ASID + * has no entries in any TLB, and can be used immediately. + * With Intel RAR, the TLB may still need to be flushed at context + * switch time when dealing with a CPU that was not in the mm_cpumask + * for the process, and may have missed flushes along the way. + */ + flush_tlb_all(); =20 /* * The TLB flush above makes it safe to re-use the previously @@ -358,7 +374,7 @@ static void use_global_asid(struct mm_struct *mm) { u16 asid; =20 - guard(raw_spinlock_irqsave)(&global_asid_lock); + guard(raw_spinlock)(&global_asid_lock); =20 /* This process is already using broadcast TLB invalidation. */ if (mm_global_asid(mm)) @@ -384,13 +400,14 @@ static void use_global_asid(struct mm_struct *mm) =20 void mm_free_global_asid(struct mm_struct *mm) { - if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) && + !cpu_feature_enabled(X86_FEATURE_RAR)) return; =20 if (!mm_global_asid(mm)) return; =20 - guard(raw_spinlock_irqsave)(&global_asid_lock); + guard(raw_spinlock)(&global_asid_lock); =20 /* The global ASID can be re-used only after flush at wrap-around.
*/ #ifdef CONFIG_BROADCAST_TLB_FLUSH @@ -408,7 +425,8 @@ static bool mm_needs_global_asid(struct mm_struct *mm, = u16 asid) { u16 global_asid =3D mm_global_asid(mm); =20 - if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) && + !cpu_feature_enabled(X86_FEATURE_RAR)) return false; =20 /* Process is transitioning to a global ASID */ @@ -426,13 +444,17 @@ static bool mm_needs_global_asid(struct mm_struct *mm= , u16 asid) */ static void consider_global_asid(struct mm_struct *mm) { - if (!cpu_feature_enabled(X86_FEATURE_INVLPGB)) + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) && + !cpu_feature_enabled(X86_FEATURE_RAR)) return; =20 /* Check every once in a while. */ if ((current->pid & 0x1f) !=3D (jiffies & 0x1f)) return; =20 + if (mm =3D=3D &init_mm) + return; + /* * Assign a global ASID if the process is active on * 4 or more CPUs simultaneously. @@ -480,7 +502,7 @@ static void finish_asid_transition(struct flush_tlb_inf= o *info) mm_clear_asid_transition(mm); } =20 -static void broadcast_tlb_flush(struct flush_tlb_info *info) +static void invlpgb_tlb_flush(struct flush_tlb_info *info) { bool pmd =3D info->stride_shift =3D=3D PMD_SHIFT; unsigned long asid =3D mm_global_asid(info->mm); @@ -511,8 +533,6 @@ static void broadcast_tlb_flush(struct flush_tlb_info *= info) addr +=3D nr << info->stride_shift; } while (addr < info->end); =20 - finish_asid_transition(info); - /* Wait for the INVLPGBs kicked off above to finish. 
*/ __tlbsync(); } @@ -840,7 +860,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struc= t mm_struct *next, /* Check if the current mm is transitioning to a global ASID */ if (mm_needs_global_asid(next, prev_asid)) { next_tlb_gen =3D atomic64_read(&next->context.tlb_gen); - ns =3D choose_new_asid(next, next_tlb_gen); + ns =3D choose_new_asid(next, next_tlb_gen, true); goto reload_tlb; } =20 @@ -878,6 +898,7 @@ void switch_mm_irqs_off(struct mm_struct *unused, struc= t mm_struct *next, ns.asid =3D prev_asid; ns.need_flush =3D true; } else { + bool new_cpu =3D false; /* * Apply process to process speculation vulnerability * mitigations if applicable. @@ -892,20 +913,25 @@ void switch_mm_irqs_off(struct mm_struct *unused, str= uct mm_struct *next, this_cpu_write(cpu_tlbstate.loaded_mm, LOADED_MM_SWITCHING); barrier(); =20 - /* Start receiving IPIs and then read tlb_gen (and LAM below) */ - if (next !=3D &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next))) + /* Start receiving IPIs and RAR invalidations */ + if (next !=3D &init_mm && !cpumask_test_cpu(cpu, mm_cpumask(next))) { cpumask_set_cpu(cpu, mm_cpumask(next)); + if (cpu_feature_enabled(X86_FEATURE_RAR)) + new_cpu =3D true; + } + next_tlb_gen =3D atomic64_read(&next->context.tlb_gen); =20 - ns =3D choose_new_asid(next, next_tlb_gen); + ns =3D choose_new_asid(next, next_tlb_gen, new_cpu); } =20 reload_tlb: new_lam =3D mm_lam_cr3_mask(next); if (ns.need_flush) { - VM_WARN_ON_ONCE(is_global_asid(ns.asid)); - this_cpu_write(cpu_tlbstate.ctxs[ns.asid].ctx_id, next->context.ctx_id); - this_cpu_write(cpu_tlbstate.ctxs[ns.asid].tlb_gen, next_tlb_gen); + if (is_dyn_asid(ns.asid)) { + this_cpu_write(cpu_tlbstate.ctxs[ns.asid].ctx_id, next->context.ctx_id); + this_cpu_write(cpu_tlbstate.ctxs[ns.asid].tlb_gen, next_tlb_gen); + } load_new_mm_cr3(next->pgd, ns.asid, new_lam, true); =20 trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL); @@ -1122,8 +1148,13 @@ static void flush_tlb_func(void *info) loaded_mm_asid 
=3D this_cpu_read(cpu_tlbstate.loaded_mm_asid); } =20 - /* Broadcast ASIDs are always kept up to date with INVLPGB. */ - if (is_global_asid(loaded_mm_asid)) + /* + * Broadcast ASIDs are always kept up to date with INVLPGB; with + * Intel RAR IPI based flushes are used periodically to trim the + * mm_cpumask, and flushes that get here should be processed. + */ + if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && + is_global_asid(loaded_mm_asid)) return; =20 VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[loaded_mm_asid].ctx_id) !=3D @@ -1358,6 +1389,35 @@ static DEFINE_PER_CPU_SHARED_ALIGNED(struct flush_tl= b_info, flush_tlb_info); static DEFINE_PER_CPU(unsigned int, flush_tlb_info_idx); #endif =20 +static void rar_tlb_flush(struct flush_tlb_info *info) +{ + unsigned long asid =3D mm_global_asid(info->mm); + u16 pcid =3D kern_pcid(asid); + + /* Flush the remote CPUs. */ + smp_call_rar_many(mm_cpumask(info->mm), pcid, info->start, info->end); + if (cpu_feature_enabled(X86_FEATURE_PTI)) + smp_call_rar_many(mm_cpumask(info->mm), user_pcid(asid), info->start, in= fo->end); + + /* Flush the local TLB, if needed. */ + if (cpumask_test_cpu(smp_processor_id(), mm_cpumask(info->mm))) { + lockdep_assert_irqs_enabled(); + local_irq_disable(); + flush_tlb_func(info); + local_irq_enable(); + } +} + +static void broadcast_tlb_flush(struct flush_tlb_info *info) +{ + if (cpu_feature_enabled(X86_FEATURE_INVLPGB)) + invlpgb_tlb_flush(info); + else /* Intel RAR */ + rar_tlb_flush(info); + + finish_asid_transition(info); +} + static struct flush_tlb_info *get_flush_tlb_info(struct mm_struct *mm, unsigned long start, unsigned long end, unsigned int stride_shift, bool freed_tables, @@ -1418,15 +1478,22 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsig= ned long start, info =3D get_flush_tlb_info(mm, start, end, stride_shift, freed_tables, new_tlb_gen); =20 + /* + * IPIs and RAR can be targeted to a cpumask. 
Periodically trim that + * mm_cpumask by sending TLB flush IPIs, even when most TLB flushes + * are done with RAR. + */ + if (!cpu_feature_enabled(X86_FEATURE_INVLPGB) || !mm_global_asid(mm)) + info->trim_cpumask =3D should_trim_cpumask(mm); + /* * flush_tlb_multi() is not optimized for the common case in which only * a local TLB flush is needed. Optimize this use-case by calling * flush_tlb_func_local() directly in this case. */ - if (mm_global_asid(mm)) { + if (mm_global_asid(mm) && !info->trim_cpumask) { broadcast_tlb_flush(info); } else if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids) { - info->trim_cpumask =3D should_trim_cpumask(mm); flush_tlb_multi(mm_cpumask(mm), info); consider_global_asid(mm); } else if (mm =3D=3D this_cpu_read(cpu_tlbstate.loaded_mm)) { @@ -1737,6 +1804,14 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_= batch *batch) if (cpu_feature_enabled(X86_FEATURE_INVLPGB) && batch->unmapped_pages) { invlpgb_flush_all_nonglobals(); batch->unmapped_pages =3D false; + } else if (cpu_feature_enabled(X86_FEATURE_RAR) && cpumask_any(&batch->cp= umask) < nr_cpu_ids) { + rar_full_flush(&batch->cpumask); + if (cpumask_test_cpu(cpu, &batch->cpumask)) { + lockdep_assert_irqs_enabled(); + local_irq_disable(); + invpcid_flush_all_nonglobals(); + local_irq_enable(); + } } else if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids) { flush_tlb_multi(&batch->cpumask, info); } else if (cpumask_test_cpu(cpu, &batch->cpumask)) { --=20 2.49.0
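
For readers following the flush_tlb_mm_range() change in patch 9/9, the path selection after that patch can be sketched as a small user-space model. This is an illustration under stated assumptions, not the kernel implementation: the enum, the struct, and pick_flush_path() are hypothetical names, and the model assumes trim_cpumask starts out false when should_trim_cpumask() is not consulted.

```c
#include <stdbool.h>

/* Hypothetical labels for the branches taken in flush_tlb_mm_range(). */
enum flush_path {
	FLUSH_BROADCAST,	/* broadcast_tlb_flush(): INVLPGB or RAR */
	FLUSH_IPI_MULTI,	/* flush_tlb_multi(): IPIs, can trim mm_cpumask */
	FLUSH_LOCAL_ONLY,	/* only this CPU has the mm loaded */
	FLUSH_NOTHING,		/* no CPU needs to be flushed */
};

/* Hypothetical inputs standing in for the kernel-side checks. */
struct flush_inputs {
	bool has_invlpgb;	 /* X86_FEATURE_INVLPGB enabled */
	bool has_global_asid;	 /* mm_global_asid(mm) != 0 */
	bool should_trim;	 /* result of should_trim_cpumask(mm) */
	bool other_cpus_in_mask; /* mm_cpumask holds a CPU other than this one */
	bool mm_loaded_here;	 /* mm is the loaded_mm on this CPU */
};

static enum flush_path pick_flush_path(const struct flush_inputs *in)
{
	/* Assumption: trim_cpumask defaults to false. */
	bool trim = false;

	/*
	 * Trimming is considered unless the mm has a global ASID that is
	 * kept up to date by INVLPGB; RAR still relies on mm_cpumask.
	 */
	if (!in->has_invlpgb || !in->has_global_asid)
		trim = in->should_trim;

	if (in->has_global_asid && !trim)
		return FLUSH_BROADCAST;
	if (in->other_cpus_in_mask)
		return FLUSH_IPI_MULTI;
	if (in->mm_loaded_here)
		return FLUSH_LOCAL_ONLY;
	return FLUSH_NOTHING;
}
```

The point of the model is the RAR case with a pending trim: the flush falls back to IPIs so that CPUs no longer running the mm can drop out of the mm_cpumask.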