From nobody Tue Dec 16 18:36:05 2025
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
 aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
 by smtp.lore.kernel.org (Postfix) with ESMTP id B25CCC4332F
 for ; Tue, 12 Dec 2023 16:52:45 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1376427AbjLLQwh (ORCPT ); Tue, 12 Dec 2023 11:52:37 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:51404 "EHLO
 lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S233300AbjLLQwX (ORCPT ); Tue, 12 Dec 2023 11:52:23 -0500
Received: from galois.linutronix.de (Galois.linutronix.de
 [IPv6:2a0a:51c0:0:12e:550::1]) by lindbergh.monkeyblade.net (Postfix)
 with ESMTPS id C785DEB; Tue, 12 Dec 2023 08:52:28 -0800 (PST)
Date: Tue, 12 Dec 2023 16:52:26 -0000
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de;
 s=2020; t=1702399947;
 h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
 content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=VeliMcokZTPe7XPx2kdMXQ6xl20+8WRgVhzKKOi+1i8=;
 b=3aBdJGkNBSgtdzclItf3ymRdRKRtToalqCNolj20TaHixItew6pN70wfCqTuwceYJWDvzD
 VTmyBZ/jfYKhCoZjBay1CwULeaLgJ1egjVTLnJsmQHVwWjw9UWjelUz2Vr6M7VqM1TqXxe
 5rG/gmWyZoDhPaPPD/cv82jKqvIeocqJ6vKLwsmeuAk5EPXdV6kb8SwrrXC9jv9CSqhaRM
 Y7t8iacmCVCjN1sD8vODU28vS7yOCexWY4YZDb0lYTV7eGhsHy3nEKwjtHqkMUk5fl1x2S
 Ztxu6lwDjAPpygYd7V4KYHoaZsyX8O/ZozEkuTMQR6UAlY0Hr8kZuhDa4jre/g==
DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de;
 s=2020e; t=1702399947;
 h=from:from:sender:sender:reply-to:reply-to:subject:subject:date:date:
 message-id:message-id:to:to:cc:cc:mime-version:mime-version:
 content-type:content-type:
 content-transfer-encoding:content-transfer-encoding;
 bh=VeliMcokZTPe7XPx2kdMXQ6xl20+8WRgVhzKKOi+1i8=;
 b=0GP2NGMIcpnxoDC48AmkRP80nz9Dh2mxAFh1RS8b8U2w6ZxExmVQZ9rS10SNEGdcuLLBWi
 O9YTzlq2YpOSFEDw==
From: "tip-bot2 for Kai Huang"
Sender: tip-bot2@linutronix.de
Reply-to: linux-kernel@vger.kernel.org
To: linux-tip-commits@vger.kernel.org
Subject: [tip: x86/tdx] x86/cpu: Detect TDX partial write machine check erratum
Cc: Kai Huang, Dave Hansen, "Kirill A. Shutemov", David Hildenbrand,
 x86@kernel.org, linux-kernel@vger.kernel.org
MIME-Version: 1.0
Message-ID: <170239994675.398.16463657664761707141.tip-bot2@tip-bot2>
Robot-ID:
Robot-Unsubscribe: Contact to get blacklisted from these emails
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

The following commit has been merged into the x86/tdx branch of tip:

Commit-ID:     1e536e10689700e006989dea33918cce348e04b6
Gitweb:        https://git.kernel.org/tip/1e536e10689700e006989dea33918cce348e04b6
Author:        Kai Huang
AuthorDate:    Fri, 08 Dec 2023 09:07:37 -08:00
Committer:     Dave Hansen
CommitterDate: Tue, 12 Dec 2023 08:46:40 -08:00

x86/cpu: Detect TDX partial write machine check erratum

TDX memory has integrity and confidentiality protections.  Violations of
this integrity protection are supposed to only affect TDX operations and
are never supposed to affect the host kernel itself.  In other words,
the host kernel should never, itself, see machine checks induced by the
TDX integrity hardware.

Alas, the first few generations of TDX hardware have an erratum.  A
partial write to a TDX private memory cacheline will silently "poison"
the line.  Subsequent reads will consume the poison and generate a
machine check.  According to the TDX hardware spec, neither of these
things should have happened.

Virtually all kernel memory access operations happen in full
cachelines.  In practice, writing a "byte" of memory usually reads a 64
byte cacheline of memory, modifies it, then writes the whole line back.
Those operations do not trigger this problem.

This problem is triggered by "partial" writes where a write transaction
of less than a cacheline lands at the memory controller.  The CPU does
these via non-temporal write instructions (like MOVNTI), or through
UC/WC memory mappings.  The issue can also be triggered away from the
CPU by devices doing partial writes via DMA.

With this erratum, there are additional things that need to be done.  To
prepare for those changes, add a CPU bug bit to indicate this erratum.
Note that this bug reflects the hardware, so it is detected regardless
of whether the kernel is built with TDX support or not.

Signed-off-by: Kai Huang
Signed-off-by: Dave Hansen
Reviewed-by: Kirill A. Shutemov
Reviewed-by: David Hildenbrand
Reviewed-by: Dave Hansen
Link: https://lore.kernel.org/all/20231208170740.53979-17-dave.hansen%40intel.com
---
 arch/x86/include/asm/cpufeatures.h |  1 +
 arch/x86/virt/vmx/tdx/tdx.c        | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 45ddc6b..dbec4a0 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -496,6 +496,7 @@
 #define X86_BUG_EIBRS_PBRSB		X86_BUG(28) /* EIBRS is vulnerable to Post Barrier RSB Predictions */
 #define X86_BUG_SMT_RSB			X86_BUG(29) /* CPU is vulnerable to Cross-Thread Return Address Predictions */
 #define X86_BUG_GDS			X86_BUG(30) /* CPU is affected by Gather Data Sampling */
+#define X86_BUG_TDX_PW_MCE		X86_BUG(31) /* CPU may incur #MC if non-TD software does partial write to TDX private memory */
 
 /* BUG word 2 */
 #define X86_BUG_SRSO			X86_BUG(1*32 + 0) /* AMD SRSO bug */
diff --git a/arch/x86/virt/vmx/tdx/tdx.c b/arch/x86/virt/vmx/tdx/tdx.c
index 6d030f6..06b04b9 100644
--- a/arch/x86/virt/vmx/tdx/tdx.c
+++ b/arch/x86/virt/vmx/tdx/tdx.c
@@ -33,6 +33,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "tdx.h"
 
 static u32 tdx_global_keyid __ro_after_init;
@@ -1308,6 +1310,21 @@ static struct notifier_block tdx_memory_nb = {
 	.notifier_call = tdx_memory_notifier,
 };
 
+static void __init check_tdx_erratum(void)
+{
+	/*
+	 * These CPUs have an erratum.  A partial write from non-TD
+	 * software (e.g. via MOVNTI variants or UC/WC mapping) to TDX
+	 * private memory poisons that memory, and a subsequent read of
+	 * that memory triggers #MC.
+	 */
+	switch (boot_cpu_data.x86_model) {
+	case INTEL_FAM6_SAPPHIRERAPIDS_X:
+	case INTEL_FAM6_EMERALDRAPIDS_X:
+		setup_force_cpu_bug(X86_BUG_TDX_PW_MCE);
+	}
+}
+
 void __init tdx_init(void)
 {
 	u32 tdx_keyid_start, nr_tdx_keyids;
@@ -1361,4 +1378,6 @@ void __init tdx_init(void)
 	tdx_nr_guest_keyids = nr_tdx_keyids - 1;
 
 	setup_force_cpu_cap(X86_FEATURE_TDX_HOST_PLATFORM);
+
+	check_tdx_erratum();
 }
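
For readers who want to experiment outside the kernel tree, the detection logic in check_tdx_erratum() can be approximated in plain userspace C. This is only a sketch under stated assumptions, not the kernel implementation: the two model numbers are assumed to match the kernel's INTEL_FAM6_SAPPHIRERAPIDS_X and INTEL_FAM6_EMERALDRAPIDS_X definitions, and the names bug_tdx_pw_mce, check_tdx_erratum_sketch() and must_avoid_partial_writes() are hypothetical stand-ins for the X86_BUG_TDX_PW_MCE bit and its possible consumers.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Illustration only.  These model numbers are ASSUMED to match the
 * kernel's INTEL_FAM6_SAPPHIRERAPIDS_X and INTEL_FAM6_EMERALDRAPIDS_X.
 */
#define MODEL_SAPPHIRERAPIDS_X	0x8F
#define MODEL_EMERALDRAPIDS_X	0xCF

/* Hypothetical stand-in for the X86_BUG_TDX_PW_MCE CPU bug bit. */
static bool bug_tdx_pw_mce;

/*
 * Mirrors the shape of check_tdx_erratum(): set the bug flag when the
 * CPU model is one of the affected ones, otherwise leave it untouched.
 */
static void check_tdx_erratum_sketch(uint8_t model)
{
	switch (model) {
	case MODEL_SAPPHIRERAPIDS_X:
	case MODEL_EMERALDRAPIDS_X:
		bug_tdx_pw_mce = true;
		break;
	default:
		break;
	}
}

/*
 * Hypothetical consumer: later code could branch on the flag, for
 * example to handle memory that may have seen a partial write.
 */
static bool must_avoid_partial_writes(void)
{
	return bug_tdx_pw_mce;
}
```

Calling check_tdx_erratum_sketch() once with the detected model and then querying must_avoid_partial_writes() follows the pattern of the patch: detection happens in one place at init time, and everything else only tests the bug bit.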