From nobody Mon Apr  6 09:15:14 2026
From: Paolo Bonzini
To: linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: Jon Kohler, Marcelo Tosatti, Nikunj A Dadhania, Amit Shah, Sean Christopherson
Subject: [PATCH 10/22] KVM: x86/mmu: split XS/XU bits for MBEC
Date: Sat, 21 Mar 2026 01:09:19 +0100
Message-ID: <20260321000931.1947084-11-pbonzini@redhat.com>
In-Reply-To: <20260321000931.1947084-1-pbonzini@redhat.com>
References: <20260321000931.1947084-1-pbonzini@redhat.com>

When EPT is in use, replace ACC_USER_MASK with ACC_USER_EXEC_MASK, so
that supervisor and user-mode execution can be controlled independently
(ACC_USER_MASK would not allow a setting similar to XU=0 XS=1 W=1 R=1).

Replace shadow_x_mask with shadow_xs_mask/shadow_xu_mask, to allow
setting XS and XU bits separately in EPT entries.

Signed-off-by: Paolo Bonzini
---
 arch/x86/include/asm/vmx.h     |  1 +
 arch/x86/kvm/mmu/mmu.c         | 15 ++++++++---
 arch/x86/kvm/mmu/mmutrace.h    |  6 ++---
 arch/x86/kvm/mmu/paging_tmpl.h |  4 +++
 arch/x86/kvm/mmu/spte.c        | 47 ++++++++++++++++++++++------------
 arch/x86/kvm/mmu/spte.h        |  8 +++---
 6 files changed, 55 insertions(+), 26 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 4a0804cc7c82..0041f8a77447 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -538,6 +538,7 @@ enum vmcs_field {
 #define VMX_EPT_IPAT_BIT			(1ull << 6)
 #define VMX_EPT_ACCESS_BIT			(1ull << 8)
 #define VMX_EPT_DIRTY_BIT			(1ull << 9)
+#define VMX_EPT_USER_EXECUTABLE_MASK		(1ull << 10)
 #define VMX_EPT_SUPPRESS_VE_BIT			(1ull << 63)
 #define VMX_EPT_RWX_MASK			(VMX_EPT_READABLE_MASK |	\
						 VMX_EPT_WRITABLE_MASK |	\
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index b7366e416baa..254d69c4b9f3 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -5371,7 +5371,7 @@ static void reset_shadow_zero_bits_mask(struct kvm_vcpu *vcpu,
 static inline bool boot_cpu_is_amd(void)
 {
 	WARN_ON_ONCE(!tdp_enabled);
-	return shadow_x_mask == 0;
+	return shadow_xs_mask == 0;
 }
 
 /*
@@ -5450,7 +5450,6 @@ static void update_permission_bitmask(struct kvm_mmu *mmu, bool ept)
 {
 	unsigned byte;
 
-	const u16 x = ACC_BITS_MASK(ACC_EXEC_MASK);
 	const u16 w = ACC_BITS_MASK(ACC_WRITE_MASK);
 	const u16 r = ACC_BITS_MASK(ACC_READ_MASK);
 
@@ -5491,8 +5490,18 @@ static void update_permission_bitmask(struct kvm_mmu *mmu, bool ept)
 		u16 smapf = 0;
 
 		if (ept) {
-			ff = (pfec & PFERR_FETCH_MASK) ? (u16)~x : 0;
+			const u16 xs = ACC_BITS_MASK(ACC_EXEC_MASK);
+			const u16 xu = ACC_BITS_MASK(ACC_USER_EXEC_MASK);
+
+			if (pfec & PFERR_FETCH_MASK) {
+				/* Ignore XU unless MBEC is enabled. */
+				if (cr4_smep)
+					ff = pfec & PFERR_USER_MASK ? (u16)~xu : (u16)~xs;
+				else
+					ff = (u16)~xs;
+			}
 		} else {
+			const u16 x = ACC_BITS_MASK(ACC_EXEC_MASK);
 			const u16 u = ACC_BITS_MASK(ACC_USER_MASK);
 
 			/* Faults from kernel mode accesses to user pages */
diff --git a/arch/x86/kvm/mmu/mmutrace.h b/arch/x86/kvm/mmu/mmutrace.h
index 44545f6f860a..e22588d3e145 100644
--- a/arch/x86/kvm/mmu/mmutrace.h
+++ b/arch/x86/kvm/mmu/mmutrace.h
@@ -354,8 +354,8 @@ TRACE_EVENT(
 		__entry->sptep = virt_to_phys(sptep);
 		__entry->level = level;
 		__entry->r = shadow_present_mask || (__entry->spte & PT_PRESENT_MASK);
-		__entry->x = is_executable_pte(__entry->spte);
-		__entry->u = shadow_user_mask ? !!(__entry->spte & shadow_user_mask) : -1;
+		__entry->x = (__entry->spte & (shadow_xs_mask | shadow_nx_mask)) == shadow_xs_mask;
+		__entry->u = !!(__entry->spte & (shadow_xu_mask | shadow_user_mask));
 	),
 
 	TP_printk("gfn %llx spte %llx (%s%s%s%s) level %d at %llx",
@@ -363,7 +363,7 @@ TRACE_EVENT(
 		  __entry->r ? "r" : "-",
 		  __entry->spte & PT_WRITABLE_MASK ? "w" : "-",
 		  __entry->x ? "x" : "-",
-		  __entry->u == -1 ? "" : (__entry->u ? "u" : "-"),
+		  __entry->u ? "u" : "-",
 		  __entry->level, __entry->sptep
 	)
 );
diff --git a/arch/x86/kvm/mmu/paging_tmpl.h b/arch/x86/kvm/mmu/paging_tmpl.h
index bbdbf4ae2d65..c657ea90bb33 100644
--- a/arch/x86/kvm/mmu/paging_tmpl.h
+++ b/arch/x86/kvm/mmu/paging_tmpl.h
@@ -174,6 +174,10 @@ static inline unsigned FNAME(gpte_access)(u64 gpte)
 {
 	unsigned access;
 #if PTTYPE == PTTYPE_EPT
+	/*
+	 * For now nested MBEC is not supported and permission_fault() ignores
+	 * ACC_USER_EXEC_MASK.
+	 */
 	access = ((gpte & VMX_EPT_WRITABLE_MASK) ? ACC_WRITE_MASK : 0) |
 		((gpte & VMX_EPT_EXECUTABLE_MASK) ? ACC_EXEC_MASK : 0) |
 		((gpte & VMX_EPT_READABLE_MASK) ? ACC_READ_MASK : 0);
diff --git a/arch/x86/kvm/mmu/spte.c b/arch/x86/kvm/mmu/spte.c
index 0b09124b0d54..0b3e2b97afbf 100644
--- a/arch/x86/kvm/mmu/spte.c
+++ b/arch/x86/kvm/mmu/spte.c
@@ -29,8 +29,9 @@ bool __read_mostly kvm_ad_enabled;
 u64 __read_mostly shadow_host_writable_mask;
 u64 __read_mostly shadow_mmu_writable_mask;
 u64 __read_mostly shadow_nx_mask;
-u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
 u64 __read_mostly shadow_user_mask;
+u64 __read_mostly shadow_xs_mask; /* mutual exclusive with nx_mask and user_mask */
+u64 __read_mostly shadow_xu_mask; /* mutual exclusive with nx_mask and user_mask */
 u64 __read_mostly shadow_accessed_mask;
 u64 __read_mostly shadow_dirty_mask;
 u64 __read_mostly shadow_mmio_value;
@@ -216,22 +217,30 @@ bool make_spte(struct kvm_vcpu *vcpu, struct kvm_mmu_page *sp,
 	 * when CR0.PG is toggled, but leveraging that to ignore the mitigation
 	 * would tie make_spte() further to vCPU/MMU state, and add complexity
 	 * just to optimize a mode that is anything but performance critical.
+	 *
+	 * Use ACC_USER_EXEC_MASK here assuming only Intel processors (EPT)
+	 * are affected by the NX huge page erratum.
 	 */
-	if (level > PG_LEVEL_4K && (pte_access & ACC_EXEC_MASK) &&
+	if (level > PG_LEVEL_4K &&
+	    (pte_access & (ACC_EXEC_MASK | ACC_USER_EXEC_MASK)) &&
 	    is_nx_huge_page_enabled(vcpu->kvm)) {
-		pte_access &= ~ACC_EXEC_MASK;
+		pte_access &= ~(ACC_EXEC_MASK | ACC_USER_EXEC_MASK);
 	}
 
 	if (pte_access & ACC_READ_MASK)
 		spte |= PT_PRESENT_MASK; /* or VMX_EPT_READABLE_MASK */
 
-	if (pte_access & ACC_EXEC_MASK)
-		spte |= shadow_x_mask;
-	else
-		spte |= shadow_nx_mask;
-
-	if (pte_access & ACC_USER_MASK)
-		spte |= shadow_user_mask;
+	if (shadow_nx_mask) {
+		if (!(pte_access & ACC_EXEC_MASK))
+			spte |= shadow_nx_mask;
+		if (pte_access & ACC_USER_MASK)
+			spte |= shadow_user_mask;
+	} else {
+		if (pte_access & ACC_EXEC_MASK)
+			spte |= shadow_xs_mask;
+		if (pte_access & ACC_USER_EXEC_MASK)
+			spte |= shadow_xu_mask;
+	}
 
 	if (level > PG_LEVEL_4K)
 		spte |= PT_PAGE_SIZE_MASK;
@@ -317,11 +326,13 @@ static u64 modify_spte_protections(u64 spte, u64 set, u64 clear)
 static u64 make_spte_executable(u64 spte, u8 access)
 {
 	u64 set, clear;
-	if (access & ACC_EXEC_MASK)
-		set = shadow_x_mask;
+	if (shadow_nx_mask)
+		set = (access & ACC_EXEC_MASK) ? 0 : shadow_nx_mask;
 	else
-		set = shadow_nx_mask;
-	clear = set ^ (shadow_nx_mask | shadow_x_mask);
+		set =
+			(access & ACC_EXEC_MASK ? shadow_xs_mask : 0) |
+			(access & ACC_USER_EXEC_MASK ? shadow_xu_mask : 0);
+	clear = set ^ (shadow_nx_mask | shadow_xs_mask | shadow_xu_mask);
 	return modify_spte_protections(spte, set, clear);
 }
 
@@ -388,7 +399,7 @@ u64 make_nonleaf_spte(u64 *child_pt, bool ad_disabled)
 
 	spte |= __pa(child_pt) | shadow_present_mask | PT_WRITABLE_MASK |
 		PT_PRESENT_MASK /* or VMX_EPT_READABLE_MASK */ |
-		shadow_user_mask | shadow_x_mask | shadow_me_value;
+		shadow_user_mask | shadow_xs_mask | shadow_xu_mask | shadow_me_value;
 
 	if (ad_disabled)
 		spte |= SPTE_TDP_AD_DISABLED;
@@ -496,7 +507,8 @@ void kvm_mmu_set_ept_masks(bool has_ad_bits)
 	shadow_accessed_mask = VMX_EPT_ACCESS_BIT;
 	shadow_dirty_mask = VMX_EPT_DIRTY_BIT;
 	shadow_nx_mask = 0ull;
-	shadow_x_mask = VMX_EPT_EXECUTABLE_MASK;
+	shadow_xs_mask = VMX_EPT_EXECUTABLE_MASK;
+	shadow_xu_mask = VMX_EPT_EXECUTABLE_MASK;
 	shadow_present_mask = VMX_EPT_SUPPRESS_VE_BIT;
 
 	shadow_acc_track_mask = VMX_EPT_RWX_MASK;
@@ -547,7 +559,8 @@ void kvm_mmu_reset_all_pte_masks(void)
 	shadow_accessed_mask = PT_ACCESSED_MASK;
 	shadow_dirty_mask = PT_DIRTY_MASK;
 	shadow_nx_mask = PT64_NX_MASK;
-	shadow_x_mask = 0;
+	shadow_xs_mask = 0;
+	shadow_xu_mask = 0;
 	shadow_present_mask = PT_PRESENT_MASK;
 
 	shadow_acc_track_mask = 0;
diff --git a/arch/x86/kvm/mmu/spte.h b/arch/x86/kvm/mmu/spte.h
index 0c305f2f4ba0..7323ff19056b 100644
--- a/arch/x86/kvm/mmu/spte.h
+++ b/arch/x86/kvm/mmu/spte.h
@@ -54,7 +54,8 @@ static_assert(SPTE_TDP_AD_ENABLED == 0);
 
 #define ACC_READ_MASK		PT_PRESENT_MASK
 #define ACC_WRITE_MASK		PT_WRITABLE_MASK
-#define ACC_USER_MASK		PT_USER_MASK
+#define ACC_USER_MASK		PT_USER_MASK /* non EPT */
+#define ACC_USER_EXEC_MASK	ACC_USER_MASK /* EPT only */
 #define ACC_EXEC_MASK		8
 #define ACC_ALL			(ACC_EXEC_MASK | ACC_WRITE_MASK | ACC_USER_MASK | ACC_READ_MASK)
 
@@ -184,8 +185,9 @@ extern bool __read_mostly kvm_ad_enabled;
 extern u64 __read_mostly shadow_host_writable_mask;
 extern u64 __read_mostly shadow_mmu_writable_mask;
 extern u64 __read_mostly shadow_nx_mask;
-extern u64 __read_mostly shadow_x_mask; /* mutual exclusive with nx_mask */
 extern u64 __read_mostly shadow_user_mask;
+extern u64 __read_mostly shadow_xs_mask; /* mutual exclusive with nx_mask and user_mask */
+extern u64 __read_mostly shadow_xu_mask; /* mutual exclusive with nx_mask and user_mask */
 extern u64 __read_mostly shadow_accessed_mask;
 extern u64 __read_mostly shadow_dirty_mask;
 extern u64 __read_mostly shadow_mmio_value;
@@ -352,7 +354,7 @@ static inline bool is_last_spte(u64 pte, int level)
 
 static inline bool is_executable_pte(u64 spte)
 {
-	return (spte & (shadow_x_mask | shadow_nx_mask)) == shadow_x_mask;
+	return (spte & (shadow_xs_mask | shadow_xu_mask | shadow_nx_mask)) != shadow_nx_mask;
}
 
 static inline kvm_pfn_t spte_to_pfn(u64 pte)
-- 
2.52.0