From: James Morse <james.morse@arm.com>
To: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org
Cc: James Morse <james.morse@arm.com>, D Scott Phillips OS,
	carl@os.amperecomputing.com, lcherian@marvell.com,
	bobo.shaobowang@huawei.com, tan.shaopeng@fujitsu.com,
	baolin.wang@linux.alibaba.com, Jamie Iles, Xin Hao,
	peternewman@google.com, dfustini@baylibre.com,
	amitsinght@marvell.com, David Hildenbrand, Dave Martin, Koba Ko,
	Shanker Donthineni, fenghuay@nvidia.com, baisheng.gao@unisoc.com,
	Jonathan Cameron, Gavin Shan, Ben Horgan, rohit.mathew@arm.com,
	reinette.chatre@intel.com, Punit Agrawal
Subject: [RFC PATCH 06/38] KVM: arm64: Force guest EL1 to use user-space's partid configuration
Date: Fri, 5 Dec 2025 21:58:29 +0000
Message-Id: <20251205215901.17772-7-james.morse@arm.com>
X-Mailer: git-send-email 2.20.1
In-Reply-To: <20251205215901.17772-1-james.morse@arm.com>
References: <20251205215901.17772-1-james.morse@arm.com>

While we trap the guest's attempts to read/write the MPAM control
registers, the hardware continues to use them. Guest EL0 uses KVM
user-space's configuration, as that value is left in the register, while
guest EL1 uses either the host kernel's configuration or, on VHE systems,
the UNKNOWN reset value of MPAM1_EL1. On nVHE systems, EL2 continues to
use partid-0 for world-switch, even though the host may have configured
its kernel threads to use a different partid; partid-0 may have been
assigned to another task.

We want to force guest EL1 to use KVM user-space's MPAM configuration,
and EL2 to match the host's EL1 config.

On a nVHE system, copy the EL1 MPAM register to EL2. This ensures
world-switch uses the same partid as the kernel thread does on the host.
When loading the guest's EL1 registers, copy the VMM's EL0 partid into
the EL1 register. For VHE systems, we can skip restoring the EL1 register
for the host, as it is out-of-context once HCR_EL2.TGE is set.

This is done outside the usual sysreg save/restore, as the values can
change behind KVM's back, and so should not be stored in the guest
context.

Signed-off-by: James Morse <james.morse@arm.com>
---
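As a rough plain-C model of the register flow this patch implements (the
variables and values below are made-up stand-ins for the MPAM system
registers, not the kernel's sysreg accessors):

#include <assert.h>
#include <stdint.h>

static uint64_t mpam0_el1;	/* EL0 partid/pmg, set by the host's context switch */
static uint64_t mpam1_el1;	/* EL1 partid/pmg */
static uint64_t mpam2_el2;	/* EL2 partid/pmg, in effect during world switch */

/* Models __mpam_copy_el1_to_el2(): EL2 runs with the host thread's config. */
static void mpam_copy_el1_to_el2(void)
{
	mpam2_el2 = mpam1_el1;
}

/* Models __mpam_guest_load(): guest EL1 inherits the VMM's EL0 config. */
static void mpam_guest_load(void)
{
	mpam1_el1 = mpam0_el1;
}

int main(void)
{
	mpam0_el1 = 3;	/* partid the host assigned to the VMM task */
	mpam1_el1 = 1;	/* partid the host kernel's threads run with */

	mpam_copy_el1_to_el2();	/* nVHE: at the top of __kvm_vcpu_run() */
	mpam_guest_load();	/* when loading the guest's EL1 state */

	assert(mpam2_el2 == 1);	/* world switch uses the host kernel's partid */
	assert(mpam1_el1 == 3);	/* guest EL1 uses the VMM's partid */
	return 0;
}

Note the ordering: the EL1-to-EL2 copy must observe the host's MPAM1_EL1,
which is presumably why, in the nVHE path below, __mpam_copy_el1_to_el2()
runs at the top of __kvm_vcpu_run(), before the guest's EL1 state (and
__mpam_guest_load()) overwrite that register. Guest EL0 needs no copy at
all, as MPAM0_EL1 is simply left holding the VMM's value.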
 arch/arm64/include/asm/kvm_host.h          |  1 +
 arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h | 16 ++++++++++++++++
 arch/arm64/kvm/hyp/nvhe/switch.c           | 12 ++++++++++++
 arch/arm64/kvm/hyp/vhe/sysreg-sr.c         |  1 +
 4 files changed, 30 insertions(+)

diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index b763293281c8..baba23b7ce97 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -447,6 +447,7 @@ enum vcpu_sysreg {
 	MDCCINT_EL1,	/* Monitor Debug Comms Channel Interrupt Enable Reg */
 	OSLSR_EL1,	/* OS Lock Status Register */
 	DISR_EL1,	/* Deferred Interrupt Status Register */
+	MPAM1_EL1,	/* Memory Partitioning And Monitoring register */
 
 	/* Performance Monitors Registers */
 	PMCR_EL0,	/* Control Register */
diff --git a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
index a17cbe7582de..d8ab0ced0403 100644
--- a/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
+++ b/arch/arm64/kvm/hyp/include/hyp/sysreg-sr.h
@@ -166,6 +166,9 @@ static inline void __sysreg_save_el1_state(struct kvm_cpu_context *ctxt)
 		ctxt_sys_reg(ctxt, TFSRE0_EL1) = read_sysreg_s(SYS_TFSRE0_EL1);
 	}
 
+	if (system_supports_mpam())
+		ctxt_sys_reg(ctxt, MPAM1_EL1) = read_sysreg_el1(SYS_MPAM1);
+
 	ctxt_sys_reg(ctxt, SP_EL1)	= read_sysreg(sp_el1);
 	ctxt_sys_reg(ctxt, ELR_EL1)	= read_sysreg_el1(SYS_ELR);
 	ctxt_sys_reg(ctxt, SPSR_EL1)	= read_sysreg_el1(SYS_SPSR);
@@ -261,6 +264,9 @@ static inline void __sysreg_restore_el1_state(struct kvm_cpu_context *ctxt,
 		write_sysreg_s(ctxt_sys_reg(ctxt, TFSRE0_EL1), SYS_TFSRE0_EL1);
 	}
 
+	if (system_supports_mpam())
+		write_sysreg_el1(ctxt_sys_reg(ctxt, MPAM1_EL1), SYS_MPAM1);
+
 	if (!has_vhe() &&
 	    cpus_have_final_cap(ARM64_WORKAROUND_SPECULATIVE_AT) &&
 	    ctxt->__hyp_running_vcpu) {
@@ -374,4 +380,14 @@ static inline void __sysreg32_restore_state(struct kvm_vcpu *vcpu)
 		write_sysreg(__vcpu_sys_reg(vcpu, DBGVCR32_EL2), dbgvcr32_el2);
 }
 
+/*
+ * The _EL0 value was written by the host's context switch and belongs to the
+ * VMM. Copy this into the guest's _EL1 register.
+ */
+static inline void __mpam_guest_load(void)
+{
+	if (system_supports_mpam())
+		write_sysreg_el1(read_sysreg_s(SYS_MPAM0_EL1), SYS_MPAM1);
+}
+
 #endif /* __ARM64_KVM_HYP_SYSREG_SR_H__ */
diff --git a/arch/arm64/kvm/hyp/nvhe/switch.c b/arch/arm64/kvm/hyp/nvhe/switch.c
index d3b9ec8a7c28..b785977aa61e 100644
--- a/arch/arm64/kvm/hyp/nvhe/switch.c
+++ b/arch/arm64/kvm/hyp/nvhe/switch.c
@@ -238,6 +238,15 @@ static inline bool fixup_guest_exit(struct kvm_vcpu *vcpu, u64 *exit_code)
 	return __fixup_guest_exit(vcpu, exit_code, handlers);
 }
 
+/* Use the host thread's partid and pmg for world switch */
+static void __mpam_copy_el1_to_el2(void)
+{
+	if (system_supports_mpam()) {
+		write_sysreg_s(read_sysreg_s(SYS_MPAM1_EL1), SYS_MPAM2_EL2);
+		isb();
+	}
+}
+
 /* Switch to the guest for legacy non-VHE systems */
 int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 {
@@ -247,6 +256,8 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	bool pmu_switch_needed;
 	u64 exit_code;
 
+	__mpam_copy_el1_to_el2();
+
 	/*
 	 * Having IRQs masked via PMR when entering the guest means the GIC
 	 * will not signal the CPU of interrupts of lower priority, and the
@@ -306,6 +317,7 @@ int __kvm_vcpu_run(struct kvm_vcpu *vcpu)
 	__timer_enable_traps(vcpu);
 
 	__debug_switch_to_guest(vcpu);
+	__mpam_guest_load();
 
 	do {
 		/* Jump in the fire! */
diff --git a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
index f28c6cf4fe1b..2a84edc90465 100644
--- a/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
+++ b/arch/arm64/kvm/hyp/vhe/sysreg-sr.c
@@ -222,6 +222,7 @@ void __vcpu_load_switch_sysregs(struct kvm_vcpu *vcpu)
 	 */
 	__sysreg32_restore_state(vcpu);
 	__sysreg_restore_user_state(guest_ctxt);
+	__mpam_guest_load();
 
 	if (unlikely(is_hyp_ctxt(vcpu))) {
 		__sysreg_restore_vel2_state(vcpu);
-- 
2.39.5
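
As background on where "KVM's user-space's configuration" comes from: the
host assigns a partid to the VMM task, and the host's context switch
programs that partid into MPAM0_EL1, which this patch then propagates to
guest EL1. A hypothetical user-space sketch of that assignment, assuming
the resctrl filesystem is mounted at /sys/fs/resctrl and a control group
"vm-group" already exists ("vm-group" is made up, and whether arm64 MPAM
exposes exactly this interface depends on the rest of this series; the
tasks-file convention follows the existing resctrl filesystem):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
	/* Join the hypothetical "vm-group" resctrl group ... */
	FILE *f = fopen("/sys/fs/resctrl/vm-group/tasks", "w");

	if (!f) {
		perror("fopen");
		return 1;
	}
	/* ... so the scheduler loads this group's partid into MPAM0_EL1
	 * whenever this task (e.g. a VMM) runs. */
	fprintf(f, "%d\n", (int)getpid());
	fclose(f);
	return 0;
}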