From nobody Tue Feb 10 07:38:58 2026 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 73D611F8671; Tue, 25 Mar 2025 18:49:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742928573; cv=none; b=hSP3evHLZqUU1JhGPZyOYkK/ixVVyydEnJMvdgjpBpinviEujufdf4OgiDCR31vN0spbBCd8VoNqRp0hdy5b0xlmV15Z/STUOYNPmjTmSm6NZko3YeTgB1HInnqNqSGIfQjtQh/kjciqELUZv2iFS9QLp/ww2eqn527QdSFAMB8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1742928573; c=relaxed/simple; bh=2+URmYuqU9k3emvb53ht0RW1v5r4YgWgFbXpwB4TljM=; h=From:Date:Subject:MIME-Version:Content-Type:Message-Id:References: In-Reply-To:To:Cc; b=uE/Kwsjq0/zXB5FkANulKkU3Hx2bxCKh46DtdC9113OqIWX1VYqU3aYLHtPWruql7eRz2Qn/o6DI/S8W1saMrCLGD3wj+qVtBdtS6PXJvbayA2Fiv4ghscbDh0ymYCtqWPCCinbxxBRcmjh9YES7jVsczBjDwOSKoGzZhwWVBNM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=YjjZfO5B; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="YjjZfO5B" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7F405C4CEE4; Tue, 25 Mar 2025 18:49:30 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1742928572; bh=2+URmYuqU9k3emvb53ht0RW1v5r4YgWgFbXpwB4TljM=; h=From:Date:Subject:References:In-Reply-To:To:Cc:From; b=YjjZfO5B451az+TLpNEhWv2GonTV2QpL+TjaDxhYeYd3DehPpLGbs5XLrE3dTa5qm PaMX4y+UVZ/8sdLkKQSU3ODCcAKXSdyCJS1c7ZwLR4+PGuyedd04KJP5dVwuA+VO3n li6CwBrC7BnfK7vH2wKtW6cXSlQJOfzu3D3ai5RwtP3CUqSKLX3h39O2C2Rpe7UgBR 
qdsfLzoAAqzvE0OOhud0cqj9n78kO8FAOrApViH0b+Bwi19/G0N/2Yh1LeCVPzTHm7 wwgd5YC5Exr+qRLw0LCWRHGe7mbC0hgzKVOmwL8lGGlVFZH9x7+hNgQRSGWfTLyfZs KAJjb5dL+aAFg== From: Mark Brown Date: Tue, 25 Mar 2025 18:48:17 +0000 Subject: [PATCH 6.1 03/12] arm64/fpsimd: Have KVM explicitly say which FP registers to save Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Message-Id: <20250325-stable-sve-6-1-v1-3-83259d427d84@kernel.org> References: <20250325-stable-sve-6-1-v1-0-83259d427d84@kernel.org> In-Reply-To: <20250325-stable-sve-6-1-v1-0-83259d427d84@kernel.org> To: Catalin Marinas , Will Deacon , Marc Zyngier , James Morse , Suzuki K Poulose , Oliver Upton , Oleg Nesterov Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, kvmarm@lists.linux.dev, kvmarm@lists.cs.columbia.edu, Mark Brown , Mark Rutland X-Mailer: b4 0.15-dev-1b0d6 X-Developer-Signature: v=1; a=openpgp-sha256; l=7200; i=broonie@kernel.org; h=from:subject:message-id; bh=2+URmYuqU9k3emvb53ht0RW1v5r4YgWgFbXpwB4TljM=; b=owEBbQGS/pANAwAKASTWi3JdVIfQAcsmYgBn4vqobNp43QT6nTfr+EBshJyDuFIV5VCRSn9quLkW AlpIxM+JATMEAAEKAB0WIQSt5miqZ1cYtZ/in+ok1otyXVSH0AUCZ+L6qAAKCRAk1otyXVSH0K2AB/ 4pH9kqgeQXoNKOe3DSgu8O9E0g1VmM7Gsc6re0W86A7KoNBKbKY46YqlbUlfe68YUIA99/3P96r2YN wVgEnMCXbmWbR3LQViAEJnE+OxyZSt98gFM3vwsn6eZC0d8A1ddi6TWey2vc6QVtJqsry70HKN5tJ2 63OdkP0OE5edN2nr8tZ0nsgW78yrpEfzNRMzLYjjRU8Yl6kchyN/2AuL0ZqOeTPCeDIod+r81bbuRz fIL+quOiR/piaYoFvoGAHJVCxDhj+c9txHKmeKtvrDpI4QwOUX212svLMfDyrGVfvEPGTx7ckRKmXJ 80VXA/QAXnunz1n2WV/xUpBxxbKnO3 X-Developer-Key: i=broonie@kernel.org; a=openpgp; fpr=3F2568AAC26998F9E813A1C5C3F436CA30F5D8EB [ Upstream commit deeb8f9a80fdae5a62525656d65c7070c28bd3a4 ] In order to avoid needlessly saving and restoring the guest registers KVM relies on the host FPSIMD code to save the guest registers when we context switch away from the guest. 
This is done by binding the KVM guest state to the CPU on top of the task state that was originally there, then carefully managing the TIF_SVE flag for the task to cause the host to save the full SVE state when needed regardless of the needs of the host task. This works well enough but isn't terribly direct about what is going on and makes it much more complicated to try to optimise what we're doing with the SVE register state.

Let's instead have KVM pass in the register state it wants saving when it binds to the CPU. We introduce a new FP_STATE_CURRENT for use during normal task binding to indicate that we should base our decisions on the current task. This should not be used when actually saving. Ideally we might want to use a separate enum for the type to save but this enum and the enum values would then need to be named which has problems with clarity and ambiguity.

In order to ease any future debugging that might be required this patch does not actually update any of the decision making about what to save, it merely starts tracking the new information and warns if the requested state is not what we would otherwise have decided to save.

Signed-off-by: Mark Brown Reviewed-by: Catalin Marinas Reviewed-by: Marc Zyngier Link: https://lore.kernel.org/r/20221115094640.112848-4-broonie@kernel.org Signed-off-by: Will Deacon [ Mark: trivial backport ] Signed-off-by: Mark Rutland Signed-off-by: Mark Brown --- arch/arm64/include/asm/fpsimd.h | 3 ++- arch/arm64/include/asm/processor.h | 1 + arch/arm64/kernel/fpsimd.c | 27 ++++++++++++++++++++++++--- arch/arm64/kvm/fpsimd.c | 9 ++++++++- 4 files changed, 35 insertions(+), 5 deletions(-) diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsim= d.h index e10894100c7394b3f3156f7475d62646662812d4..7622782d0bb97529867a784bf1d= b0c14260bac99 100644 --- a/arch/arm64/include/asm/fpsimd.h +++ b/arch/arm64/include/asm/fpsimd.h @@ -61,7 +61,8 @@ extern void fpsimd_kvm_prepare(void); extern void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *state, void *sve_state, unsigned int sve_vl, void *za_state, unsigned int sme_vl, - u64 *svcr, enum fp_type *type); + u64 *svcr, enum fp_type *type, + enum fp_type to_save); =20 extern void fpsimd_flush_task_state(struct task_struct *target); extern void fpsimd_save_and_flush_cpu_state(void); diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/pr= ocessor.h index 208434a2e9247c9be5c85a032494f256e4c2cd58..1b822e618bb4bb35b7a89d1308b= ce7f860ee331b 100644 --- a/arch/arm64/include/asm/processor.h +++ b/arch/arm64/include/asm/processor.h @@ -123,6 +123,7 @@ enum vec_type { }; =20 enum fp_type { + FP_STATE_CURRENT, /* Save based on current task state. 
*/ FP_STATE_FPSIMD, FP_STATE_SVE, }; diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c index 2e0cecf02bf8fcaa799cf1bc89439a61a2a77973..1f6fd9229e536966292a9751f08= 103912a48ba07 100644 --- a/arch/arm64/kernel/fpsimd.c +++ b/arch/arm64/kernel/fpsimd.c @@ -126,6 +126,7 @@ struct fpsimd_last_state_struct { unsigned int sve_vl; unsigned int sme_vl; enum fp_type *fp_type; + enum fp_type to_save; }; =20 static DEFINE_PER_CPU(struct fpsimd_last_state_struct, fpsimd_last_state); @@ -356,7 +357,8 @@ void task_set_vl_onexec(struct task_struct *task, enum = vec_type type, * but userspace is discouraged from relying on this. * * task->thread.sve_state does not need to be non-NULL, valid or any - * particular size: it must not be dereferenced. + * particular size: it must not be dereferenced and any data stored + * there should be considered stale and not referenced. * * * SVE state - FP_STATE_SVE: * @@ -369,7 +371,9 @@ void task_set_vl_onexec(struct task_struct *task, enum = vec_type type, * task->thread.uw.fpsimd_state should be ignored. * * task->thread.sve_state must point to a valid buffer at least - * sve_state_size(task) bytes in size. + * sve_state_size(task) bytes in size. The data stored in + * task->thread.uw.fpsimd_state.vregs should be considered stale + * and not referenced. * * * FPSR and FPCR are always stored in task->thread.uw.fpsimd_state * irrespective of whether TIF_SVE is clear or set, since these are @@ -459,6 +463,21 @@ static void fpsimd_save(void) vl =3D last->sve_vl; } =20 + /* + * Validate that an explicitly specified state to save is + * consistent with the task state. 
+ */ + switch (last->to_save) { + case FP_STATE_CURRENT: + break; + case FP_STATE_FPSIMD: + WARN_ON_ONCE(save_sve_regs); + break; + case FP_STATE_SVE: + WARN_ON_ONCE(!save_sve_regs); + break; + } + if (system_supports_sme()) { u64 *svcr =3D last->svcr; =20 @@ -1709,6 +1728,7 @@ static void fpsimd_bind_task_to_cpu(void) last->sme_vl =3D task_get_sme_vl(current); last->svcr =3D &current->thread.svcr; last->fp_type =3D &current->thread.fp_type; + last->to_save =3D FP_STATE_CURRENT; current->thread.fpsimd_cpu =3D smp_processor_id(); =20 /* @@ -1733,7 +1753,7 @@ static void fpsimd_bind_task_to_cpu(void) void fpsimd_bind_state_to_cpu(struct user_fpsimd_state *st, void *sve_stat= e, unsigned int sve_vl, void *za_state, unsigned int sme_vl, u64 *svcr, - enum fp_type *type) + enum fp_type *type, enum fp_type to_save) { struct fpsimd_last_state_struct *last =3D this_cpu_ptr(&fpsimd_last_state); @@ -1748,6 +1768,7 @@ void fpsimd_bind_state_to_cpu(struct user_fpsimd_stat= e *st, void *sve_state, last->sve_vl =3D sve_vl; last->sme_vl =3D sme_vl; last->fp_type =3D type; + last->to_save =3D to_save; } =20 /* diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c index a4b4502ad850a5c12f0d2809bdedbba6c6eb957e..89c02ce797b874196eff978464a= 936dfb020ad02 100644 --- a/arch/arm64/kvm/fpsimd.c +++ b/arch/arm64/kvm/fpsimd.c @@ -130,9 +130,16 @@ void kvm_arch_vcpu_ctxflush_fp(struct kvm_vcpu *vcpu) */ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) { + enum fp_type fp_type; + WARN_ON_ONCE(!irqs_disabled()); =20 if (vcpu->arch.fp_state =3D=3D FP_STATE_GUEST_OWNED) { + if (vcpu_has_sve(vcpu)) + fp_type =3D FP_STATE_SVE; + else + fp_type =3D FP_STATE_FPSIMD; + /* * Currently we do not support SME guests so SVCR is * always 0 and we just need a variable to point to. 
@@ -141,7 +148,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu) vcpu->arch.sve_state, vcpu->arch.sve_max_vl, NULL, 0, &vcpu->arch.svcr, - &vcpu->arch.fp_type); + &vcpu->arch.fp_type, fp_type); =20 clear_thread_flag(TIF_FOREIGN_FPSTATE); update_thread_flag(TIF_SVE, vcpu_has_sve(vcpu)); --=20 2.39.5
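
Illustrative note, not part of the patch: the tracking and validation added above can be modelled in a few lines of standalone userspace C. This is only a sketch under stated assumptions. The enum mirrors the one in processor.h, but struct last_state_model, bind_task(), bind_guest() and to_save_is_consistent() are hypothetical names invented for this example, and the kernel code warns with WARN_ON_ONCE() rather than returning a bool.

```c
#include <stdbool.h>

/* Mirrors enum fp_type as extended by the patch. */
enum fp_type {
	FP_STATE_CURRENT,	/* Save based on current task state. */
	FP_STATE_FPSIMD,
	FP_STATE_SVE,
};

/*
 * Hypothetical, cut-down stand-in for struct fpsimd_last_state_struct:
 * only the field added by the patch is modelled.
 */
struct last_state_model {
	enum fp_type to_save;
};

/* Models normal task binding: defer the decision to the task state. */
static void bind_task(struct last_state_model *last)
{
	last->to_save = FP_STATE_CURRENT;
}

/*
 * Models the KVM side: the caller states explicitly which register
 * state it wants saved, based on whether the guest has SVE.
 */
static void bind_guest(struct last_state_model *last, bool guest_has_sve)
{
	last->to_save = guest_has_sve ? FP_STATE_SVE : FP_STATE_FPSIMD;
}

/*
 * Models the check added to the save path. save_sve_regs is the
 * decision the existing code reached on its own; the function reports
 * whether the explicitly requested state agrees with it. The kernel
 * does not return a value here, it only warns on a mismatch.
 */
static bool to_save_is_consistent(const struct last_state_model *last,
				  bool save_sve_regs)
{
	switch (last->to_save) {
	case FP_STATE_CURRENT:
		return true;		/* nothing explicit to validate */
	case FP_STATE_FPSIMD:
		return !save_sve_regs;	/* SVE save would be unexpected */
	case FP_STATE_SVE:
		return save_sve_regs;	/* FPSIMD-only save would be unexpected */
	}
	return false;
}
```

In this model a task binding is consistent with either save decision, while a guest binding is consistent only when the save decision matches what the guest asked for. That is the property the patch starts checking without yet changing what is actually saved.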