From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E960E269820 for ; Mon, 5 May 2025 16:14:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461660; cv=none; b=aCyaLcDF/JyMTnPiOR9s+FuGJCJp1DizzzBAZtW3ns72UncZxapqA+GemGtKCVMVJ+zjc1gJE/G9W5Efs3mN611X/nowSLthS1pIOy2QDBM1iQvKDPbpmLgo45XlhbjkXNCVs2t2ONAXViRcCKc73blaPAliPVpaQtOqtJz1usA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461660; c=relaxed/simple; bh=v60h/PYlX3rvoNvgm4fRVRD4fD6gJf2WBSqw2Zx3QDU=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=kHFlfaefCLyKQacop5kJ/Qi3XfpsieRYiXEKisc6noWu9YJix6sun+VxzsIpBH2Y4Gic7z4nVVtTwbBBuSD71dXrL9R60G+DT9X/1N8GF/mvv8pj3JtpShR9mU0ayMZPoVwJk9KK/MFFVlwLizmVOFH5Ab/JLpSUeCOi4MVs9gM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=ZCeuiyDe; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="ZCeuiyDe" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-740774348f6so1270354b3a.1 for ; Mon, 05 May 2025 09:14:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461657; x=1747066457; 
darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=YHUmRqC710I5Zczb2Gv8+GrvF7PyHiekDQRANAyhxgo=; b=ZCeuiyDekSOaSZal9s7tvknAo51HddNL7ytjBh2nxpdRoxYnWwVvmeOjJWGzklf4ri S3BRTI1olqDIJ+CQyYpXQXOys+diroamVUZhodez39FfLantBqnuLtYy6GL1dWAZR2Vs xNjDIzoibD7gD74Kc1ohhxRgvB9xkG82GHOtCxQ8HA8SIThepT1Pfj/GuzT8KI20YkX8 ktv+kkXMZJKoQyErGqHUdZRSNWrE07cCXNm4TxOJ9UXN6YTVmt5TQ98jBWYMHC8Aoc+G aSvJ7mZTyZn4gDSC0xi1ml8gNKXErLc/y99ctW3AloqYx6wLpM1Pu4oOi8QC++04PN2u CDNQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461657; x=1747066457; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=YHUmRqC710I5Zczb2Gv8+GrvF7PyHiekDQRANAyhxgo=; b=UbkMdqnqxIuHO5GQN5Iu5gN957kS8xgJVJSxQSHehcUtzcdokHk1NgaNpBLozX5nnc dDvrKu2vqNMO8FOrpIPBaza+JTqshvQFYQmlyjm1ImyobPwfADwg9WjcwZEKEofRSS3N ko+UYguQAanOwKsDDYG/EwOcCwHbfPLXqjoeSOrAy1UhiOBb5H9Ys/yHxNFrPBxJD4Te CQdiZCp4H4eUATMWG/iK8tW+/JDjPWppYWFuQa+GpEQAUJZnhbRAxcgtP93q1vFvDClh aDlStj2LAhruoS3vS94iSWid+/szo1iG8bEl1BVkpRs5LAsSd1oSDsQDoYFIzAuys+gP e1Kg== X-Forwarded-Encrypted: i=1; AJvYcCWHLQtw/DhMZflsE4MN1DbRYTdOmFwDHM4YRyhMl/AQO74SdOauEh76OBZvbsth0QNym8foju/jI56XHLo=@vger.kernel.org X-Gm-Message-State: AOJu0Yxvp90wKveGUUShALv56EwPw+t50IYQimc+9IszjUKo8nYxdzL9 fihUm0ijswL1elTzvsJviJOEYX2FQUFlkaFozm7ySpzffSj8i1nyuHtzAUW2Z5zlgxazMfJwm5v 7gOvlI0zV9A== X-Google-Smtp-Source: AGHT+IGpv50Wz1FmfbCrQKvOYpNfJ4AHoIP5XJSc/cts11lJ70nVyGRkWYp/qqNzmDSatbGKe3GytAHP4hN+Uw== X-Received: from pfhp37.prod.google.com ([2002:a05:6a00:a25:b0:740:813:f7bb]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:1bca:b0:734:b136:9c39 with SMTP id d2e1a72fcca58-7406f1769bemr10644622b3a.19.1746461657217; Mon, 05 May 2025 09:14:17 -0700 (PDT) Date: Mon, 5 May 2025 16:14:07 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: 
bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-2-jiaqiyan@google.com> Subject: [PATCH v1 1/6] KVM: arm64: VM exit to userspace to handle SEA From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

When APEI fails to handle a stage-2 abort that is a synchronous external
abort (SEA), KVM today directly injects an async SError into the VCPU and
then resumes it, which usually results in an unpleasant guest kernel
panic.

One major source of guest SEA is a vCPU consuming a recoverable
uncorrected memory error (UER). Although the SError and the guest kernel
panic effectively stop the propagation of corrupted memory, there is
still room to recover from a memory UER in a more graceful manner.

Alternatively, KVM can redirect the synchronous SEA event to the VMM to
- Reduce blast radius if possible. The VMM can inject a SEA into the
  VCPU via KVM's existing KVM_SET_VCPU_EVENTS API. If the memory poison
  consumption or fault is not from the guest kernel, the blast radius
  can be limited to the triggering thread in guest userspace, so the VM
  can keep running.
- Protect against future memory poison consumption by unmapping the
  page from stage-2 with KVM userfault [1].
  The VMM can also track the SEA events that the VM customer cares
  about, restart the VM when a certain number of distinct poison events
  have happened, and provide observability to customers [2].

Introduce the following userspace-visible features to let the VMM handle
SEA:
- KVM_CAP_ARM_SEA_TO_USER. As the alternative fallback behavior when
  host APEI fails to claim a SEA, userspace can opt in to this new
  capability to let KVM exit to userspace on a synchronous external
  abort.
- KVM_EXIT_ARM_SEA. A new exit reason is introduced for this, and KVM
  fills kvm_run.arm_sea with as much information about the SEA as
  possible, including
  - ESR_EL2.
  - Flags telling whether the faulting guest virtual and physical
    addresses are available.
  - Faulting guest virtual address if available.
  - Faulting guest physical address if available.

[1] https://lpc.events/event/18/contributions/1757/attachments/1442/3073/LPC_%20KVM%20Userfault.pdf
[2] https://cloud.google.com/solutions/sap/docs/manage-host-errors

Signed-off-by: Jiaqi Yan
---
 arch/arm64/include/asm/kvm_emulate.h | 12 +++++++
 arch/arm64/include/asm/kvm_host.h    |  8 +++++
 arch/arm64/include/asm/kvm_ras.h     | 21 ++++-------
 arch/arm64/kvm/Makefile              |  3 +-
 arch/arm64/kvm/arm.c                 |  5 +++
 arch/arm64/kvm/kvm_ras.c             | 54 ++++++++++++++++++++++++++++
 arch/arm64/kvm/mmu.c                 | 12 ++-----
 include/uapi/linux/kvm.h             | 11 ++++++
 8 files changed, 101 insertions(+), 25 deletions(-)
 create mode 100644 arch/arm64/kvm/kvm_ras.c

diff --git a/arch/arm64/include/asm/kvm_emulate.h b/arch/arm64/include/asm/kvm_emulate.h
index bd020fc28aa9c..a9de30478a088 100644
--- a/arch/arm64/include/asm/kvm_emulate.h
+++ b/arch/arm64/include/asm/kvm_emulate.h
@@ -429,6 +429,18 @@ static __always_inline bool kvm_vcpu_abt_issea(const struct kvm_vcpu *vcpu)
 	}
 }
 
+/* Return true if FAR holds valid faulting guest virtual address. */
+static inline bool kvm_vcpu_sea_far_valid(const struct kvm_vcpu *vcpu)
+{
+	return !(kvm_vcpu_get_esr(vcpu) & ESR_ELx_FnV);
+}
+
+/* Return true if HPFAR_EL2 holds valid faulting guest physical address. */
+static inline bool kvm_vcpu_sea_ipa_valid(const struct kvm_vcpu *vcpu)
+{
+	return vcpu->arch.fault.hpfar_el2 & HPFAR_EL2_NS;
+}
+
 static __always_inline int kvm_vcpu_sys_get_rt(struct kvm_vcpu *vcpu)
 {
 	u64 esr = kvm_vcpu_get_esr(vcpu);
diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
index 73b7762b0e7d1..e0129f9799f80 100644
--- a/arch/arm64/include/asm/kvm_host.h
+++ b/arch/arm64/include/asm/kvm_host.h
@@ -342,6 +342,14 @@ struct kvm_arch {
 #define KVM_ARCH_FLAG_GUEST_HAS_SVE 9
 	/* MIDR_EL1, REVIDR_EL1, and AIDR_EL1 are writable from userspace */
 #define KVM_ARCH_FLAG_WRITABLE_IMP_ID_REGS 10
+	/*
+	 * When APEI fails to claim a stage-2 synchronous external abort
+	 * (SEA), return to userspace with fault information. Userspace
+	 * can opt in to this feature if KVM_CAP_ARM_SEA_TO_USER is
+	 * supported. Userspace is encouraged to handle this VM exit
+	 * by injecting a SEA into the VCPU before resuming the VCPU.
+	 */
+#define KVM_ARCH_FLAG_RETURN_SEA_TO_USER 11
 	unsigned long flags;
 
 	/* VM-wide vCPU feature set */
diff --git a/arch/arm64/include/asm/kvm_ras.h b/arch/arm64/include/asm/kvm_ras.h
index 9398ade632aaf..a2fd91af8f97e 100644
--- a/arch/arm64/include/asm/kvm_ras.h
+++ b/arch/arm64/include/asm/kvm_ras.h
@@ -4,22 +4,15 @@
 #ifndef __ARM64_KVM_RAS_H__
 #define __ARM64_KVM_RAS_H__
 
-#include
-#include
-#include
-
-#include
+#include
 
 /*
- * Was this synchronous external abort a RAS notification?
- * Returns '0' for errors handled by some RAS subsystem, or -ENOENT.
+ * Handle stage2 synchronous external abort (SEA) in the following order:
+ * 1. Delegate to APEI/GHES and if they can claim SEA, resume guest.
+ * 2. If userspace opted in to KVM_CAP_ARM_SEA_TO_USER, exit to userspace
+ *    with details about the SEA.
+ * 3. Otherwise, inject async SError into the VCPU and resume guest.
  */
-static inline int kvm_handle_guest_sea(void)
-{
-	/* apei_claim_sea(NULL) expects to mask interrupts itself */
-	lockdep_assert_irqs_enabled();
-
-	return apei_claim_sea(NULL);
-}
+int kvm_handle_guest_sea(struct kvm_vcpu *vcpu);
 
 #endif /* __ARM64_KVM_RAS_H__ */
diff --git a/arch/arm64/kvm/Makefile b/arch/arm64/kvm/Makefile
index 209bc76263f10..785d568411e88 100644
--- a/arch/arm64/kvm/Makefile
+++ b/arch/arm64/kvm/Makefile
@@ -23,7 +23,8 @@ kvm-y += arm.o mmu.o mmio.o psci.o hypercalls.o pvtime.o \
 	 vgic/vgic-v3.o vgic/vgic-v4.o \
 	 vgic/vgic-mmio.o vgic/vgic-mmio-v2.o \
 	 vgic/vgic-mmio-v3.o vgic/vgic-kvm-device.o \
-	 vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o
+	 vgic/vgic-its.o vgic/vgic-debug.o vgic/vgic-v3-nested.o \
+	 kvm_ras.o
 
 kvm-$(CONFIG_HW_PERF_EVENTS) += pmu-emul.o pmu.o
 kvm-$(CONFIG_ARM64_PTR_AUTH) += pauth.o
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 19ca57def6292..47544945fba45 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -133,6 +133,10 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
 		}
 		mutex_unlock(&kvm->lock);
 		break;
+	case KVM_CAP_ARM_SEA_TO_USER:
+		r = 0;
+		set_bit(KVM_ARCH_FLAG_RETURN_SEA_TO_USER, &kvm->arch.flags);
+		break;
 	default:
 		break;
 	}
@@ -322,6 +326,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_IRQFD_RESAMPLE:
 	case KVM_CAP_COUNTER_OFFSET:
 	case KVM_CAP_ARM_WRITABLE_IMP_ID_REGS:
+	case KVM_CAP_ARM_SEA_TO_USER:
 		r = 1;
 		break;
 	case KVM_CAP_SET_GUEST_DEBUG2:
diff --git a/arch/arm64/kvm/kvm_ras.c b/arch/arm64/kvm/kvm_ras.c
new file mode 100644
index 0000000000000..83f2731c95d77
--- /dev/null
+++ b/arch/arm64/kvm/kvm_ras.c
@@ -0,0 +1,54 @@
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include
+#include
+#include
+#include
+#include
+#include
+
+/*
+ * Was this synchronous external abort a RAS notification?
+ * Returns 0 for errors handled by some RAS subsystem, or -ENOENT.
+ */
+static int kvm_delegate_guest_sea(void)
+{
+	/* apei_claim_sea(NULL) expects to mask interrupts itself. */
+	lockdep_assert_irqs_enabled();
+	return apei_claim_sea(NULL);
+}
+
+int kvm_handle_guest_sea(struct kvm_vcpu *vcpu)
+{
+	struct kvm_run *run = vcpu->run;
+	bool exit = test_bit(KVM_ARCH_FLAG_RETURN_SEA_TO_USER,
+			     &vcpu->kvm->arch.flags);
+
+	/* For RAS the host kernel may handle this abort. */
+	if (kvm_delegate_guest_sea() == 0)
+		return 1;
+
+	if (!exit) {
+		/* Fallback behavior prior to KVM_EXIT_ARM_SEA. */
+		kvm_inject_vabt(vcpu);
+		return 1;
+	}
+
+	run->exit_reason = KVM_EXIT_ARM_SEA;
+	run->arm_sea.esr = kvm_vcpu_get_esr(vcpu);
+	run->arm_sea.flags = 0ULL;
+	run->arm_sea.gva = 0ULL;
+	run->arm_sea.gpa = 0ULL;
+
+	if (kvm_vcpu_sea_far_valid(vcpu)) {
+		run->arm_sea.flags |= KVM_EXIT_ARM_SEA_FLAG_GVA_VALID;
+		run->arm_sea.gva = kvm_vcpu_get_hfar(vcpu);
+	}
+
+	if (kvm_vcpu_sea_ipa_valid(vcpu)) {
+		run->arm_sea.flags |= KVM_EXIT_ARM_SEA_FLAG_GPA_VALID;
+		run->arm_sea.gpa = kvm_vcpu_get_fault_ipa(vcpu);
+	}
+
+	return 0;
+}
diff --git a/arch/arm64/kvm/mmu.c b/arch/arm64/kvm/mmu.c
index 754f2fe0cc673..a605ee56fa150 100644
--- a/arch/arm64/kvm/mmu.c
+++ b/arch/arm64/kvm/mmu.c
@@ -1795,16 +1795,8 @@ int kvm_handle_guest_abort(struct kvm_vcpu *vcpu)
 	int ret, idx;
 
 	/* Synchronous External Abort? */
-	if (kvm_vcpu_abt_issea(vcpu)) {
-		/*
-		 * For RAS the host kernel may handle this abort.
-		 * There is no need to pass the error into the guest.
-		 */
-		if (kvm_handle_guest_sea())
-			kvm_inject_vabt(vcpu);
-
-		return 1;
-	}
+	if (kvm_vcpu_abt_issea(vcpu))
+		return kvm_handle_guest_sea(vcpu);
 
 	esr = kvm_vcpu_get_esr(vcpu);
 
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index b6ae8ad8934b5..79dc4676ff74b 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -178,6 +178,7 @@ struct kvm_xen_exit {
 #define KVM_EXIT_NOTIFY 37
 #define KVM_EXIT_LOONGARCH_IOCSR 38
 #define KVM_EXIT_MEMORY_FAULT 39
+#define KVM_EXIT_ARM_SEA 40
 
 /* For KVM_EXIT_INTERNAL_ERROR */
 /* Emulate instruction failed. */
@@ -446,6 +447,15 @@ struct kvm_run {
 			__u64 gpa;
 			__u64 size;
 		} memory_fault;
+		/* KVM_EXIT_ARM_SEA */
+		struct {
+			__u64 esr;
+#define KVM_EXIT_ARM_SEA_FLAG_GVA_VALID (1ULL << 0)
+#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 1)
+			__u64 flags;
+			__u64 gva;
+			__u64 gpa;
+		} arm_sea;
 		/* Fix the size of the union. */
 		char padding[256];
 	};
@@ -930,6 +940,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_X86_APIC_BUS_CYCLES_NS 237
 #define KVM_CAP_X86_GUEST_MODE 238
 #define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239
+#define KVM_CAP_ARM_SEA_TO_USER 240
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
-- 
2.49.0.967.g6a0df3ecc3-goog

From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 34E2E26B08D for ; Mon, 5 May 2025 16:14:18 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461662; cv=none; b=t7Uhozc8OnD4m05FAXXXQwVJsstjB+4OZrBnMWAnDmpRWzdFczlbggfLeNTQphny3jILTcrJSijXuCUOIJER0yH6kYSImKkEMLtVgJq3SB5lPkK9qNTj3p7V5w3JkKUPpWAgC9OmEGbLG1/o6hM3eJG3SBwsyIyLZMMu9w+AXrM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; 
t=1746461662; c=relaxed/simple; bh=tP/Dy38GceNJDf3Zr30G5oxFd5DNpiB2roiPdOIJcEM=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=mfWGUq5BZuQLVirjslp0Worciyg+f3ywmjIG/oKvmFc7AVP2elZJMirbxVj32LNtttCJELvt9h2gTB37HEZxUv1DiXfryrndASwnLS3VhXA0os7E92sICm1F9hiUNsH0S+2CFyhS1vt4q+kucUoPTLjFGH90f9ILr7oBJx4CUu0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Dq1de9w4; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Dq1de9w4" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-30872785c3cso6774265a91.1 for ; Mon, 05 May 2025 09:14:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461658; x=1747066458; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=OJagTHHNdVxyDzkTZrB+wizm+z9g6erBUIGhulC+0c0=; b=Dq1de9w4Pg5IEbVQ7AXzJAaX/hESIaPjg90O5SF3l8pDliiJs8rWe/u6tsuAracit+ enRAQfE9EORFZ42ebCyZa/bC3BkD3qJd1OLRJbfQsGNX1Lf2jsi37yU9tsRg0uYDDV9g x11w5UrCq29k3EZSL14EV5gjIXw6SNj5/ORvLunUPHvhVR/ATPCRVnKOPUJmbOioPwSi t594DaLU7cHR6EtgdIP8I3gfqiNQB61vXIR6jmEMrJT45xYeTEZEH+CaaQ7TPClYXqgp lCkuzopDk+YmuIpvS5M/GSSIGAvjIvQbOAkWflbB7G9WX7fbXnytQmFBnALrXvkj11Md WxmA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461658; x=1747066458; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=OJagTHHNdVxyDzkTZrB+wizm+z9g6erBUIGhulC+0c0=; b=Y7gO9Ga/LReFb3d3wjCf/oJINN84Y/PZJL/0LH9mmDMYSLVyoTn+bqqxJdklWTd+if d9jRxcJ1bHMxs72X350bGbyF3QGxUU4RfbV1aIYGCyLcDIv0lQhUzuYiYA0fFtYAaUFg TeTIx1aygwvHRG69WqhPlN78nzPSB1Z7DRHHOwV43f0uxKR8ecu+UxAE+I3aCviHRwSC aYu0FeHw3x0e+SO5ZQ8bM2xAhebVXG5pF7h8LFhsB8Wxbcw6SuWaoQF4WFIQVDcSNoYJ Tv6hyknp22TiqjmzH9ewHT2z+OV/KF1dndw2RACfpb8pS6iXN9Lcnn/AXsRIBxB+1sLy BiUg== X-Forwarded-Encrypted: i=1; AJvYcCVFOusyZiF4UGexqKqXePF0Zzf5dtpFCX234S7scDmb8OnH+ZCyvmi7Re18/aTQEwjKHZkoPPbdubOX/GE=@vger.kernel.org X-Gm-Message-State: AOJu0Yw9Gd5+bmeUnCewkRWjh2Jy6Ch6mhSymVxJV/XYWHAwLHyD41RP Bx3/+tj3y8YYf+GM8EahPqhOyZ0tFQnGoEQ4KNxWZ23bDohz+tSUm1MjPB4RfrE+Pnpn58iMMx3 dEGJsQU2MzQ== X-Google-Smtp-Source: AGHT+IHmeuVYVihs7sQ36TEo2hNXWZbWh14KISXEiA150JU3zbOiiYwpdjNYy9bWt46liO2U4UYaoHnwxcbrZQ== X-Received: from pjbpw4.prod.google.com ([2002:a17:90b:2784:b0:2f8:49ad:406c]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90a:d883:b0:309:eb54:9ea2 with SMTP id 98e67ed59e1d1-30a5ae3f34cmr13505195a91.20.1746461658457; Mon, 05 May 2025 09:14:18 -0700 (PDT) Date: Mon, 5 May 2025 16:14:08 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-3-jiaqiyan@google.com> Subject: [PATCH v1 2/6] KVM: arm64: Set FnV for VCPU when FAR_EL2 is invalid From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, 
linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

Certain microarchitectures (e.g. Neoverse V2) do not keep track of the
faulting address for a memory load that consumes poisoned data and
results in a synchronous external abort (SEA). This means the faulting
guest physical address is unavailable when KVM handles such a SEA in
EL2, and FAR_EL2 just holds a garbage value.

In case the VMM later asks KVM to synchronously inject a SEA into the
guest, KVM should set the FnV bit
- in the VCPU's ESR_EL1, to let the guest kernel know that FAR_EL1 is
  invalid and holds a garbage value
- in the VCPU's ESR_EL2, to let nested virtualization know that FAR_EL2
  is invalid and holds a garbage value

Signed-off-by: Jiaqi Yan
---
 arch/arm64/kvm/inject_fault.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/arch/arm64/kvm/inject_fault.c b/arch/arm64/kvm/inject_fault.c
index a640e839848e6..b4f9a09952ead 100644
--- a/arch/arm64/kvm/inject_fault.c
+++ b/arch/arm64/kvm/inject_fault.c
@@ -81,6 +81,9 @@ static void inject_abt64(struct kvm_vcpu *vcpu, bool is_iabt, unsigned long addr
 	if (!is_iabt)
 		esr |= ESR_ELx_EC_DABT_LOW << ESR_ELx_EC_SHIFT;
 
+	if (!kvm_vcpu_sea_far_valid(vcpu))
+		esr |= ESR_ELx_FnV;
+
 	esr |= ESR_ELx_FSC_EXTABT;
 
 	if (match_target_el(vcpu, unpack_vcpu_flag(EXCEPT_AA64_EL1_SYNC))) {
-- 
2.49.0.967.g6a0df3ecc3-goog

From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pl1-f202.google.com (mail-pl1-f202.google.com [209.85.214.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A3AFC26B0BC for ; Mon, 5 May 2025 16:14:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461663; cv=none; 
b=ljQMX+Hf/+8gbytV3d06kW+5kMaDu3fVt+t+Joy3h4p3ewjQ900AZn7H8LMkPQ+Jb5xK+7VpAB253Wmzb17wQOlWeA8b0WP/k4JkAjJ0sNyXsd+sBfTF4BC7NE3D8ZImcE8uDSYdwEHl0kiwJCVd/Jm5eTuykghWUr6gzrFD7w8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461663; c=relaxed/simple; bh=WNmcRnW25SuTNW/fidnyuzaC0ufF7CU/QhzVlk6Eyk4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=QWYqJeQgFTITREaXQnZEaghpzsWlruRc4zcXsfmoWzrIXcrzl8Qn6xGsyx/8zzVBXDEhcGQYD8fz4f1c1pye0o8PKlElidF2ftsym9TW45rgEWX+YZFFGwM6juONj+cmz9bOSrI+l59TN97YDJnL2sM7W4e1XeGoz1bhctHuRRU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=y3GC/6Ew; arc=none smtp.client-ip=209.85.214.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="y3GC/6Ew" Received: by mail-pl1-f202.google.com with SMTP id d9443c01a7336-22c35bafdbdso70207265ad.1 for ; Mon, 05 May 2025 09:14:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461660; x=1747066460; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=ih0/JpStngeu2lOMQmMMElA7ojSkXNH4xy6u5tdpksc=; b=y3GC/6EwVjLQsC9dtPwGfK31ngSOH8Y2h2UeLmZBn9r2UE375VLvVmiG0EBCmseboa sCCpM/i9ylGQL6AHx+v1eevWwgUOzrafX+C5dprIS/3CO89MkccX79RBAX1/X0diBLef luxSwCZKwYwJh2iHmIN2bk4s2A02SxhWk6o4cOjLy+vsJvR731lEkkhS9C8ePyC8W8Hg +737TdUPDpMFcEITdumJ3UAs2roxubWZtsDJUK/FWFq1CdEKx+3YANN6A25F+e+eR5H2 
G+Li3rXUxZYP0p2azMmWBnQbbeKWIcS15Pd7zhgK3CTPmbIl/ZBk5qQpltd6q/WCPBVF nzwg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461660; x=1747066460; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=ih0/JpStngeu2lOMQmMMElA7ojSkXNH4xy6u5tdpksc=; b=oZpBE2Glz0I/INZOXp7+GpjmTGfmOYHUipvr1UlT5ERtI3vVAAtoJzdUUxf2cipdNa bNPDFmPrJZd2e7UJiRgDYqvQEUBElxZPp2oiVlxr8p25QVFzw3wNi9N6wCIvJKGmVx04 qKbvh0MfNXpd3BtSOMzOEV+vF6EhMYQGYvIOvjqS3zgY/DSqEdJWf+auWaqP/ty285FD lI20slyZ0ZGRRxP6zf/5Vm1z1AFziwFtLov8L6xoHSU0xIYnCsUha60BRBZG4NrpTGZC V4GCtplV+FCpM3l2yIB12O+OU8Goo4z5kJS3q7yVht4GBcj8gy6u1y5etstGCOgvSU14 tS3w== X-Forwarded-Encrypted: i=1; AJvYcCUm0BaORh/64qx3oP/9OicUBAeSYtZW9V+OT6QhmitBo1/ptG6vtNBDU+dldV9ryVHMx9FevEiTk1xATc4=@vger.kernel.org X-Gm-Message-State: AOJu0YwOxTCW05aIgtrxGfsewV53AP9Hz7o2b5yL8akiQIMrTW3+RfMH MGPyuj96uaUeGvvriKZKtwaS4Xyv/OBTUM6qUDBN6rH700PcaHrBqF2akNcNs919/LO8VklYPZf LfhSYWmoX5w== X-Google-Smtp-Source: AGHT+IEALx06q4ifLBOjCQ9ZZOZ/gQJWNi+Q92PHdvJDeENKCM2QDj0Em0rsjG/Q7RXdhODS4FzwytaIsf6wMQ== X-Received: from pfvo15.prod.google.com ([2002:a05:6a00:1b4f:b0:73d:b1c6:c137]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a17:902:f683:b0:224:f12:3734 with SMTP id d9443c01a7336-22e1ea87368mr99744415ad.30.1746461659950; Mon, 05 May 2025 09:14:19 -0700 (PDT) Date: Mon, 5 May 2025 16:14:09 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-4-jiaqiyan@google.com> Subject: [PATCH v1 3/6] KVM: arm64: Allow userspace to inject external instruction aborts From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, 
suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8"

From: Raghavendra Rao Ananta

When KVM returns to userspace for KVM_EXIT_ARM_SEA, userspace is
encouraged to inject the abort into the guest via KVM_SET_VCPU_EVENTS.

KVM_SET_VCPU_EVENTS currently only allows injecting external data
aborts. However, the synchronous external abort that caused
KVM_EXIT_ARM_SEA may also be an instruction abort. Userspace is already
able to tell whether an abort is a data abort or an instruction abort
via kvm_run.arm_sea.esr, by checking its Exception Class value.

Extend the KVM_SET_VCPU_EVENTS ioctl to allow injecting an instruction
abort into the guest.
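As a rough illustration of the userspace side, the sketch below shows one way a VMM might choose between ext_dabt_pending and ext_iabt_pending from the Exception Class in kvm_run.arm_sea.esr. It is not part of this patch. The sketch_* names and the local struct are hypothetical stand-ins for the uapi definitions; only the EC field position (bits [31:26]) and the two EC encodings are taken from the Arm architecture.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Architectural ESR_ELx encodings: Exception Class is bits [31:26]. */
#define SKETCH_ESR_EC_SHIFT	26
#define SKETCH_ESR_EC_MASK	0x3fULL
#define SKETCH_EC_IABT_LOW	0x20	/* instruction abort from a lower EL */
#define SKETCH_EC_DABT_LOW	0x24	/* data abort from a lower EL */

/*
 * Hypothetical stand-in for the "exception" member of struct
 * kvm_vcpu_events as extended by this patch. A real VMM would use
 * the uapi struct instead.
 */
struct sketch_exception {
	uint8_t serror_pending;
	uint8_t serror_has_esr;
	uint8_t ext_dabt_pending;
	uint8_t ext_iabt_pending;
	uint8_t pad[4];
	uint64_t serror_esr;
};

/* Extract the Exception Class from an ESR value. */
static inline uint64_t sketch_esr_ec(uint64_t esr)
{
	return (esr >> SKETCH_ESR_EC_SHIFT) & SKETCH_ESR_EC_MASK;
}

/*
 * Pick which external abort to request, based on the ESR reported in
 * kvm_run.arm_sea.esr. Returns 0 on success, or -1 if the ESR is
 * neither a data nor an instruction abort from a lower EL.
 */
static int sketch_pick_abort(uint64_t esr, struct sketch_exception *ex)
{
	assert(ex != NULL);
	memset(ex, 0, sizeof(*ex));

	switch (sketch_esr_ec(esr)) {
	case SKETCH_EC_DABT_LOW:
		ex->ext_dabt_pending = 1;
		return 0;
	case SKETCH_EC_IABT_LOW:
		ex->ext_iabt_pending = 1;
		return 0;
	default:
		return -1;
	}
}
```

A VMM would then copy the chosen flag into the real struct kvm_vcpu_events and issue KVM_SET_VCPU_EVENTS on the vCPU fd. Exactly one flag is set at a time because, with this patch, setting both is rejected with -EINVAL.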
Signed-off-by: Jiaqi Yan
Signed-off-by: Raghavendra Rao Ananta
---
 arch/arm64/include/uapi/asm/kvm.h |  3 ++-
 arch/arm64/kvm/arm.c              |  1 +
 arch/arm64/kvm/guest.c            | 13 ++++++++++---
 include/uapi/linux/kvm.h          |  1 +
 4 files changed, 14 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/include/uapi/asm/kvm.h b/arch/arm64/include/uapi/asm/kvm.h
index ed5f3892674c7..643e8c4825451 100644
--- a/arch/arm64/include/uapi/asm/kvm.h
+++ b/arch/arm64/include/uapi/asm/kvm.h
@@ -184,8 +184,9 @@ struct kvm_vcpu_events {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
 		__u8 ext_dabt_pending;
+		__u8 ext_iabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[5];
+		__u8 pad[4];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/arch/arm64/kvm/arm.c b/arch/arm64/kvm/arm.c
index 47544945fba45..dc2efb627f450 100644
--- a/arch/arm64/kvm/arm.c
+++ b/arch/arm64/kvm/arm.c
@@ -319,6 +319,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
 	case KVM_CAP_ARM_IRQ_LINE_LAYOUT_2:
 	case KVM_CAP_ARM_NISV_TO_USER:
 	case KVM_CAP_ARM_INJECT_EXT_DABT:
+	case KVM_CAP_ARM_INJECT_EXT_IABT:
 	case KVM_CAP_SET_GUEST_DEBUG:
 	case KVM_CAP_VCPU_ATTRIBUTES:
 	case KVM_CAP_PTP_KVM:
diff --git a/arch/arm64/kvm/guest.c b/arch/arm64/kvm/guest.c
index 2196979a24a32..4917361ecf5cb 100644
--- a/arch/arm64/kvm/guest.c
+++ b/arch/arm64/kvm/guest.c
@@ -825,9 +825,9 @@ int __kvm_arm_vcpu_get_events(struct kvm_vcpu *vcpu,
 	events->exception.serror_esr = vcpu_get_vsesr(vcpu);
 
 	/*
-	 * We never return a pending ext_dabt here because we deliver it to
-	 * the virtual CPU directly when setting the event and it's no longer
-	 * 'pending' at this point.
+	 * We never return a pending ext_dabt or ext_iabt here because we
+	 * deliver it to the virtual CPU directly when setting the event
+	 * and it's no longer 'pending' at this point.
 	 */
 
 	return 0;
@@ -839,6 +839,7 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 	bool serror_pending = events->exception.serror_pending;
 	bool has_esr = events->exception.serror_has_esr;
 	bool ext_dabt_pending = events->exception.ext_dabt_pending;
+	bool ext_iabt_pending = events->exception.ext_iabt_pending;
 
 	if (serror_pending && has_esr) {
 		if (!cpus_have_final_cap(ARM64_HAS_RAS_EXTN))
@@ -852,8 +853,14 @@ int __kvm_arm_vcpu_set_events(struct kvm_vcpu *vcpu,
 		kvm_inject_vabt(vcpu);
 	}
 
+	/* DABT and IABT cannot happen at the same time. */
+	if (ext_dabt_pending && ext_iabt_pending)
+		return -EINVAL;
+
 	if (ext_dabt_pending)
 		kvm_inject_dabt(vcpu, kvm_vcpu_get_hfar(vcpu));
+	else if (ext_iabt_pending)
+		kvm_inject_pabt(vcpu, kvm_vcpu_get_hfar(vcpu));
 
 	return 0;
 }
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index 79dc4676ff74b..bcf2b95b79123 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -941,6 +941,7 @@ struct kvm_enable_cap {
 #define KVM_CAP_X86_GUEST_MODE 238
 #define KVM_CAP_ARM_WRITABLE_IMP_ID_REGS 239
 #define KVM_CAP_ARM_SEA_TO_USER 240
+#define KVM_CAP_ARM_INJECT_EXT_IABT 241
 
 struct kvm_irq_routing_irqchip {
 	__u32 irqchip;
-- 
2.49.0.967.g6a0df3ecc3-goog

From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pf1-f202.google.com (mail-pf1-f202.google.com [209.85.210.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5DCA26772C for ; Mon, 5 May 2025 16:14:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461664; cv=none; b=EKXra3XkdgxArCAk37m7V1g1UpRYXBJaVu/BOYZrH76VVw6suLqlS4dXRNzs5200PRRN6UZNiuhlV0FPGKiY7PgkUtBezPKes9ZM7Eqn1lALkbjklGfScLS6CDg9ZhLfkALTq1MCVXWm1yOEP4twA9eOB1u7AJ4/543c6pEdV54= ARC-Message-Signature: i=1; a=rsa-sha256; 
d=subspace.kernel.org; s=arc-20240116; t=1746461664; c=relaxed/simple; bh=TA/VLKIsq4UUQwvxrC4ZVZvjB+NeMfRDMShqDmxRKko=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=lXsGELa5vbHlPYvTZAjKhI7DLEq4lN5SULEByPy6dHGns9TpCCzV9Ebdc9lBMEr0CG9NHz5Ksxu5HXD+q7vWFOIiUOtkeFJ3ciOg5gMHz5WkY7U2mG50+mvKQOCcuu2ficjjjytLKPx5NpCx2GikpKZH+z9co/DPXlomrZTigNo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=iyCaXB2j; arc=none smtp.client-ip=209.85.210.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="iyCaXB2j" Received: by mail-pf1-f202.google.com with SMTP id d2e1a72fcca58-73720b253fcso3532062b3a.2 for ; Mon, 05 May 2025 09:14:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461661; x=1747066461; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=yvZJykt7Uf4dVmRhJKmJi6MP1scq5omIxYcDpxOQw64=; b=iyCaXB2jUbX1a1VRSdqXQTNBuqFOQMsX0/aofnZG+zQlzOQAkkKwxfY97cO+S0azz8 11st0K4Fud+EvGocPWNmiLAIEOF71zSEMsnUefTHp2vys0bAQYfdc9bqGHThwZyJwQvd 2UhAnO+QZpAKK/qvt8CmXSaql3KtdR+eS/PvmV8TzV2NX6X0qY5skNd3jx0f5ORm/5iN 5e1XxqEohUi1HRnTyT5xxIFhGLRGeqiE+nbqnweo4w6bTxmcl6ZXeTgr6h6v4jVA6nkk 4tsw1Q+GHrky7X8aLjf1al+wCrpzh4TCyHsQj022zsne7rebjCzObMRXa6tzcXfEKhfV TEaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461661; x=1747066461; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=yvZJykt7Uf4dVmRhJKmJi6MP1scq5omIxYcDpxOQw64=; b=YrtPOARB0HAKXyMWwxUT+tx75NyfvvDJuus/ogJ9VIjavUwm1ONMJgG8OpTbaLxttY nQnV1I+AUsrhKG97/te73w7mb9KAFqV5hKTbRX+ZWFCRAON7mM3OWqZNiObrjBs0cD/K 4u9Z1nUi3I90IlsQfD2tPFu2G4JR2wMdcD9q4PzaNTW+QoLGpVRPwrDGoivkcxLmcMly brXp3Vq+QUpavYUmPQWaBmZEaL4sh+0e2nysKpVvEIpqzMYRQm+N/XUs7Wn7oQIrUcef n3Vm5HfqcTuly8DvUpn4Vo6aUTKhqGZaWzwk43kUW6nF/y1yNFo843ak0xHap1UlcD4l BIhQ== X-Forwarded-Encrypted: i=1; AJvYcCW5mzodNlSDgQUGb1C8YsYFKmMr4+75JHpgF/wCc1grKAQuxS2svYXCYte1prYpP71nfr2flAe5Lm28wPw=@vger.kernel.org X-Gm-Message-State: AOJu0YyE3OgdmY5GlgwamMykNXXfVBH2/TSwP2tc632bCa2RF64Us3cs n2shkVgwl3Zm/gPMMYOIZ4OI/cW5RoHp+sPrwCL6aAvs6j3j3E+Je0xIQmE6BUo4utLrhwC6wBW UfqjxfnVlTw== X-Google-Smtp-Source: AGHT+IHn8tBuCeZUAxg8gs/dBzpH1HCCE85DCDtaZxk6M+ZfRc5ag2yPjrWYgrHQohzrfHegE2cATmHScRjqUA== X-Received: from pghg13.prod.google.com ([2002:a63:e60d:0:b0:b1f:bc65:a8df]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a21:32a2:b0:1f5:8da5:ffe9 with SMTP id adf61e73a8af0-20e966057d7mr9762805637.12.1746461661200; Mon, 05 May 2025 09:14:21 -0700 (PDT) Date: Mon, 5 May 2025 16:14:10 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-5-jiaqiyan@google.com> Subject: [PATCH v1 4/6] KVM: selftests: Test for KVM_EXIT_ARM_SEA and KVM_CAP_ARM_SEA_TO_USER From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, 
linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Test how KVM handles a guest stage-2 SEA when APEI is unable to claim it.
The SEA is triggered by consuming a recoverable memory error (UER)
injected via EINJ.

The test asserts two major things:
1. KVM returns to userspace with the KVM_EXIT_ARM_SEA exit reason and
   provides correct fault information, e.g. esr, flags, gva, gpa.
2. Userspace is able to handle KVM_EXIT_ARM_SEA by injecting a SEA into
   the guest, and KVM injects the expected SEA into the VCPU.

Tested on a data center server running a Siryn AmpereOne processor.

Several things to note before attempting to run this selftest:
- The test relies on EINJ support in both firmware and kernel to
  inject the UER. Otherwise the test will be skipped.
- The APEI of the platform under test should be unable to claim the
  SEA. Otherwise the test will be skipped.
- Some platforms don't support notrigger in EINJ, which may cause
  APEI and GHES to offline the memory before the guest can consume
  the injected UER, leaving the test unable to trigger a SEA.
Signed-off-by: Jiaqi Yan
---
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../testing/selftests/kvm/arm64/sea_to_user.c | 324 ++++++++++++++++++
 tools/testing/selftests/kvm/lib/kvm_util.c    |   1 +
 3 files changed, 326 insertions(+)
 create mode 100644 tools/testing/selftests/kvm/arm64/sea_to_user.c

diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index f62b0a5aba35a..16d2e9f32619f 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -151,6 +151,7 @@ TEST_GEN_PROGS_arm64 += arm64/hypercalls
 TEST_GEN_PROGS_arm64 += arm64/mmio_abort
 TEST_GEN_PROGS_arm64 += arm64/page_fault_test
 TEST_GEN_PROGS_arm64 += arm64/psci_test
+TEST_GEN_PROGS_arm64 += arm64/sea_to_user
 TEST_GEN_PROGS_arm64 += arm64/set_id_regs
 TEST_GEN_PROGS_arm64 += arm64/smccc_filter
 TEST_GEN_PROGS_arm64 += arm64/vcpu_width_config
diff --git a/tools/testing/selftests/kvm/arm64/sea_to_user.c b/tools/testing/selftests/kvm/arm64/sea_to_user.c
new file mode 100644
index 0000000000000..9490cdbad3466
--- /dev/null
+++ b/tools/testing/selftests/kvm/arm64/sea_to_user.c
@@ -0,0 +1,324 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Test KVM returns to userspace with KVM_EXIT_ARM_SEA if host APEI fails
+ * to handle SEA and userspace has opted in to KVM_CAP_ARM_SEA_TO_USER.
+ *
+ * After reaching userspace with expected arm_sea info, also test userspace
+ * injecting a synchronous external data abort into the guest.
+ *
+ * This test utilizes EINJ to generate a REAL synchronous external data
+ * abort by consuming a recoverable uncorrectable memory error. Therefore
+ * the device under test must support EINJ in both firmware and host kernel,
+ * including the notrigger feature. Otherwise the test will be skipped.
+ * The under-test platform's APEI should be unable to claim SEA. Otherwise
+ * the test will also be skipped.
+ */
+
+#include <signal.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <unistd.h>
+
+#include "test_util.h"
+#include "kvm_util.h"
+#include "processor.h"
+#include "guest_modes.h"
+
+#define PAGE_PRESENT		(1ULL << 63)
+#define PAGE_PHYSICAL		0x007fffffffffffffULL
+#define PAGE_ADDR_MASK		(~(0xfffULL))
+
+/* Value for "Recoverable state (UER)". */
+#define ESR_ELx_SET_UER		0U
+
+#define EINJ_ETYPE		"/sys/kernel/debug/apei/einj/error_type"
+#define EINJ_ADDR		"/sys/kernel/debug/apei/einj/param1"
+#define EINJ_MASK		"/sys/kernel/debug/apei/einj/param2"
+#define EINJ_FLAGS		"/sys/kernel/debug/apei/einj/flags"
+#define EINJ_NOTRIGGER		"/sys/kernel/debug/apei/einj/notrigger"
+#define EINJ_DOIT		"/sys/kernel/debug/apei/einj/error_inject"
+/* Memory Uncorrectable non-fatal. */
+#define ERROR_TYPE_MEMORY_UER	0x10
+/* Memory address and mask valid (param1 and param2). */
+#define MASK_MEMORY_UER		0b10
+
+/* Guest virtual address region = [2G, 3G). */
+#define START_GVA		0x80000000UL
+#define VM_MEM_SIZE		0x40000000UL
+/* Note: EINJ_OFFSET must be < VM_MEM_SIZE. */
+#define EINJ_OFFSET		0x05234badUL
+#define EINJ_GVA		((START_GVA) + (EINJ_OFFSET))
+
+static vm_paddr_t einj_gpa;
+static void *einj_hva;
+static uint64_t einj_hpa;
+static bool far_invalid;
+
+static uint64_t translate_to_host_paddr(unsigned long vaddr)
+{
+	uint64_t pinfo;
+	int64_t offset = vaddr / getpagesize() * sizeof(pinfo);
+	int fd;
+	uint64_t page_addr;
+	uint64_t paddr;
+
+	fd = open("/proc/self/pagemap", O_RDONLY);
+	if (fd < 0)
+		ksft_exit_fail_perror("Failed to open /proc/self/pagemap");
+	if (pread(fd, &pinfo, sizeof(pinfo), offset) != sizeof(pinfo)) {
+		close(fd);
+		ksft_exit_fail_perror("Failed to read /proc/self/pagemap");
+	}
+
+	close(fd);
+
+	if ((pinfo & PAGE_PRESENT) == 0)
+		ksft_exit_fail_msg("Page not present\n");
+
+	page_addr = (pinfo & PAGE_PHYSICAL) * getpagesize();
+	paddr = page_addr + (vaddr & (getpagesize() - 1));
+	return paddr;
+}
+
+static void write_einj_entry(const char *einj_path, uint64_t val)
+{
+	char cmd[256] = {0};
+	FILE *cmdfile = NULL;
+
+	snprintf(cmd, sizeof(cmd), "echo %#lx > %s", val, einj_path);
+	cmdfile = popen(cmd, "r");
+
+	if (cmdfile && pclose(cmdfile) == 0)
+		ksft_print_msg("echo %#lx > %s - done\n", val, einj_path);
+	else
+		ksft_exit_fail_perror("Failed to write EINJ entry");
+}
+
+static void inject_uer(uint64_t paddr)
+{
+	if (access("/sys/firmware/acpi/tables/EINJ", R_OK) == -1)
+		ksft_exit_skip("EINJ table not available in firmware\n");
+
+	if (access(EINJ_ETYPE, R_OK | W_OK) == -1)
+		ksft_exit_skip("EINJ module probably not loaded?\n");
+
+	write_einj_entry(EINJ_ETYPE, ERROR_TYPE_MEMORY_UER);
+	write_einj_entry(EINJ_FLAGS, MASK_MEMORY_UER);
+	write_einj_entry(EINJ_ADDR, paddr);
+	write_einj_entry(EINJ_MASK, ~0x0UL);
+	write_einj_entry(EINJ_NOTRIGGER, 1);
+	write_einj_entry(EINJ_DOIT, 1);
+}
+
+/*
+ * When host APEI successfully claims the SEA caused by guest_code, kernel
+ * will send SIGBUS signal with BUS_MCEERR_AR to test thread.
+ *
+ * We set up this SIGBUS handler to skip the test for that case.
+ */
+static void sigbus_signal_handler(int sig, siginfo_t *si, void *v)
+{
+	ksft_print_msg("SIGBUS (%d) received, dumping siginfo...\n", sig);
+	ksft_print_msg("si_signo=%d, si_errno=%d, si_code=%d, si_addr=%p\n",
+		       si->si_signo, si->si_errno, si->si_code, si->si_addr);
+	if (si->si_code == BUS_MCEERR_AR)
+		ksft_test_result_skip("SEA is claimed by host APEI\n");
+	else
+		ksft_test_result_fail("Exit with signal unhandled\n");
+
+	exit(0);
+}
+
+static void setup_sigbus_handler(void)
+{
+	struct sigaction act;
+
+	memset(&act, 0, sizeof(act));
+	sigemptyset(&act.sa_mask);
+	act.sa_sigaction = sigbus_signal_handler;
+	act.sa_flags = SA_SIGINFO;
+	TEST_ASSERT(sigaction(SIGBUS, &act, NULL) == 0,
+		    "Failed to setup SIGBUS handler");
+}
+
+static void guest_code(void)
+{
+	uint64_t guest_data;
+
+	/* Consuming the injected error causes a SEA. */
+	guest_data = *(uint64_t *)EINJ_GVA;
+
+	GUEST_FAIL("Data corruption not prevented by SEA: gva=%#lx, data=%#lx",
+		   EINJ_GVA, guest_data);
+}
+
+static void expect_sea_handler(struct ex_regs *regs)
+{
+	u64 esr = read_sysreg(esr_el1);
+	u64 far = read_sysreg(far_el1);
+	bool expect_far_invalid = far_invalid;
+
+	GUEST_PRINTF("Guest SEA esr_el1=%#lx, far_el1=%#lx\n", esr, far);
+
+	GUEST_ASSERT_EQ(ESR_ELx_EC(esr), ESR_ELx_EC_DABT_CUR);
+	GUEST_ASSERT_EQ(esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT);
+
+	if (expect_far_invalid) {
+		GUEST_ASSERT(esr & ESR_ELx_FnV);
+		GUEST_PRINTF("Guest observed garbage value in FAR\n");
+	} else {
+		GUEST_ASSERT(!(esr & ESR_ELx_FnV));
+		GUEST_ASSERT_EQ(far, EINJ_GVA);
+	}
+
+	GUEST_DONE();
+}
+
+static void vcpu_inject_sea(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_events events = {};
+
+	events.exception.ext_dabt_pending = true;
+	vcpu_events_set(vcpu, &events);
+}
+
+static void run_vm(struct kvm_vm *vm, struct kvm_vcpu *vcpu)
+{
+	struct ucall uc;
+	bool guest_done = false;
+	struct kvm_run *run = vcpu->run;
+
+	/* Resume the vCPU after error injection to consume the error. */
+	vcpu_run(vcpu);
+
+	ksft_print_msg("Dump kvm_run info about KVM_EXIT_%s\n",
+		       exit_reason_str(run->exit_reason));
+	ksft_print_msg("kvm_run.arm_sea: esr=%#llx, flags=%#llx\n",
+		       run->arm_sea.esr, run->arm_sea.flags);
+	ksft_print_msg("kvm_run.arm_sea: gva=%#llx, gpa=%#llx\n",
+		       run->arm_sea.gva, run->arm_sea.gpa);
+
+	/* Validate the KVM_EXIT. */
+	TEST_ASSERT_KVM_EXIT_REASON(vcpu, KVM_EXIT_ARM_SEA);
+	TEST_ASSERT_EQ(ESR_ELx_EC(run->arm_sea.esr), ESR_ELx_EC_DABT_LOW);
+	TEST_ASSERT_EQ(run->arm_sea.esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT);
+	TEST_ASSERT_EQ(run->arm_sea.esr & ESR_ELx_SET_MASK, ESR_ELx_SET_UER);
+
+	if (run->arm_sea.flags & KVM_EXIT_ARM_SEA_FLAG_GVA_VALID)
+		TEST_ASSERT_EQ(run->arm_sea.gva, EINJ_GVA);
+
+	if (run->arm_sea.flags & KVM_EXIT_ARM_SEA_FLAG_GPA_VALID)
+		TEST_ASSERT_EQ(run->arm_sea.gpa, einj_gpa & PAGE_ADDR_MASK);
+
+	far_invalid = run->arm_sea.esr & ESR_ELx_FnV;
+
+	/* Inject a SEA into the guest and expect it to reach the SEA handler. */
+	vcpu_inject_sea(vcpu);
+
+	/* Expect the guest to reach GUEST_DONE gracefully. */
+	do {
+		vcpu_run(vcpu);
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_PRINTF:
+			ksft_print_msg("From guest: %s", uc.buffer);
+			break;
+		case UCALL_DONE:
+			ksft_print_msg("Guest done gracefully!\n");
+			guest_done = true;
+			break;
+		case UCALL_ABORT:
+			ksft_print_msg("Guest aborted!\n");
+			guest_done = true;
+			REPORT_GUEST_ASSERT(uc);
+			break;
+		default:
+			TEST_FAIL("Unexpected ucall: %lu\n", uc.cmd);
+		}
+	} while (!guest_done);
+}
+
+static struct kvm_vm *vm_create_with_sea_handler(struct kvm_vcpu **vcpu)
+{
+	size_t backing_page_size;
+	size_t guest_page_size;
+	size_t alignment;
+	uint64_t num_guest_pages;
+	vm_paddr_t start_gpa;
+	enum vm_mem_backing_src_type src_type = VM_MEM_SRC_ANONYMOUS_HUGETLB_1GB;
+	struct kvm_vm *vm;
+
+	backing_page_size = get_backing_src_pagesz(src_type);
+	guest_page_size = vm_guest_mode_params[VM_MODE_DEFAULT].page_size;
+	alignment = max(backing_page_size, guest_page_size);
+	num_guest_pages = VM_MEM_SIZE / guest_page_size;
+
+	vm = __vm_create_with_one_vcpu(vcpu, num_guest_pages, guest_code);
+	vm_init_descriptor_tables(vm);
+	vcpu_init_descriptor_tables(*vcpu);
+
+	vm_install_sync_handler(vm,
+		/*vector=*/VECTOR_SYNC_CURRENT,
+		/*ec=*/ESR_ELx_EC_DABT_CUR,
+		/*handler=*/expect_sea_handler);
+
+	start_gpa = (vm->max_gfn - num_guest_pages) * guest_page_size;
+	start_gpa = align_down(start_gpa, alignment);
+
+	vm_userspace_mem_region_add(
+		/*vm=*/vm,
+		/*src_type=*/src_type,
+		/*guest_paddr=*/start_gpa,
+		/*slot=*/1,
+		/*npages=*/num_guest_pages,
+		/*flags=*/0);
+
+	virt_map(vm, START_GVA, start_gpa, num_guest_pages);
+
+	ksft_print_msg("Mapped %#lx pages: gva=%#lx to gpa=%#lx\n",
+		       num_guest_pages, START_GVA, start_gpa);
+	return vm;
+}
+
+static void vm_inject_memory_uer(struct kvm_vm *vm)
+{
+	uint64_t guest_data;
+
+	einj_gpa = addr_gva2gpa(vm, EINJ_GVA);
+	einj_hva = addr_gva2hva(vm, EINJ_GVA);
+
+	/* Populate certain data before injecting UER.
*/ + *(uint64_t *)einj_hva =3D 0xBAADCAFE; + guest_data =3D *(uint64_t *)einj_hva; + ksft_print_msg("Before EINJect: data=3D%#lx\n", + guest_data); + + einj_hpa =3D translate_to_host_paddr((unsigned long)einj_hva); + + ksft_print_msg("EINJ_GVA=3D%#lx, einj_gpa=3D%#lx, einj_hva=3D%p, einj_hpa= =3D%#lx\n", + EINJ_GVA, einj_gpa, einj_hva, einj_hpa); + + inject_uer(einj_hpa); + ksft_print_msg("Memory UER EINJected\n"); +} + +int main(int argc, char *argv[]) +{ + struct kvm_vm *vm; + struct kvm_vcpu *vcpu; + + TEST_REQUIRE(kvm_has_cap(KVM_CAP_ARM_SEA_TO_USER)); + + setup_sigbus_handler(); + + vm =3D vm_create_with_sea_handler(&vcpu); + + vm_enable_cap(vm, KVM_CAP_ARM_SEA_TO_USER, 0); + + vm_inject_memory_uer(vm); + + run_vm(vm, vcpu); + + kvm_vm_free(vm); + + return 0; +} diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/sel= ftests/kvm/lib/kvm_util.c index 815bc45dd8dc6..bc9fcf6c3295a 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -2021,6 +2021,7 @@ static struct exit_reason { KVM_EXIT_STRING(NOTIFY), KVM_EXIT_STRING(LOONGARCH_IOCSR), KVM_EXIT_STRING(MEMORY_FAULT), + KVM_EXIT_STRING(ARM_SEA), }; =20 /* --=20 2.49.0.967.g6a0df3ecc3-goog From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pl1-f201.google.com (mail-pl1-f201.google.com [209.85.214.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3913E26C397 for ; Mon, 5 May 2025 16:14:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.214.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461664; cv=none; b=X+UPQL6NNatqKpG5VwIlv5xnFTO37ngZ3jTuaGgBqIGs04lWtEPeNqOEuLEaJp2Cx7hcDRFmX+CT0LzPD27loCYHpHy9d4b1UMH1qK51m3xAagDBcFaSwS2noxXWUgEWxhuMIt2VL5nxXf3YWeFsck3PGZ2xhCzqwlFCxDDkHTc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; 
s=arc-20240116; t=1746461664; c=relaxed/simple; bh=0xPFNmQ2aYdSjFcVuUL7sWXmakZUXXwCI8TdRZJDz9g=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=W7VTH/g+cZN5+J4XkmU2Tv5X47ac/xV+BxEDpw6ih4jQw8Sb752ebfvWGaqM/bU3jyhjqpXaskbbCVB8ewGn+thf8ZA2xn3XkjsGMUZShEiR2UcRMtP8Zgjuc4/eQhZeqSfsJ4qc2KKIq43RpQJWcFKAQm9tbrc65WnBw5ajRDY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=Cc/uRD0N; arc=none smtp.client-ip=209.85.214.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="Cc/uRD0N" Received: by mail-pl1-f201.google.com with SMTP id d9443c01a7336-22c31b55ac6so65506655ad.0 for ; Mon, 05 May 2025 09:14:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461663; x=1747066463; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qoM9DcHdqZBj+rk9l05sHHopuoPZxH677juEX9RJpMg=; b=Cc/uRD0NkHMrvPWyT0Y0tQDFxYje4u+lAlj51UllWvP+vb8F75TLhDZfMk/s9tqp7e By63/13gFQ5NcWX7qRtzSQjeih5tCGR1c4iv2BjHSclUscQqkiqAuSwKsxd0IHNDsbN4 lJ9rEWZRh/3i2Z8c49kE1kkVvhByLqyf3sx4WUcRy4ipDV13h2nIem54t3B5v+pL2lTJ YETibOULRG4lMzgU1+AZWkqKJztkgRlfKGqXeceMHAucShGOXji7IaI+ZjWesRpq+mRV 9m0qNeK9j2gO5EmLahmYFSfuVz6COBeV5t8a+2SDHuUFsRcf/qxHnR97MySv8E7pPpl7 59vw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461663; x=1747066463; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to 
:date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qoM9DcHdqZBj+rk9l05sHHopuoPZxH677juEX9RJpMg=; b=FMZkj6Cxou6lrUrS7rRwpfh4uWP10SvVey51Kw+sOLQ01/SFwS14KLBlTo9kEzhof3 4xGpYWx3q/IkG+h8bofiu/0bTmzgZ4wdTa4+VNU2m00iEVhlv1QtrAZPhY81eq5S5mnr TLry40ucPiqywECPkF9UF1x4tSdkkK5PK4AsJGB7HR7BaxfNc3Mf3fcbeRnbYnGt4gKb hV18wqgruaAvqAI3GHGesdv05ZP+WkBffc8u9UDyO4ApqP8Vu8u+d52cR2LDHZlMi5DR 9lpLkaF9aCPiDn8Yrmifg1GQ8S1tO9fmJ81RsudxfJZWZR635TcPbq4U+SeRjhxLgvz3 OD+Q== X-Forwarded-Encrypted: i=1; AJvYcCU2gNExU3jhLzOa9OsvtOj4Lz13ARAT8eByuYtp05AMBzcg9vifaT9iKS+Zd8VJo+3O0LHt+lCM/DP/8/s=@vger.kernel.org X-Gm-Message-State: AOJu0Yyv9krlARzXsS6Y+eWbwU3eutUTl76zWIOQEeBMlw8+1YJcxgae /NsmFtgmC0v4gUM4DJ+CMSUVyx6Z3P7L95fz/JTlWsr4lPaACj6A1k6zAQzVTPjjCo/Iy/vfaqt lviQDrQYSnQ== X-Google-Smtp-Source: AGHT+IFZMxB5zKbz0DCB224XcS28TLNRLHuTk3G5CmkpWtkGxQ7uQKT12C81gz2CRUWIcDMNDv5x3jsR5vZQHg== X-Received: from pfbdh13.prod.google.com ([2002:a05:6a00:478d:b0:739:485f:c33e]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a17:903:2383:b0:21f:52e:939e with SMTP id d9443c01a7336-22e18bc4033mr159804055ad.28.1746461662677; Mon, 05 May 2025 09:14:22 -0700 (PDT) Date: Mon, 5 May 2025 16:14:11 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-6-jiaqiyan@google.com> Subject: [PATCH v1 5/6] KVM: selftests: Test for KVM_CAP_INJECT_EXT_IABT From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, 
linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan
Content-Transfer-Encoding: quoted-printable
Content-Type: text/plain; charset="utf-8"

Test that userspace can use KVM_SET_VCPU_EVENTS to inject an external
instruction abort into the guest.

The test injects the instruction abort at an arbitrary time, without a
real SEA happening in the guest VCPU, so only the ESR_EL1 value can be
checked; no particular FAR_EL1 value can be expected.

Signed-off-by: Jiaqi Yan
---
 tools/arch/arm64/include/uapi/asm/kvm.h       |   3 +-
 tools/testing/selftests/kvm/Makefile.kvm      |   1 +
 .../testing/selftests/kvm/arm64/inject_iabt.c | 101 ++++++++++++++++++
 3 files changed, 104 insertions(+), 1 deletion(-)
 create mode 100644 tools/testing/selftests/kvm/arm64/inject_iabt.c

diff --git a/tools/arch/arm64/include/uapi/asm/kvm.h b/tools/arch/arm64/include/uapi/asm/kvm.h
index af9d9acaf9975..d3a4530846311 100644
--- a/tools/arch/arm64/include/uapi/asm/kvm.h
+++ b/tools/arch/arm64/include/uapi/asm/kvm.h
@@ -184,8 +184,9 @@ struct kvm_vcpu_events {
 		__u8 serror_pending;
 		__u8 serror_has_esr;
 		__u8 ext_dabt_pending;
+		__u8 ext_iabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[5];
+		__u8 pad[4];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
diff --git a/tools/testing/selftests/kvm/Makefile.kvm b/tools/testing/selftests/kvm/Makefile.kvm
index 16d2e9f32619f..708fd126a36dd 100644
--- a/tools/testing/selftests/kvm/Makefile.kvm
+++ b/tools/testing/selftests/kvm/Makefile.kvm
@@ -148,6 +148,7 @@ TEST_GEN_PROGS_arm64 += arm64/aarch32_id_regs
 TEST_GEN_PROGS_arm64 += arm64/arch_timer_edge_cases
 TEST_GEN_PROGS_arm64 += arm64/debug-exceptions
 TEST_GEN_PROGS_arm64 += arm64/hypercalls
+TEST_GEN_PROGS_arm64 += arm64/inject_iabt
 TEST_GEN_PROGS_arm64 += arm64/mmio_abort
 TEST_GEN_PROGS_arm64 += arm64/page_fault_test
 TEST_GEN_PROGS_arm64 += arm64/psci_test
diff --git a/tools/testing/selftests/kvm/arm64/inject_iabt.c b/tools/testing/selftests/kvm/arm64/inject_iabt.c
new file mode 100644
index 0000000000000..43b701e9143c2
--- /dev/null
+++ b/tools/testing/selftests/kvm/arm64/inject_iabt.c
@@ -0,0 +1,101 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * inject_iabt.c - Tests for injecting instruction aborts into guest.
+ */
+
+#include "processor.h"
+#include "test_util.h"
+
+static void expect_iabt_handler(struct ex_regs *regs)
+{
+	u64 esr = read_sysreg(esr_el1);
+
+	GUEST_PRINTF("Guest SEA esr_el1=%#lx\n", esr);
+	GUEST_ASSERT_EQ(ESR_ELx_EC(esr), ESR_ELx_EC_IABT_CUR);
+	GUEST_ASSERT_EQ(esr & ESR_ELx_FSC_TYPE, ESR_ELx_FSC_EXTABT);
+	/*
+	 * We inject IABT but there is no SEA in guest at all,
+	 * so guest should see FnV == 1, which is set by KVM.
+	 */
+	GUEST_ASSERT(esr & ESR_ELx_FnV);
+
+	GUEST_DONE();
+}
+
+static void guest_code(void)
+{
+	GUEST_FAIL("Guest should only run SEA handler");
+}
+
+static void vcpu_run_expect_done(struct kvm_vcpu *vcpu)
+{
+	struct ucall uc;
+	bool guest_done = false;
+
+	do {
+		vcpu_run(vcpu);
+		switch (get_ucall(vcpu, &uc)) {
+		case UCALL_ABORT:
+			REPORT_GUEST_ASSERT(uc);
+			break;
+		case UCALL_PRINTF:
+			ksft_print_msg("From guest: %s", uc.buffer);
+			break;
+		case UCALL_DONE:
+			ksft_print_msg("Guest done gracefully!\n");
+			guest_done = true;
+			break;
+		default:
+			TEST_FAIL("Unexpected ucall: %lu", uc.cmd);
+		}
+	} while (!guest_done);
+}
+
+static void vcpu_inject_ext_iabt(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_events events = {};
+
+	events.exception.ext_iabt_pending = true;
+	vcpu_events_set(vcpu, &events);
+}
+
+static void vcpu_inject_invalid_abt(struct kvm_vcpu *vcpu)
+{
+	struct kvm_vcpu_events events = {};
+	int r;
+
+	events.exception.ext_iabt_pending = true;
+	events.exception.ext_dabt_pending = true;
+
+	ksft_print_msg("Injecting invalid external abort events\n");
+	r = __vcpu_ioctl(vcpu, KVM_SET_VCPU_EVENTS, &events);
+	TEST_ASSERT(r && errno == EINVAL,
+		    KVM_IOCTL_ERROR(KVM_SET_VCPU_EVENTS, r));
+}
+
+static void test_inject_iabt(void)
+{
+	struct kvm_vcpu *vcpu;
+
struct kvm_vm *vm; + + vm =3D vm_create_with_one_vcpu(&vcpu, guest_code); + + vm_init_descriptor_tables(vm); + vcpu_init_descriptor_tables(vcpu); + + vm_install_sync_handler(vm, VECTOR_SYNC_CURRENT, + ESR_ELx_EC_IABT_CUR, expect_iabt_handler); + + vcpu_inject_invalid_abt(vcpu); + + vcpu_inject_ext_iabt(vcpu); + vcpu_run_expect_done(vcpu); + + kvm_vm_free(vm); +} + +int main(void) +{ + test_inject_iabt(); + return 0; +} --=20 2.49.0.967.g6a0df3ecc3-goog From nobody Sun Feb 8 07:21:30 2026 Received: from mail-pf1-f201.google.com (mail-pf1-f201.google.com [209.85.210.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1816826D4C8 for ; Mon, 5 May 2025 16:14:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.210.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461667; cv=none; b=NXyX3pgn9YK4T04v5m+LwzduF3frh71cX/b40g98AN9xWpKv37Iue3cxeBbFfIQ6thVuiVlOc4M/BdSwHvbFCts65q1cpHmnJdifjJ6t6fNFwjkCsbDAFT5SAmoJWC7fbdZsUqEzuEjNaLQ5tEecoZiLXldn+zrFRT+rKm1+VKI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1746461667; c=relaxed/simple; bh=fyGzyoomoTYo9P/p+ApF12DGGrn8h/VhXiAUSg88UfE=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=sQq+8TlDD3HhetDhiyK7g3ckH3+A6g5SzzJ4jrjUCM3Tk4xzuXZdZZMOGaruO7/fe3KLlgQvAAD4QyZMSGqcfNkl8LS/5e5YvAgf74Qbte4hZVA6smSEU3tJJnjjQgg9CW9s3KY1wcvi6d1KWRY2+Z/UTsj09Wa3X+lGcq0QqCM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=JjLX4CbE; arc=none smtp.client-ip=209.85.210.201 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: 
smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jiaqiyan.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="JjLX4CbE" Received: by mail-pf1-f201.google.com with SMTP id d2e1a72fcca58-736cd36189bso6538252b3a.2 for ; Mon, 05 May 2025 09:14:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1746461664; x=1747066464; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=qCRTPaLL/ddSZCO1i6dVdhSDdFDF5EjuW/HmAYg1Ry0=; b=JjLX4CbEoI+2S1VRPgO5cgh/LAQ5gedaYVo9IK9XWCblD8R069jq+QHSdrX9ijz9Na x5onDG5LYYy2sjhiu/VwW59NpFhwJ08fm82kGvNJAmx5x+dFABnWuk77ggmHZQhx39Lh rtdk9X5PD+H92zftPKpwBRzxKQMQ+cQ/1KllcM/zbSL7zy7nMuuAX2uEqjL5jmUq3Q66 rp9x983JaJX1PSoMe9RlMvRMfgPIA/vU+DJxwdSbOh8rjuM9Saei/fSXrTXj84kW/Cz2 Xv+E+kjORA+R6knHiZ+W8ycm5lp30v6pRW5Bzf1RmZMRzkR0bsKS7jPD2KudT05bFDf7 cS5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1746461664; x=1747066464; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=qCRTPaLL/ddSZCO1i6dVdhSDdFDF5EjuW/HmAYg1Ry0=; b=wc1yTZl2FHJlIAoEeozeWUJQjsW8e+VJuRIZ/3kZoMYMflD7F/A9sdrhn4srmHCXbG v3O3V13FPmFV+gIfLNm7ueh8XiNsqlTKV0+GmphafIptXtDyFkl7392bRJriVJ6dDuFI pMNTpsy6KMSpQNvilOJmMylhV+e1zj8A1SwM2uygXnDgblRvsSVWqWNFUxTLxPbMhwgG 3YA6G90fgyweXv4OXMqFJv6SRLlaOxZYnH794ZSU2qo/iIA7V/dePpOGIFtwYgtOG7wE ew6d73zSKkkiAMGGFfs8Q+M1nK/VGDq5JbmIlTDRljPMSxjpHObsOHupufAxTxyiLtL9 2VIQ== X-Forwarded-Encrypted: i=1; AJvYcCU+EhApj/oGpKo/PtKWBH1FCpvo7uocvpjNl2fgqJm2833TCw5QsGg9bhIgloqnrl2HX9p16E6WQXE9+Yg=@vger.kernel.org X-Gm-Message-State: AOJu0Yz19ZfTqzAkm9rCeQapZ97w72fYffjo6yTHWP+097y6qThsL66s EY3t2k4DxSVtG+Qn3CQHQAExL/Y2BcLQW9FZ+Ev9DU4LKO4kVh9agKlK4tteDOIVzXnmm5rC9c0 1mtrxUfEdjg== X-Google-Smtp-Source: 
AGHT+IE1SVeU3n4ai6CrxrMJm0u9dBQoZvk8Si1b7xa5PwydPXT87LL3o3Oqy2HVQLl3IBAbqfHn31O9UmD3SA== X-Received: from pfbki23.prod.google.com ([2002:a05:6a00:9497:b0:73d:b1c4:5d7f]) (user=jiaqiyan job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a00:2e9d:b0:740:6f7f:7645 with SMTP id d2e1a72fcca58-7406f7f7a1amr12841654b3a.8.1746461664178; Mon, 05 May 2025 09:14:24 -0700 (PDT) Date: Mon, 5 May 2025 16:14:12 +0000 In-Reply-To: <20250505161412.1926643-1-jiaqiyan@google.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: <20250505161412.1926643-1-jiaqiyan@google.com> X-Mailer: git-send-email 2.49.0.967.g6a0df3ecc3-goog Message-ID: <20250505161412.1926643-7-jiaqiyan@google.com> Subject: [PATCH v1 6/6] Documentation: kvm: new uAPI for handling SEA From: Jiaqi Yan To: maz@kernel.org, oliver.upton@linux.dev Cc: joey.gouly@arm.com, suzuki.poulose@arm.com, yuzenghui@huawei.com, catalin.marinas@arm.com, will@kernel.org, pbonzini@redhat.com, corbet@lwn.net, shuah@kernel.org, kvm@vger.kernel.org, kvmarm@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org, duenwen@google.com, rananta@google.com, jthoughton@google.com, Jiaqi Yan Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset="utf-8" Document the new userspace-visible features and APIs for handling synchronous external abort (SEA) - KVM_CAP_ARM_SEA_TO_USER: How userspace enables the new feature. - KVM_EXIT_ARM_SEA: When userspace needs to handle SEA and what userspace gets while taking the SEA. - KVM_CAP_ARM_INJECT_EXT_(D|I)ABT: How userspace injects SEA to guest while taking the SEA. 
Signed-off-by: Jiaqi Yan
---
 Documentation/virt/kvm/api.rst | 120 +++++++++++++++++++++++++++++----
 1 file changed, 107 insertions(+), 13 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index 47c7c3f92314e..fa91a123e1b88 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -1236,8 +1236,9 @@ directly to the virtual CPU).
 		__u8 serror_pending;
 		__u8 serror_has_esr;
 		__u8 ext_dabt_pending;
+		__u8 ext_iabt_pending;
 		/* Align it to 8 bytes */
-		__u8 pad[5];
+		__u8 pad[4];
 		__u64 serror_esr;
 	} exception;
 	__u32 reserved[12];
@@ -1292,20 +1293,52 @@ ARM64:
 
 User space may need to inject several types of events to the guest.
 
+Inject SError
+~~~~~~~~~~~~~
+
 Set the pending SError exception state for this VCPU. It is not possible to
 'cancel' an Serror that has been made pending.
 
-If the guest performed an access to I/O memory which could not be handled by
-userspace, for example because of missing instruction syndrome decode
-information or because there is no device mapped at the accessed IPA, then
-userspace can ask the kernel to inject an external abort using the address
-from the exiting fault on the VCPU. It is a programming error to set
-ext_dabt_pending after an exit which was not either KVM_EXIT_MMIO or
-KVM_EXIT_ARM_NISV. This feature is only available if the system supports
-KVM_CAP_ARM_INJECT_EXT_DABT. This is a helper which provides commonality in
-how userspace reports accesses for the above cases to guests, across different
-userspace implementations. Nevertheless, userspace can still emulate all Arm
-exceptions by manipulating individual registers using the KVM_SET_ONE_REG API.
+Inject SEA (synchronous external abort)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+
+- If the guest performed an access to I/O memory which could not be handled by
+  userspace, for example because of missing instruction syndrome decode
+  information or because there is no device mapped at the accessed IPA.
+
+- If the guest consumed an uncorrected memory error and the RAS extension in
+  the Trusted Firmware chose to notify the PE with a SEA, KVM has to handle it
+  when the host APEI is unable to claim the SEA. For the following types of
+  faults, if userspace enabled KVM_CAP_ARM_SEA_TO_USER, KVM returns to
+  userspace with KVM_EXIT_ARM_SEA:
+
+  - Synchronous external abort, not on translation table walk or hardware
+    update of translation table.
+
+  - Synchronous external abort on translation table walk or hardware update of
+    translation table, including all levels.
+
+  - Synchronous parity or ECC error on memory access, not on translation table
+    walk.
+
+  - Synchronous parity or ECC error on memory access on translation table walk
+    or hardware update of translation table, including all levels.
+
+For the cases above, userspace can ask the kernel to replay either an external
+data abort (by setting ext_dabt_pending) or an external instruction abort
+(by setting ext_iabt_pending) into the faulting VCPU. KVM will use the address
+from the exiting fault on the VCPU. Setting both ext_dabt_pending and
+ext_iabt_pending at the same time will return -EINVAL.
+
+It is a programming error to set ext_dabt_pending or ext_iabt_pending after an
+exit which was not KVM_EXIT_MMIO, KVM_EXIT_ARM_NISV or KVM_EXIT_ARM_SEA.
+Injecting SEA for data and instruction abort is only available if KVM supports
+KVM_CAP_ARM_INJECT_EXT_DABT and KVM_CAP_ARM_INJECT_EXT_IABT respectively.
+
+This is a helper which provides commonality in how userspace reports accesses
+for the above cases to guests, across different userspace implementations.
+Nevertheless, userspace can still emulate all Arm exceptions by manipulating
+individual registers using the KVM_SET_ONE_REG API.
 
 See KVM_GET_VCPU_EVENTS for the data structure.
 
@@ -7151,6 +7184,55 @@ The valid value for 'flags' is:
   - KVM_NOTIFY_CONTEXT_INVALID -- the VM context is corrupted and not valid
     in VMCS. It would run into unknown result if resume the target VM.
 
+::
+
+		/* KVM_EXIT_ARM_SEA */
+		struct {
+			__u64 esr;
+  #define KVM_EXIT_ARM_SEA_FLAG_GVA_VALID	(1ULL << 0)
+  #define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID	(1ULL << 1)
+			__u64 flags;
+			__u64 gva;
+			__u64 gpa;
+		} arm_sea;
+
+Used on arm64 systems. When the VM capability KVM_CAP_ARM_SEA_TO_USER is
+enabled, a VM exit is generated if the guest caused a synchronous external
+abort (SEA) and the host APEI fails to handle the SEA.
+
+Historically KVM handles SEA by first delegating the SEA to host APEI, as there
+is a high chance that the SEA is caused by consuming an uncorrected memory
+error. However, not all platforms support SEA handling in APEI, and KVM's
+fallback handling is to inject an async SError into the guest, which usually
+panics the guest kernel. As an alternative, userspace can participate in the
+SEA handling by enabling KVM_CAP_ARM_SEA_TO_USER at VM creation, after
+querying the capability. Once enabled, when KVM has to handle a SEA caused
+by the guest, it returns to userspace with KVM_EXIT_ARM_SEA, with details
+about the SEA available in 'arm_sea'.
+
+The 'esr' field holds the value of the exception syndrome register (ESR) when
+KVM took the SEA, which tells userspace the character of the current SEA,
+such as its Exception Class, Synchronous Error Type, Fault Status Code and
+so on. For more details on ESR, check the Arm Architecture Registers
+documentation.
+
+The 'flags' field indicates whether the faulting addresses are available when
+taking the SEA:
+
+  - KVM_EXIT_ARM_SEA_FLAG_GVA_VALID -- the faulting guest virtual address
+    is valid and userspace can get its value in the 'gva' field.
+  - KVM_EXIT_ARM_SEA_FLAG_GPA_VALID -- the faulting guest physical address
+    is valid and userspace can get its value in the 'gpa' field.
+
+Userspace needs to handle the guest SEA synchronously, namely in
+the same thread that runs KVM_RUN and receives KVM_EXIT_ARM_SEA. One of the
+encouraged approaches is to use KVM_SET_VCPU_EVENTS to inject the SEA
+into the faulting VCPU. This way, the guest has the opportunity to keep running
+and limit the blast radius of the SEA to the particular guest application that
+caused the SEA. If the Exception Class indicated by 'esr' in 'arm_sea'
+is data abort, userspace should inject a data abort. If the Exception Class is
+instruction abort, userspace should inject an instruction abort.
+
 ::
 
     /* Fix the size of the union. */
@@ -8478,7 +8560,7 @@ ENOSYS for the others.
 When enabled, KVM will exit to userspace with KVM_EXIT_SYSTEM_EVENT of
 type KVM_SYSTEM_EVENT_SUSPEND to process the guest suspend request.
 
-7.37 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
+7.42 KVM_CAP_ARM_WRITABLE_IMP_ID_REGS
 -------------------------------------
 
 :Architectures: arm64
@@ -8496,6 +8578,18 @@ aforementioned registers before the first KVM_RUN. These registers are
 VM scoped, meaning that the same set of values are presented on all vCPUs in a
 given VM.
 
+7.43 KVM_CAP_ARM_SEA_TO_USER
+----------------------------
+
+:Architectures: arm64
+:Target: VM
+:Parameters: none
+:Returns: 0 on success, -EINVAL if unsupported.
+
+This capability, if KVM_CHECK_EXTENSION indicates that it is available, means
+that KVM has an implementation that allows userspace to participate in handling
+synchronous external aborts caused by the VM, via a KVM_EXIT_ARM_SEA exit.
+
 8. Other capabilities.
 ======================
 
-- 
2.49.0.967.g6a0df3ecc3-goog
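
For illustration only, and not part of the patch: the documentation above says userspace should inject a data abort or an instruction abort depending on the Exception Class in 'esr'. A minimal sketch of that decision follows. It assumes 'esr' uses the ESR_EL2 layout (Exception Class in bits [31:26], 0x24 for a data abort and 0x20 for an instruction abort from a lower Exception level). The names `struct sea_exit`, `enum sea_inject`, `sea_pick_injection` and `sea_gpa_valid` are hypothetical stand-ins so the snippet compiles without the patched uapi headers; only the two flag values are copied from the 'arm_sea' layout documented above.

```c
#include <stdint.h>

/* Exception Class field of the ESR: bits [31:26]. */
#define ESR_EC_SHIFT     26
#define ESR_EC_MASK      0x3fULL
#define ESR_EC_IABT_LOW  0x20ULL  /* instruction abort from a lower EL */
#define ESR_EC_DABT_LOW  0x24ULL  /* data abort from a lower EL */

/* Flag values as documented for 'arm_sea' in this patch. */
#define KVM_EXIT_ARM_SEA_FLAG_GVA_VALID (1ULL << 0)
#define KVM_EXIT_ARM_SEA_FLAG_GPA_VALID (1ULL << 1)

/* Hypothetical local mirror of the 'arm_sea' exit payload. */
struct sea_exit {
    uint64_t esr;
    uint64_t flags;
    uint64_t gva;
    uint64_t gpa;
};

/* Hypothetical result type: which abort the VMM should inject, if any. */
enum sea_inject {
    SEA_INJECT_NONE,
    SEA_INJECT_DABT,  /* would set ext_dabt_pending */
    SEA_INJECT_IABT,  /* would set ext_iabt_pending */
};

/* Choose the abort type from the Exception Class reported in 'esr'. */
static enum sea_inject sea_pick_injection(const struct sea_exit *sea)
{
    switch ((sea->esr >> ESR_EC_SHIFT) & ESR_EC_MASK) {
    case ESR_EC_DABT_LOW:
        return SEA_INJECT_DABT;
    case ESR_EC_IABT_LOW:
        return SEA_INJECT_IABT;
    default:
        /* Unexpected class: leave the policy decision to the caller. */
        return SEA_INJECT_NONE;
    }
}

/* Nonzero only if KVM marked the 'gpa' field as valid. */
static int sea_gpa_valid(const struct sea_exit *sea)
{
    return (sea->flags & KVM_EXIT_ARM_SEA_FLAG_GPA_VALID) != 0;
}
```

A VMM would call this from the thread that ran KVM_RUN, then set the matching ext_dabt_pending or ext_iabt_pending bit (never both) in the events passed to KVM_SET_VCPU_EVENTS. What to do for SEA_INJECT_NONE is a policy choice this sketch leaves open.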