From nobody Sat Feb 7 05:56:52 2026 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass(p=none dis=none) header.from=samsung.com ARC-Seal: i=1; a=rsa-sha256; t=1766498600; cv=none; d=zohomail.com; s=zohoarc; b=esU9/CtdAfGb+Iz6f2hQXO/w8DGAL0MVOcucskkUldILfJSENz+1Sa3272R0fzFEbT5Nz+V3A5x6V885m/2ag/7AOgoKm5VSvSBZmpHk4S8/ZUqGsXVlkCLUiIFX5VtaTDa4ZqmKA/ZN5O/WP0S6OOwpQPnOELDRmhR+WIrN9hA= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1766498600; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=+jIeGwIatjkyU4t7V93SrU6AI+YxYI+n5AFl/dB/Tsg=; b=npHVcGzHyAwh0dNBoRdu5KL77gEMccKTW32BhbfdT99mTEZDtqky7eVX12vVZSc8Z1fkNb2Wya9tpzCZfZl/hNsqMOzZHavlqEg0bKiB3Wjs3IPColNFbsl5FMbSFGdfX7n00J28R3jUjuETc/I120rnbm/Y2o24DmSlObWLs7M= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=pass header.from= (p=none dis=none) Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1766498600442946.4648974049452; Tue, 23 Dec 2025 06:03:20 -0800 (PST) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1vY2xt-0005G2-JQ; Tue, 23 Dec 2025 09:02:31 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vY16Q-0005L7-6x for qemu-devel@nongnu.org; Tue, 23 Dec 2025 07:02:58 -0500 Received: from mailout2.samsung.com ([203.254.224.25]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1vY16L-00028g-NV for qemu-devel@nongnu.org; Tue, 23 Dec 2025 07:02:57 -0500 Received: from epcas5p2.samsung.com (unknown [182.195.41.40]) by mailout2.samsung.com (KnoxPortal) with ESMTP id 20251223120244epoutp024397366f2793b9c97be4368d7ed2e543~D1olVyIiv1773417734epoutp02W for ; Tue, 23 Dec 2025 12:02:44 +0000 (GMT) Received: from epsnrtp04.localdomain (unknown [182.195.42.156]) by epcas5p4.samsung.com (KnoxPortal) with ESMTPS id 20251223120243epcas5p4fac8c002c8f443f8a716510b0865da50~D1olDvkCi1846618466epcas5p4m; Tue, 23 Dec 2025 12:02:43 +0000 (GMT) Received: from epcas5p4.samsung.com (unknown [182.195.38.88]) by epsnrtp04.localdomain (Postfix) with ESMTP id 4dbDCk6m0Lz6B9m5; Tue, 23 Dec 2025 12:02:42 +0000 (GMT) Received: from epsmtip1.samsung.com (unknown [182.195.34.30]) by epcas5p4.samsung.com (KnoxPortal) with ESMTPA id 20251223120242epcas5p44b454df5afd0d517a4562a545ad11218~D1ojhtPZe1846618466epcas5p4i; Tue, 23 Dec 2025 12:02:42 +0000 (GMT) Received: from test-System-Product-Name.samsungds.net (unknown [107.99.41.85]) by epsmtip1.samsung.com (KnoxPortal) with ESMTPA id 20251223120240epsmtip12d23c4881f6573367aa14d8cdc9b0a04~D1ohk4nw03164831648epsmtip19; Tue, 23 Dec 2025 12:02:39 +0000 (GMT) DKIM-Filter: OpenDKIM Filter v2.11.0 mailout2.samsung.com 20251223120244epoutp024397366f2793b9c97be4368d7ed2e543~D1olVyIiv1773417734epoutp02W DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=samsung.com; s=mail20170921; t=1766491364; bh=+jIeGwIatjkyU4t7V93SrU6AI+YxYI+n5AFl/dB/Tsg=; h=From:To:Cc:Subject:Date:References:From; b=dZSgEBbkRnz+wANoz2IL0vTpkCURglkL0mYCZvqeogQ++sM6Qt/JOADVJeE/3c3gs aL2QQ/kwm8Vuv9kXLsMxQ2S4NPnmy9dCzfI/hPI3891B42U9VJYCplzIogfgFIrIuZ woy7bxfcxPQk8KvuVYT7FZ996elca+eXdxkQlhVA= From: Ashish Anand To: qemu-devel@nongnu.org Cc: peter.maydell@linaro.org, qemu-arm@nongnu.org, y.kaushal@samsung.com, vishwa.mg@samsung.com, ashish.anand202@gmail.com, "ashish.a6" Subject: [PATCH] target/arm: Implement WFE, SEV and SEVONPEND for Cortex-M Date: Tue, 23 Dec 2025 17:30:12 +0530 Message-ID: <20251223120012.541777-1-ashish.a6@samsung.com> X-Mailer: git-send-email 2.43.0 MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable X-CMS-MailID: 20251223120242epcas5p44b454df5afd0d517a4562a545ad11218 X-Msg-Generator: CA Content-Type: text/plain; charset="utf-8" CMS-TYPE: 105P cpgsPolicy: CPGSC10-542,Y X-CFilter-Loop: Reflected X-CMS-RootMailID: 20251223120242epcas5p44b454df5afd0d517a4562a545ad11218 References: Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=203.254.224.25; envelope-from=ashish.a6@samsung.com; helo=mailout2.samsung.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_MSPIKE_H5=0.001, RCVD_IN_MSPIKE_WL=0.001, RCVD_IN_VALIDITY_RPBL_BLOCKED=0.001, RCVD_IN_VALIDITY_SAFE_BLOCKED=0.001, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-Mailman-Approved-At: Tue, 23 Dec 2025 09:01:09 -0500 X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @samsung.com) X-ZM-MESSAGEID: 1766498603724158500 From: "ashish.a6" Currently, QEMU implements the 'Wait For Event' (WFE) instruction as a a simple yield. This causes high host CPU usage because Guest RTOS idle loops effectively become busy-wait loops. To improve efficiency, this patch transitions WFE to use the architectu= ral 'Halt' state (EXCP_HLT) for M-profile CPUs. This allows the host thread to sleep when the guest is idle. To support this transition, we implement the full architectural behavior required for WFE, specifically the 'Event Register' and 'SEVONPEND' log= ic defined in the ARMv7-M specification. This ensures that the CPU correct= ly identifies wakeup conditions as defined by the architecture. Changes include: 1. target/arm/cpu.h: Added 'event_register' to the v7m state struct. 2. target/arm/cpu.c: Ensured event_register is cleared on reset and updated arm_cpu_has_work() to check the event register state. 3. target/arm/machine.c: Added VMState subsection for migration compat= ibility. 4. hw/intc/armv7m_nvic.h: Added was_pending flag, to check for SEVONPE= ND logic. 5. hw/intc/armv7m_nvic.c: Implemented SEVONPEND logic in : nvic_recompute_state() and nvic_recompute_state_secure(). This sets the event register and kicks the CPU when a new interrupt becomes pending. 6. target/arm/tcg/helper.h: Declare the new helper_sev function. 7. target/arm/tcg/op_helper.c: - Update HELPER(wfe) to use EXCP_HLT for M-profile CPUs. - Implement HELPER(sev) to set the event register and kick all vCPU= s. 8. target/arm/tcg/t16.decode / t32.decode: Enable decoding of the SEV instruction in Thumb/Thumb-2 mode. 9. target/arm/tcg/translate.c: Implement trans_SEV using inline TCG generation to ensure system-wide visibility. This patch enables resource-efficient idle emulation for Cortex-M. Signed-off-by: Ashish Anand ashish.a6@samsung.com --- Testing Notes: - Performance Test: Monitored host CPU usage while the guest was in an=20 idle loop. Usage dropped from 100% (busy-yield) to <5% (EXCP_HALT). - Functional Test: Verified SEVONPEND logic using a bare-metal test=20 case on Cortex-M. Confirmed that pending interrupts correctly=20 trigger a wakeup when SEVONPEND is enabled, as per Armv7-M spec. - Regression Test: 'make check' passes for all ARM targets. - Note: 4 x86-specific test cases (lpc-ich9, e1000e) failed, but=20 I confirmed these also fail on a clean master branch. hw/intc/armv7m_nvic.c | 22 ++++++++++++++++- include/hw/intc/armv7m_nvic.h | 1 + target/arm/cpu.c | 7 ++++++ target/arm/cpu.h | 1 + target/arm/machine.c | 19 +++++++++++++++ target/arm/tcg/helper.h | 1 + target/arm/tcg/op_helper.c | 45 +++++++++++++++++++++++++++++------ target/arm/tcg/t16.decode | 2 +- target/arm/tcg/t32.decode | 2 +- target/arm/tcg/translate.c | 20 ++++++++++++---- 10 files changed, 106 insertions(+), 14 deletions(-) diff --git a/hw/intc/armv7m_nvic.c b/hw/intc/armv7m_nvic.c index 7c78961040..a60990c71f 100644 --- a/hw/intc/armv7m_nvic.c +++ b/hw/intc/armv7m_nvic.c @@ -16,6 +16,7 @@ #include "migration/vmstate.h" #include "qemu/timer.h" #include "hw/intc/armv7m_nvic.h" +#include "hw/arm/armv7m.h" #include "hw/irq.h" #include "hw/qdev-properties.h" #include "system/tcg.h" @@ -232,6 +233,8 @@ static void nvic_recompute_state_secure(NVICState *s) int pend_irq =3D 0; bool pending_is_s_banked =3D false; int pend_subprio =3D 0; + ARMv7MState *armv7m =3D container_of(s, ARMv7MState, nvic); + ARMCPU *cpu =3D armv7m->cpu; =20 /* R_CQRV: precedence is by: * - lowest group priority; if both the same then @@ -259,6 +262,14 @@ static void nvic_recompute_state_secure(NVICState *s) targets_secure =3D !exc_is_banked(i) && exc_targets_secure= (s, i); } =20 + if (!vec->was_pending && vec->pending) { + if (cpu->env.v7m.scr[bank] & R_V7M_SCR_SEVONPEND_MASK) { + cpu->env.v7m.event_register =3D 1; + qemu_cpu_kick(CPU(cpu)); + } + } + vec->was_pending =3D vec->pending; + prio =3D exc_group_prio(s, vec->prio, targets_secure); subprio =3D vec->prio & ~nvic_gprio_mask(s, targets_secure); if (vec->enabled && vec->pending && @@ -293,6 +304,8 @@ static void nvic_recompute_state(NVICState *s) int pend_prio =3D NVIC_NOEXC_PRIO; int active_prio =3D NVIC_NOEXC_PRIO; int pend_irq =3D 0; + ARMv7MState *armv7m =3D container_of(s, ARMv7MState, nvic); + ARMCPU *cpu =3D armv7m->cpu; =20 /* In theory we could write one function that handled both * the "security extension present" and "not present"; however @@ -316,6 +329,13 @@ static void nvic_recompute_state(NVICState *s) if (vec->active && vec->prio < active_prio) { active_prio =3D vec->prio; } + if (!vec->was_pending && vec->pending) { + if (cpu->env.v7m.scr[M_REG_NS] & R_V7M_SCR_SEVONPEND_MASK) { + cpu->env.v7m.event_register =3D 1; + qemu_cpu_kick(CPU(cpu)); + } + } + vec->was_pending =3D vec->pending; } =20 if (active_prio > 0) { @@ -1657,7 +1677,7 @@ static void nvic_writel(NVICState *s, uint32_t offset= , uint32_t value, } /* We don't implement deep-sleep so these bits are RAZ/WI. * The other bits in the register are banked. - * QEMU's implementation ignores SEVONPEND and SLEEPONEXIT, which + * QEMU's implementation ignores SLEEPONEXIT, which * is architecturally permitted. */ value &=3D ~(R_V7M_SCR_SLEEPDEEP_MASK | R_V7M_SCR_SLEEPDEEPS_MASK); diff --git a/include/hw/intc/armv7m_nvic.h b/include/hw/intc/armv7m_nvic.h index 7b9964fe7e..305eaf6e6a 100644 --- a/include/hw/intc/armv7m_nvic.h +++ b/include/hw/intc/armv7m_nvic.h @@ -32,6 +32,7 @@ typedef struct VecInfo { uint8_t pending; uint8_t active; uint8_t level; /* exceptions <=3D15 never set level */ + bool was_pending; } VecInfo; =20 struct NVICState { diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 39292fb9bc..0a044f7254 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -143,6 +143,12 @@ static bool arm_cpu_has_work(CPUState *cs) { ARMCPU *cpu =3D ARM_CPU(cs); =20 + if (arm_feature(&cpu->env, ARM_FEATURE_M)) { + if (cpu->env.v7m.event_register) { + return true; + } + } + return (cpu->power_state !=3D PSCI_OFF) && cpu_test_interrupt(cs, CPU_INTERRUPT_FIQ | CPU_INTERRUPT_HARD @@ -480,6 +486,7 @@ static void arm_cpu_reset_hold(Object *obj, ResetType t= ype) ~(R_V7M_FPCCR_LSPEN_MASK | R_V7M_FPCCR_S_MASK); env->v7m.control[M_REG_S] |=3D R_V7M_CONTROL_FPCA_MASK; #endif + env->v7m.event_register =3D 0; } =20 /* M profile requires that reset clears the exclusive monitor; diff --git a/target/arm/cpu.h b/target/arm/cpu.h index 39f2b2e54d..44433a444c 100644 --- a/target/arm/cpu.h +++ b/target/arm/cpu.h @@ -639,6 +639,7 @@ typedef struct CPUArchState { uint32_t nsacr; uint32_t ltpsize; uint32_t vpr; + uint32_t event_register; } v7m; =20 /* Information associated with an exception about to be taken: diff --git a/target/arm/machine.c b/target/arm/machine.c index 0befdb0b28..acc79188f2 100644 --- a/target/arm/machine.c +++ b/target/arm/machine.c @@ -508,6 +508,24 @@ static const VMStateDescription vmstate_m_mve =3D { }, }; =20 +static bool m_event_needed(void *opaque) +{ + ARMCPU *cpu =3D opaque; + /* Only save the state if the event register is set (non-zero) */ + return cpu->env.v7m.event_register !=3D 0; +} + +static const VMStateDescription vmstate_m_event =3D { + .name =3D "cpu/m/event", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D m_event_needed, + .fields =3D (const VMStateField[]) { + VMSTATE_UINT32(env.v7m.event_register, ARMCPU), + VMSTATE_END_OF_LIST() + } +}; + static const VMStateDescription vmstate_m =3D { .name =3D "cpu/m", .version_id =3D 4, @@ -535,6 +553,7 @@ static const VMStateDescription vmstate_m =3D { &vmstate_m_v8m, &vmstate_m_fp, &vmstate_m_mve, + &vmstate_m_event, NULL } }; diff --git a/target/arm/tcg/helper.h b/target/arm/tcg/helper.h index 4636d1bc03..5a10a9fba3 100644 --- a/target/arm/tcg/helper.h +++ b/target/arm/tcg/helper.h @@ -60,6 +60,7 @@ DEF_HELPER_1(yield, void, env) DEF_HELPER_1(pre_hvc, void, env) DEF_HELPER_2(pre_smc, void, env, i32) DEF_HELPER_1(vesb, void, env) +DEF_HELPER_1(sev, void, env) =20 DEF_HELPER_3(cpsr_write, void, env, i32, i32) DEF_HELPER_2(cpsr_write_eret, void, env, i32) diff --git a/target/arm/tcg/op_helper.c b/target/arm/tcg/op_helper.c index 4fbd219555..ad79724778 100644 --- a/target/arm/tcg/op_helper.c +++ b/target/arm/tcg/op_helper.c @@ -469,16 +469,47 @@ void HELPER(wfit)(CPUARMState *env, uint64_t timeout) #endif } =20 +void HELPER(sev)(CPUARMState *env) +{ + CPUState *cs =3D env_cpu(env); + CPU_FOREACH(cs) { + ARMCPU *target_cpu =3D ARM_CPU(cs); + if (arm_feature(&target_cpu->env, ARM_FEATURE_M)) { + target_cpu->env.v7m.event_register =3D 1; + } + qemu_cpu_kick(cs); + } +} + void HELPER(wfe)(CPUARMState *env) { - /* This is a hint instruction that is semantically different - * from YIELD even though we currently implement it identically. - * Don't actually halt the CPU, just yield back to top - * level loop. This is not going into a "low power state" - * (ie halting until some event occurs), so we never take - * a configurable trap to a different exception level. + /* + * WFE (Wait For Event) is a hint instruction. + * For Cortex-M (M-profile), we implement the strict architectural beh= avior: + * 1. Check the Event Register (set by SEV or SEVONPEND). + * 2. If set, clear it and continue (consume the event). */ - HELPER(yield)(env); + if (arm_feature(env, ARM_FEATURE_M)) { + CPUState *cs =3D env_cpu(env); + + if (env->v7m.event_register) { + env->v7m.event_register =3D 0; + return; + } + + cs->exception_index =3D EXCP_HLT; + cs->halted =3D 1; + cpu_loop_exit(cs); + } else { + /* + * For A-profile and others, we rely on the existing "yield" behav= ior. + * Don't actually halt the CPU, just yield back to top + * level loop. This is not going into a "low power state" + * (ie halting until some event occurs), so we never take + * a configurable trap to a different exception level + */ + HELPER(yield)(env); + } } =20 void HELPER(yield)(CPUARMState *env) diff --git a/target/arm/tcg/t16.decode b/target/arm/tcg/t16.decode index 646c74929d..ac6e24ac14 100644 --- a/target/arm/tcg/t16.decode +++ b/target/arm/tcg/t16.decode @@ -229,7 +229,7 @@ REVSH 1011 1010 11 ... ... @rdm WFI 1011 1111 0011 0000 =20 # TODO: Implement SEV, SEVL; may help SMP performance. - # SEV 1011 1111 0100 0000 + SEV 1011 1111 0100 0000 # SEVL 1011 1111 0101 0000 =20 # The canonical nop has the second nibble as 0000, but the whole of the diff --git a/target/arm/tcg/t32.decode b/target/arm/tcg/t32.decode index d327178829..59b0edf63f 100644 --- a/target/arm/tcg/t32.decode +++ b/target/arm/tcg/t32.decode @@ -370,7 +370,7 @@ CLZ 1111 1010 1011 ---- 1111 .... 1000 ...= . @rdm WFI 1111 0011 1010 1111 1000 0000 0000 0011 =20 # TODO: Implement SEV, SEVL; may help SMP performance. - # SEV 1111 0011 1010 1111 1000 0000 0000 0100 + SEV 1111 0011 1010 1111 1000 0000 0000 0100 # SEVL 1111 0011 1010 1111 1000 0000 0000 0101 =20 ESB 1111 0011 1010 1111 1000 0000 0001 0000 diff --git a/target/arm/tcg/translate.c b/target/arm/tcg/translate.c index 63735d9789..bfe2691884 100644 --- a/target/arm/tcg/translate.c +++ b/target/arm/tcg/translate.c @@ -3241,14 +3241,25 @@ static bool trans_YIELD(DisasContext *s, arg_YIELD = *a) return true; } =20 +static bool trans_SEV(DisasContext *s, arg_SEV *a) +{ + gen_update_pc(s, curr_insn_len(s)); + gen_helper_sev(tcg_env); + tcg_gen_exit_tb(NULL, 0); + s->base.is_jmp =3D DISAS_NORETURN; + return true; +} + static bool trans_WFE(DisasContext *s, arg_WFE *a) { /* * When running single-threaded TCG code, use the helper to ensure that - * the next round-robin scheduled vCPU gets a crack. In MTTCG mode we - * just skip this instruction. Currently the SEV/SEVL instructions, - * which are *one* of many ways to wake the CPU from WFE, are not - * implemented so we can't sleep like WFI does. + * the next round-robin scheduled vCPU gets a crack. + * + * For Cortex-M, we implement the architectural WFE behavior (sleeping + * until an event occurs or the Event Register is set). + * For other profiles, we currently treat this as a NOP or yield, + * to preserve existing performance characteristics. */ if (!(tb_cflags(s->base.tb) & CF_PARALLEL)) { gen_update_pc(s, curr_insn_len(s)); @@ -6807,6 +6818,7 @@ static void arm_tr_tb_stop(DisasContextBase *dcbase, = CPUState *cpu) break; case DISAS_WFE: gen_helper_wfe(tcg_env); + tcg_gen_exit_tb(NULL, 0); break; case DISAS_YIELD: gen_helper_yield(tcg_env); --=20 2.43.0