From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
 rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com,
 linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn,
 lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 01/24] hw/core: Introduce administrative power-state property and its accessors
Date: Wed, 1 Oct 2025 01:01:04 +0000
Message-Id: <20251001010127.3092631-2-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

Some devices cannot be hot-unplugged, either because removal is not meaningful
(e.g. on-board devices) or not supported (e.g. certain PCIe devices). Others,
such as CPUs on architectures like ARM, lack native hotplug support but can
still have their availability controlled through host policy.

In all these cases, a mechanism is needed to track and control a device's
*administrative* power state, independent of its runtime operational state,
so QEMU can:

- Disable a device while keeping it described in firmware, ACPI, or other
  configuration.
- Prevent guest use until explicitly re-enabled.
- Coordinate transitions with platform-specific power handlers and migration
  logic.

This patch introduces the core qdev support for administrative power state
(the property, enum, and accessors) without yet wiring it into any device's
operational behaviour; the only device-specific change here is that the ARM
CPU class advertises support. Later patches in this series integrate it with
helper APIs (qdev_disable(), qdev_enable(), etc.) and specific device types
such as CPUs, completing the flow with platform-specific handlers.

Key additions:

- New enum DeviceAdminPowerState with ENABLED, DISABLED, and REMOVED states,
  defaulting to ENABLED.
- New DeviceClass flag admin_power_state_supported to advertise support for
  administrative transitions.
- New QOM property "admin_power_state" to query or set the state on supported
  devices.
- Internal accessors device_get_admin_power_state() and
  device_set_admin_power_state() to manage state changes, including safe
  handling when the device is not yet realized.

The enum models *policy* rather than electrical or functional power state, and
is distinct from runtime mechanisms (e.g. PSCI for ARM CPUs). The actual
operational state of a device is maintained by platform-specific or
device-specific code, which enforces runtime behaviour based on the
administrative setting. Every device starts administratively ENABLED by
default. A DISABLED device remains logically present but blocked from
operation; a REMOVED device is logically absent.

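For illustration only (this is not part of the patch), the gating behaviour
described above can be sketched in standalone C. The enum mirrors the one this
patch adds to qdev-core.h; ToyDevice and toy_set_admin_power_state() are
hypothetical stand-ins for DeviceState/DeviceClass and the QOM setter, and the
sketch assumes, as the setter in this patch does, that only ENABLED and
DISABLED can be requested through the property.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Same values as the enum introduced by this patch. */
typedef enum DeviceAdminPowerState {
    DEVICE_ADMIN_POWER_STATE_ENABLED = 0,
    DEVICE_ADMIN_POWER_STATE_DISABLED,
    DEVICE_ADMIN_POWER_STATE_REMOVED,
    DEVICE_ADMIN_POWER_STATE_MAX
} DeviceAdminPowerState;

/* Hypothetical toy type: folds the class flag and instance state together. */
typedef struct ToyDevice {
    const char *type_name;
    bool admin_power_state_supported;   /* per-class opt-in flag */
    DeviceAdminPowerState admin_power_state;
} ToyDevice;

/*
 * Hypothetical setter: returns 0 on success, -1 on failure (with a message
 * written to err). The state is left untouched on every failure path.
 */
static int toy_set_admin_power_state(ToyDevice *dev, int new_state,
                                     char *err, size_t errlen)
{
    if (!dev->admin_power_state_supported) {
        snprintf(err, errlen,
                 "Device '%s' admin power state change not supported",
                 dev->type_name);
        return -1;
    }

    switch (new_state) {
    case DEVICE_ADMIN_POWER_STATE_ENABLED:
    case DEVICE_ADMIN_POWER_STATE_DISABLED:
        /* Policy change only; operational power-on/off is handled elsewhere. */
        dev->admin_power_state = (DeviceAdminPowerState)new_state;
        return 0;
    default:
        /* REMOVED and out-of-range values cannot be requested this way. */
        snprintf(err, errlen, "Invalid admin power state %d for device '%s'",
                 new_state, dev->type_name);
        return -1;
    }
}
```

The point of the sketch is the ordering of the checks: class-level support is
tested first, and the stored state changes only for a value the setter accepts.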
Signed-off-by: Salil Mehta
---
 hw/core/qdev.c         | 62 ++++++++++++++++++++++++++++++++++++++++++
 include/hw/qdev-core.h | 54 ++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c       |  1 +
 3 files changed, 117 insertions(+)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index f600226176..8502d6216f 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -633,6 +633,53 @@ static bool device_get_hotplugged(Object *obj, Error **errp)
     return dev->hotplugged;
 }
 
+static int device_get_admin_power_state(Object *obj, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+
+    return dev->admin_power_state;
+}
+
+static void
+device_set_admin_power_state(Object *obj, int new_state, Error **errp)
+{
+    DeviceState *dev = DEVICE(obj);
+    DeviceClass *dc = DEVICE_GET_CLASS(dev);
+
+    if (!dc->admin_power_state_supported) {
+        error_setg(errp, "Device '%s' admin power state change not supported",
+                   object_get_typename(obj));
+        return;
+    }
+
+    switch (new_state) {
+    case DEVICE_ADMIN_POWER_STATE_DISABLED: {
+        /*
+         * TODO: Operational state transition triggered by administrative action
+         * Powering off the realized device either synchronously or via OSPM.
+         */
+
+        qatomic_set(&dev->admin_power_state, DEVICE_ADMIN_POWER_STATE_DISABLED);
+        smp_wmb();
+        break;
+    }
+    case DEVICE_ADMIN_POWER_STATE_ENABLED: {
+        /*
+         * TODO: Operational state transition triggered by administrative action
+         * Powering on the device and restoring migration registration.
+         */
+
+        qatomic_set(&dev->admin_power_state, DEVICE_ADMIN_POWER_STATE_ENABLED);
+        smp_wmb();
+        break;
+    }
+    default:
+        error_setg(errp, "Invalid admin power state %d for device '%s'",
+                   new_state, dev->id);
+        break;
+    }
+}
+
 static void device_initfn(Object *obj)
 {
     DeviceState *dev = DEVICE(obj);
@@ -644,6 +691,7 @@ static void device_initfn(Object *obj)
 
     dev->instance_id_alias = -1;
     dev->realized = false;
+    dev->admin_power_state = DEVICE_ADMIN_POWER_STATE_ENABLED;
     dev->allow_unplug_during_migration = false;
 
     QLIST_INIT(&dev->gpios);
@@ -731,6 +779,15 @@ device_vmstate_if_get_id(VMStateIf *obj)
     return qdev_get_dev_path(dev);
 }
 
+static const QEnumLookup device_admin_power_state_lookup = {
+    .array = (const char *const[]) {
+        [DEVICE_ADMIN_POWER_STATE_ENABLED] = "enabled",
+        [DEVICE_ADMIN_POWER_STATE_REMOVED] = "removed",
+        [DEVICE_ADMIN_POWER_STATE_DISABLED] = "disabled",
+    },
+    .size = DEVICE_ADMIN_POWER_STATE_MAX,
+};
+
 static void device_class_init(ObjectClass *class, const void *data)
 {
     DeviceClass *dc = DEVICE_CLASS(class);
@@ -765,6 +822,11 @@ static void device_class_init(ObjectClass *class, const void *data)
                                    device_get_hotpluggable, NULL);
     object_class_property_add_bool(class, "hotplugged",
                                    device_get_hotplugged, NULL);
+    object_class_property_add_enum(class, "admin_power_state",
+                                   "DeviceAdminPowerState",
+                                   &device_admin_power_state_lookup,
+                                   device_get_admin_power_state,
+                                   device_set_admin_power_state);
     object_class_property_add_link(class, "parent_bus", TYPE_BUS,
                                    offsetof(DeviceState, parent_bus), NULL, 0);
 }
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 530f3da702..3bc212ab3a 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -159,6 +159,7 @@ struct DeviceClass {
      */
     bool user_creatable;
     bool hotpluggable;
+    bool admin_power_state_supported;
 
     /* callbacks */
     /**
@@ -217,6 +218,55 @@ typedef QLIST_HEAD(, NamedGPIOList) NamedGPIOListHead;
 typedef QLIST_HEAD(, NamedClockList) NamedClockListHead;
 typedef QLIST_HEAD(, BusState) BusStateHead;
 
+/**
+ * enum DeviceAdminPowerState - Administrative control states for a device
+ *
+ * This enum defines abstract administrative states used by QEMU to enable,
+ * disable, or logically remove a device from the virtual machine. These
+ * states reflect administrative control over a device's power availability
+ * and presence in the system. These administrative states are distinct from
+ * runtime operational power states (e.g., PSCI states for ARM CPUs). They
+ * represent administrative *policy* rather than physical, electrical, or
+ * functional state.
+ *
+ * Administrative state is managed externally "via QMP, firmware, or other
+ * host-side policy agents" and acts as a gating policy that determines
+ * whether guest software is permitted to interact with the device. Most
+ * devices default to the ENABLED state unless explicitly disabled or removed.
+ *
+ * Changing a device administrative state may directly or indirectly affect
+ * its operational behavior. For example, a DISABLED device will reject guest
+ * attempts to power it on or transition it out of a suspended state. Not all
+ * devices support dynamic transitions between administrative states.
+ *
+ * - DEVICE_ADMIN_POWER_STATE_ENABLED:
+ *   The device is administratively enabled (i.e., logically present and
+ *   permitted to operate). Guest software may change its operational state
+ *   (e.g., activate, deactivate, suspend) within allowed architectural
+ *   semantics. This is the default state for most devices unless explicitly
+ *   disabled or unplugged.
+ *
+ * - DEVICE_ADMIN_POWER_STATE_DISABLED:
+ *   The device is administratively disabled. It remains logically present
+ *   but is blocked from functional operation. Guest-initiated transitions
+ *   are either suppressed or ignored. This is typically used to enforce
+ *   shutdown, deny execution, or offline the device without removing it.
+ *
+ * - DEVICE_ADMIN_POWER_STATE_REMOVED:
+ *   The device has been logically removed (e.g., via hot-unplug). It is no
+ *   longer considered present or visible to the guest. This state exists
+ *   for representational or transitional purposes only. In most cases,
+ *   once removed, the corresponding DeviceState object is destroyed and
+ *   no longer tracked. This concept may not apply to some devices as
+ *   architectural limitations might make unplug not meaningful.
+ */
+typedef enum DeviceAdminPowerState {
+    DEVICE_ADMIN_POWER_STATE_ENABLED = 0,
+    DEVICE_ADMIN_POWER_STATE_DISABLED,
+    DEVICE_ADMIN_POWER_STATE_REMOVED,
+    DEVICE_ADMIN_POWER_STATE_MAX
+} DeviceAdminPowerState;
+
 /**
  * struct DeviceState - common device state, accessed with qdev helpers
  *
@@ -240,6 +290,10 @@ struct DeviceState {
      * @realized: has device been realized?
      */
     bool realized;
+    /**
+     * @admin_power_state: device administrative power state
+     */
+    DeviceAdminPowerState admin_power_state;
     /**
      * @pending_deleted_event: track pending deletion events during unplug
      */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index e2b2337399..0c9a2e7ea4 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2765,6 +2765,7 @@ static void arm_cpu_class_init(ObjectClass *oc, const void *data)
     cc->gdb_get_core_xml_file = arm_gdb_get_core_xml_file;
     cc->gdb_stop_before_watchpoint = true;
     cc->disas_set_info = arm_disas_set_info;
+    dc->admin_power_state_supported = true;
 
 #ifdef CONFIG_TCG
     cc->tcg_ops = &arm_tcg_ops;
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
 rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com,
 linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn,
 lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 02/24] hw/core, qemu-options.hx: Introduce 'disabledcpus' SMP parameter
Date: Wed, 1 Oct 2025 01:01:05 +0000
Message-Id: <20251001010127.3092631-3-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::334; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x334.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759280706016116601 From: Salil Mehta Add support for a new SMP configuration parameter, 'disabledcpus', which specifies the number of additional CPUs that are present in the virtual machine but administratively disabled at boot. These CPUs are visible in firmware (e.g. ACPI tables) yet unavailable to the guest until explicitly enabled via QMP/HMP, or via the 'device_set' API (introduced in later patches). This feature is intended for architectures that lack native CPU hotplug support but can change the administrative power state of present CPUs. It allows simulating CPU hot-add=E2=80=93like scenarios while all CPUs rema= in physically present in the topology at boot time. Note: ARM is the first architecture to support this concept. Changes include: - Extend CpuTopology with a 'disabledcpus' field. - Update machine_parse_smp_config() to account for disabled CPUs when computing 'cpus' and 'maxcpus'. - Update SMPConfiguration in QAPI to accept 'disabledcpus'. 
- Extend -smp option documentation to describe 'disabledcpus' usage and behavior. Signed-off-by: Salil Mehta --- hw/core/machine-smp.c | 24 +++++++----- include/hw/boards.h | 2 + qapi/machine.json | 3 ++ qemu-options.hx | 86 +++++++++++++++++++++++++++++++++---------- system/vl.c | 3 ++ 5 files changed, 89 insertions(+), 29 deletions(-) diff --git a/hw/core/machine-smp.c b/hw/core/machine-smp.c index 0be0ac044c..c1a09fdc3f 100644 --- a/hw/core/machine-smp.c +++ b/hw/core/machine-smp.c @@ -87,6 +87,7 @@ void machine_parse_smp_config(MachineState *ms, { MachineClass *mc =3D MACHINE_GET_CLASS(ms); unsigned cpus =3D config->has_cpus ? config->cpus : 0; + unsigned disabledcpus =3D config->has_disabledcpus ? config->disabledc= pus : 0; unsigned drawers =3D config->has_drawers ? config->drawers : 0; unsigned books =3D config->has_books ? config->books : 0; unsigned sockets =3D config->has_sockets ? config->sockets : 0; @@ -166,8 +167,13 @@ void machine_parse_smp_config(MachineState *ms, sockets =3D sockets > 0 ? sockets : 1; cores =3D cores > 0 ? cores : 1; threads =3D threads > 0 ? threads : 1; + + maxcpus =3D drawers * books * sockets * dies * clusters * + modules * cores * threads; + cpus =3D maxcpus - disabledcpus; } else { - maxcpus =3D maxcpus > 0 ? maxcpus : cpus; + maxcpus =3D maxcpus > 0 ? maxcpus : cpus + disabledcpus; + cpus =3D cpus > 0 ? cpus : maxcpus - disabledcpus; =20 if (mc->smp_props.prefer_sockets) { /* prefer sockets over cores before 6.2 */ @@ -207,12 +213,8 @@ void machine_parse_smp_config(MachineState *ms, } } =20 - total_cpus =3D drawers * books * sockets * dies * - clusters * modules * cores * threads; - maxcpus =3D maxcpus > 0 ? maxcpus : total_cpus; - cpus =3D cpus > 0 ? 
cpus : maxcpus; - ms->smp.cpus =3D cpus; + ms->smp.disabledcpus =3D disabledcpus; ms->smp.drawers =3D drawers; ms->smp.books =3D books; ms->smp.sockets =3D sockets; @@ -226,6 +228,8 @@ void machine_parse_smp_config(MachineState *ms, mc->smp_props.has_clusters =3D config->has_clusters; =20 /* sanity-check of the computed topology */ + total_cpus =3D maxcpus =3D drawers * books * sockets * dies * clusters= * + modules * cores * threads; if (total_cpus !=3D maxcpus) { g_autofree char *topo_msg =3D cpu_hierarchy_to_string(ms); error_setg(errp, "Invalid CPU topology: " @@ -235,12 +239,12 @@ void machine_parse_smp_config(MachineState *ms, return; } =20 - if (maxcpus < cpus) { + if (maxcpus < (cpus + disabledcpus)) { g_autofree char *topo_msg =3D cpu_hierarchy_to_string(ms); error_setg(errp, "Invalid CPU topology: " - "maxcpus must be equal to or greater than smp: " - "%s =3D=3D maxcpus (%u) < smp_cpus (%u)", - topo_msg, maxcpus, cpus); + "maxcpus must be equal to or greater than smp[+disabled= cpus]:" + "%s =3D=3D maxcpus (%u) < smp_cpus (%u) [+ offline cpus= (%u)]", + topo_msg, maxcpus, cpus, disabledcpus); return; } =20 diff --git a/include/hw/boards.h b/include/hw/boards.h index f94713e6e2..2b182d7817 100644 --- a/include/hw/boards.h +++ b/include/hw/boards.h @@ -361,6 +361,7 @@ typedef struct DeviceMemoryState { /** * CpuTopology: * @cpus: the number of present logical processors on the machine + * @disabledcpus: the number additional present but admin disabled cpus * @drawers: the number of drawers on the machine * @books: the number of books in one drawer * @sockets: the number of sockets in one book @@ -373,6 +374,7 @@ typedef struct DeviceMemoryState { */ typedef struct CpuTopology { unsigned int cpus; + unsigned int disabledcpus; unsigned int drawers; unsigned int books; unsigned int sockets; diff --git a/qapi/machine.json b/qapi/machine.json index 038eab281c..e45740da33 100644 --- a/qapi/machine.json +++ b/qapi/machine.json @@ -1634,6 +1634,8 @@ # # @cpus: 
number of virtual CPUs in the virtual machine # +# @disabledcpus: number of additional present but disabled(or offline) CPUs +# # @maxcpus: maximum number of hotpluggable virtual CPUs in the virtual # machine # @@ -1657,6 +1659,7 @@ ## { 'struct': 'SMPConfiguration', 'data': { '*cpus': 'int', + '*disabledcpus': 'int', '*drawers': 'int', '*books': 'int', '*sockets': 'int', diff --git a/qemu-options.hx b/qemu-options.hx index ab23f14d21..83ccde341b 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -326,12 +326,15 @@ SRST ERST =20 DEF("smp", HAS_ARG, QEMU_OPTION_smp, - "-smp [[cpus=3D]n][,maxcpus=3Dmaxcpus][,drawers=3Ddrawers][,books=3Dbo= oks][,sockets=3Dsockets]\n" - " [,dies=3Ddies][,clusters=3Dclusters][,modules=3Dmodule= s][,cores=3Dcores]\n" - " [,threads=3Dthreads]\n" - " set the number of initial CPUs to 'n' [default=3D1]\n" - " maxcpus=3D maximum number of total CPUs, including\n" - " offline CPUs for hotplug, etc\n" + "-smp [[cpus=3D]n][,disabledcpus=3Ddisabledcpus][,maxcpus=3Dmaxcpus][,= drawers=3Ddrawers][,books=3Dbooks]\n" + " [,sockets=3Dsockets][,dies=3Ddies][,clusters=3Dcluster= s][,modules=3Dmodules]\n" + " [,cores=3Dcores][,threads=3Dthreads]\n" + " set the initial number of CPUs present and\n" + " administratively enabled at boot time to 'n' [defau= lt=3D1]\n" + " disabledcpus=3D number of present but administrativel= y\n" + " disabled CPUs (unavailable to the guest at boot)\n" + " maxcpus=3D maximum total CPUs (present + hotpluggable= )\n" + " on machines without CPU hotplug, defaults to n + di= sabledcpus\n" " drawers=3D number of drawers on the machine board\n" " books=3D number of books in one drawer\n" " sockets=3D number of sockets in one book\n" @@ -351,22 +354,49 @@ DEF("smp", HAS_ARG, QEMU_OPTION_smp, " For a particular machine type board, an expected CPU topology h= ierarchy\n" " can be defined through the supported sub-option. 
Unsupported pa= rameters\n" " can also be provided in addition to the sub-option, but their v= alues\n" - " must be set as 1 in the purpose of correct parsing.\n", + " must be set as 1 in the purpose of correct parsing.\n" + " \n" + " Administratively disabled CPUs: Some machine types do not suppo= rt vCPU\n" + " hotplug but their CPUs can be marked disabled (powered off) and= kept\n" + " unavailable to the guest. Later, such CPUs can be enabled via Q= MP/HMP\n" + " (e.g., 'device_set ... admin-state=3Denable'). This is similar = to hotplug,\n" + " except all disabled CPUs are already present at boot. Useful on= \n" + " architectures that lack architectural CPU hotplug.\n", QEMU_ARCH_ALL) SRST -``-smp [[cpus=3D]n][,maxcpus=3Dmaxcpus][,drawers=3Ddrawers][,books=3Dbooks= ][,sockets=3Dsockets][,dies=3Ddies][,clusters=3Dclusters][,modules=3Dmodule= s][,cores=3Dcores][,threads=3Dthreads]`` - Simulate a SMP system with '\ ``n``\ ' CPUs initially present on - the machine type board. On boards supporting CPU hotplug, the optional - '\ ``maxcpus``\ ' parameter can be set to enable further CPUs to be - added at runtime. When both parameters are omitted, the maximum number +``-smp [[cpus=3D]n][,disabledcpus=3Ddisabledcpus][,maxcpus=3Dmaxcpus][,dra= wers=3Ddrawers][,books=3Dbooks][,sockets=3Dsockets][,dies=3Ddies][,clusters= =3Dclusters][,modules=3Dmodules][,cores=3Dcores][,threads=3Dthreads]`` + Simulate a SMP system with '\ ``n``\ ' CPUs initially present & enable= d on + the machine type board. Furthermore, on architectures that support cha= nging + the administrative power state of CPUs, optional '\ ``disabledcpus``\ ' + parameter specifies *additional* CPUs that are present in firmware (e.= g., + ACPI) but are administratively disabled (i.e., not usable by the guest= at + boot time). + + This is different from CPU hotplug where additional CPUs are not even + present in the system description. Administratively disabled CPUs appe= ar in + ACPI tables i.e. 
are provisioned, but cannot be used until explicitly
+    enabled via QMP/HMP or the deviceset API.
+
+    On boards supporting CPU hotplug, the optional '\ ``maxcpus``\ ' parameter
+    can be set to enable further CPUs to be added at runtime. When both
+    '\ ``n``\ ' & '\ ``maxcpus``\ ' parameters are omitted, the maximum number
     of CPUs will be calculated from the provided topology members and the
-    initial CPU count will match the maximum number. When only one of them
-    is given then the omitted one will be set to its counterpart's value.
-    Both parameters may be specified, but the maximum number of CPUs must
-    be equal to or greater than the initial CPU count. Product of the
-    CPU topology hierarchy must be equal to the maximum number of CPUs.
-    Both parameters are subject to an upper limit that is determined by
-    the specific machine type chosen.
+    initial CPU count will match the maximum number. When only one of them is
+    given then the omitted one will be set to its counterpart's value. Both
+    parameters may be specified, but the maximum number of CPUs must be equal
+    to or greater than the initial CPU count. Product of the CPU topology
+    hierarchy must be equal to the maximum number of CPUs. Both parameters are
+    subject to an upper limit that is determined by the specific machine type
+    chosen. Boards that support administratively disabled CPUs but do *not*
+    support CPU hotplug derive the maximum number of CPUs implicitly:
+    '\ ``maxcpus``\ ' is treated as '\ ``n + disabledcpus``\ ' (the total CPUs
+    present in firmware). If '\ ``maxcpus``\ ' is provided, it must equal
+    '\ ``n + disabledcpus``\ '. The topology product must equal this derived
+    maximum as well.
+
+    Note: Administratively disabled CPUs will appear to the guest as
+    unavailable, and any attempt to bring them online must go through QMP/HMP
+    commands like 'device_set'.
 
     To control reporting of CPU topology information, values of the topology
     parameters can be specified.
Machines may only support a subset of the
@@ -425,6 +455,24 @@ SRST
 
         -smp 2
 
+    Examples using 'disabledcpus':
+
+    For a board without CPU hotplug, enable 4 CPUs at boot and provision
+    2 additional administratively disabled CPUs (maximum is derived
+    implicitly as 6 = 4 + 2):
+
+    ::
+
+        -smp cpus=4,disabledcpus=2
+
+    For a board that supports CPU hotplug and 'disabledcpus', enable 4 CPUs
+    at boot, provision 2 administratively disabled CPUs, and allow hotplug of
+    2 more CPUs (for a maximum of 8):
+
+    ::
+
+        -smp cpus=4,disabledcpus=2,maxcpus=8
+
     Note: The cluster topology will only be generated in ACPI and exposed
     to guest if it's explicitly specified in -smp.
 ERST
diff --git a/system/vl.c b/system/vl.c
index 3b7057e6c6..2f0fd21a1f 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -736,6 +736,9 @@ static QemuOptsList qemu_smp_opts = {
         {
             .name = "cpus",
             .type = QEMU_OPT_NUMBER,
+        }, {
+            .name = "disabledcpus",
+            .type = QEMU_OPT_NUMBER,
         }, {
             .name = "drawers",
             .type = QEMU_OPT_NUMBER,
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc:
salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 03/24] hw/arm/virt: Clamp 'maxcpus' as-per machine's vCPU deferred online-capability
Date: Wed, 1 Oct 2025 01:01:06 +0000
Message-Id: <20251001010127.3092631-4-salil.mehta@opnsrc.net>
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
From: Salil Mehta

To support a vCPU hot-add–like model on ARM, the virt machine may be set up
with more CPUs than are active at boot. These additional CPUs are fully realized
in KVM and listed in ACPI tables from the start, but begin in a disabled state.
They can later be brought online or taken offline under host or platform policy
control. The CPU topology is fixed at VM creation time and cannot change
dynamically on ARM. Therefore, we must determine precisely the 'maxcpus' value
that applies for the full lifetime of the VM.

On ARM, this deferred online-capable model is only valid if:
  - The GIC version is 3 or higher, and
  - Each non-boot CPU’s GIC CPU Interface is marked “online-capable” in its
    ACPI GICC structure (UEFI ACPI Specification 6.5, §5.2.12.14, Table 5.37
    “GICC CPU Interface Flags”), and
  - The chosen accelerator supports safe deferred CPU online:
      * TCG with multi-threaded TCG (MTTCG) enabled
      * KVM (on supported hosts)
      * Not HVF or QTest

This patch sizes the machine’s max-possible CPUs during VM init:
  - If all conditions are satisfied, retain the full set of CPUs corresponding
    to (`-smp cpus` + `-smp disabledcpus`), allowing the additional (initially
    disabled) CPUs to participate in later policy-driven online.
  - Otherwise, clamp the max-possible CPUs to the boot-enabled count
    (`-smp disabledcpus=0` equivalent) to avoid advertising CPUs the guest can
    never use.
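The sizing rule above can be sketched in standalone C. This is only an illustration under simplifying assumptions, not the code in this patch: `struct vm_config`, its fields, `deferred_online_supported()` and `size_max_cpus()` are hypothetical names, the accelerator check is reduced to a single boolean, and the firmware-side "online-capable" GICC flag is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified inputs to the sizing decision (not QEMU types). */
struct vm_config {
    unsigned int cpus;          /* -smp cpus=N: enabled at boot */
    unsigned int disabledcpus;  /* -smp disabledcpus=N: present but disabled */
    int gic_version;            /* 2, 3 or 4 */
    bool accel_supports_deferred_online; /* e.g. KVM or MTTCG in this sketch */
};

/* True when both modelled preconditions for deferred online hold. */
static bool deferred_online_supported(const struct vm_config *cfg)
{
    assert(cfg != NULL);
    return cfg->gic_version >= 3 && cfg->accel_supports_deferred_online;
}

/* Keep cpus + disabledcpus when supported; otherwise clamp to cpus. */
static unsigned int size_max_cpus(const struct vm_config *cfg)
{
    if (deferred_online_supported(cfg)) {
        return cfg->cpus + cfg->disabledcpus;
    }
    return cfg->cpus;
}
```

With `cpus=4` and `disabledcpus=2`, this sketch yields 6 when the GIC is v3 or later and the accelerator flag is set, and 4 in every other case, which mirrors the "retain or clamp" behaviour described in the list.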
A new MachineClass flag, `has_online_capable_cpus`, records whether the machine
supports deferred vCPU online. This is usable by other machine types as well.

Signed-off-by: Salil Mehta
---
 hw/arm/virt.c       | 84 ++++++++++++++++++++++++++++++---------------
 include/hw/boards.h |  1 +
 2 files changed, 57 insertions(+), 28 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index ef6be3660f..76f21bd56a 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2168,8 +2168,7 @@ static void machvirt_init(MachineState *machine)
     bool has_ged = !vmc->no_ged;
     unsigned int smp_cpus = machine->smp.cpus;
     unsigned int max_cpus = machine->smp.max_cpus;
-
-    possible_cpus = mc->possible_cpu_arch_ids(machine);
+    DeviceClass *dc;
 
     /*
      * In accelerated mode, the memory map is computed earlier in kvm_type()
@@ -2186,7 +2185,7 @@ static void machvirt_init(MachineState *machine)
      * we are about to deal with. Once this is done, get rid of
      * the object.
      */
-    cpuobj = object_new(possible_cpus->cpus[0].type);
+    cpuobj = object_new(machine->cpu_type);
     armcpu = ARM_CPU(cpuobj);
 
     pa_bits = arm_pamax(armcpu);
@@ -2201,6 +2200,57 @@ static void machvirt_init(MachineState *machine)
      */
     finalize_gic_version(vms);
 
+    /*
+     * The maximum number of CPUs depends on the GIC version, or on how
+     * many redistributors we can fit into the memory map (which in turn
+     * depends on whether this is a GICv3 or v4).
+     */
+    if (vms->gic_version == VIRT_GIC_VERSION_2) {
+        virt_max_cpus = GIC_NCPU;
+    } else {
+        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
+        if (vms->highmem_redists) {
+            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
+        }
+    }
+
+    if ((tcg_enabled() && !qemu_tcg_mttcg_enabled()) || hvf_enabled() ||
+        qtest_enabled() || vms->gic_version == VIRT_GIC_VERSION_2) {
+        max_cpus = machine->smp.max_cpus = smp_cpus;
+        if (mc->has_online_capable_cpus) {
+            if (vms->gic_version == VIRT_GIC_VERSION_2) {
+                warn_report("GICv2 does not support online-capable CPUs");
+            }
+            mc->has_online_capable_cpus = false;
+        }
+    }
+
+    if (mc->has_online_capable_cpus) {
+        max_cpus = smp_cpus + machine->smp.disabledcpus;
+        machine->smp.max_cpus = max_cpus;
+    }
+
+    if (max_cpus > virt_max_cpus) {
+        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
+                     "supported by machine 'mach-virt' (%d)",
+                     max_cpus, virt_max_cpus);
+        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
+            error_printf("Try 'highmem-redists=on' for more CPUs\n");
+        }
+
+        exit(1);
+    }
+
+    dc = DEVICE_CLASS(object_class_by_name(machine->cpu_type));
+    if (!dc) {
+        error_report("CPU type '%s' not registered", machine->cpu_type);
+        exit(1);
+    }
+    dc->admin_power_state_supported = mc->has_online_capable_cpus;
+
+    /* uses smp.max_cpus to initialize all possible vCPUs */
+    possible_cpus = mc->possible_cpu_arch_ids(machine);
+
     if (vms->secure) {
         /*
          * The Secure view of the world is the same as the NonSecure,
@@ -2235,31 +2285,6 @@ static void machvirt_init(MachineState *machine)
         vms->psci_conduit = QEMU_PSCI_CONDUIT_HVC;
     }
 
-    /*
-     * The maximum number of CPUs depends on the GIC version, or on how
-     * many redistributors we can fit into the memory map (which in turn
-     * depends on whether this is a GICv3 or v4).
-     */
-    if (vms->gic_version == VIRT_GIC_VERSION_2) {
-        virt_max_cpus = GIC_NCPU;
-    } else {
-        virt_max_cpus = virt_redist_capacity(vms, VIRT_GIC_REDIST);
-        if (vms->highmem_redists) {
-            virt_max_cpus += virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
-        }
-    }
-
-    if (max_cpus > virt_max_cpus) {
-        error_report("Number of SMP CPUs requested (%d) exceeds max CPUs "
-                     "supported by machine 'mach-virt' (%d)",
-                     max_cpus, virt_max_cpus);
-        if (vms->gic_version != VIRT_GIC_VERSION_2 && !vms->highmem_redists) {
-            error_printf("Try 'highmem-redists=on' for more CPUs\n");
-        }
-
-        exit(1);
-    }
-
     if (vms->secure && !tcg_enabled() && !qtest_enabled()) {
         error_report("mach-virt: %s does not support providing "
                      "Security extensions (TrustZone) to the guest CPU",
@@ -3245,6 +3270,9 @@ static void virt_machine_class_init(ObjectClass *oc, const void *data)
     hc->plug = virt_machine_device_plug_cb;
     hc->unplug_request = virt_machine_device_unplug_request_cb;
     hc->unplug = virt_machine_device_unplug_cb;
+
+    mc->has_online_capable_cpus = true;
+
     mc->nvdimm_supported = true;
     mc->smp_props.clusters_supported = true;
     mc->auto_enable_numa_with_memhp = true;
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 2b182d7817..b27c2326a2 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -302,6 +302,7 @@ struct MachineClass {
     bool rom_file_has_mr;
     int minimum_page_bits;
     bool has_hotpluggable_cpus;
+    bool has_online_capable_cpus;
     bool ignore_memory_transaction_failures;
     int numa_mem_align_shift;
     const char * const *valid_cpu_types;
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 04/24] arm/virt, target/arm: Add new ARMCPU {socket, cluster, core, thread}-id property
Date: Wed, 1 Oct 2025 01:01:07 +0000
Message-Id: <20251001010127.3092631-5-salil.mehta@opnsrc.net>
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
From: Salil Mehta

Store the user-specified topology (socket/cluster/core/thread) and derive a
unique 'vcpu-id'. The 'vcpu-id' is used as the slot index in the possible vCPUs
list when administratively enabling or disabling a vCPU.

Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
Reviewed-by: Miguel Luis
---
 hw/arm/virt.c         | 10 ++++++++++
 include/hw/arm/virt.h | 36 ++++++++++++++++++++++++++++++++++++
 target/arm/cpu.c      |  4 ++++
 target/arm/cpu.h      |  4 ++++
 4 files changed, 54 insertions(+)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 76f21bd56a..4ded19dc69 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2334,6 +2334,14 @@ static void machvirt_init(MachineState *machine)
                                  &error_fatal);
 
         aarch64 &= object_property_get_bool(cpuobj, "aarch64", NULL);
+        object_property_set_int(cpuobj, "socket-id", virt_get_socket_id(n),
+                                NULL);
+        object_property_set_int(cpuobj, "cluster-id", virt_get_cluster_id(n),
+                                NULL);
+        object_property_set_int(cpuobj, "core-id", virt_get_core_id(n),
+                                NULL);
+        object_property_set_int(cpuobj, "thread-id", virt_get_thread_id(n),
+                                NULL);
 
         if (!vms->secure) {
             object_property_set_bool(cpuobj, "has_el3", false, NULL);
@@ -2902,6 +2910,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
 {
     int n;
     unsigned int max_cpus = ms->smp.max_cpus;
+    unsigned int smp_threads = ms->smp.threads;
     VirtMachineState *vms = VIRT_MACHINE(ms);
     MachineClass *mc = MACHINE_GET_CLASS(vms);
 
@@ -2915,6 +2924,7 @@ static const CPUArchIdList *virt_possible_cpu_arch_ids(MachineState *ms)
     ms->possible_cpus->len = max_cpus;
     for (n = 0; n < ms->possible_cpus->len; n++) {
         ms->possible_cpus->cpus[n].type = ms->cpu_type;
+        ms->possible_cpus->cpus[n].vcpus_count = smp_threads;
         ms->possible_cpus->cpus[n].arch_id =
             virt_cpu_mp_affinity(vms, n);
 
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 365a28b082..683e4b965a 100644
--- a/include/hw/arm/virt.h
+++
b/include/hw/arm/virt.h
@@ -213,4 +213,40 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
             vms->highmem_redists) ? 2 : 1;
 }
 
+static inline int virt_get_socket_id(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.socket_id;
+}
+
+static inline int virt_get_cluster_id(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.cluster_id;
+}
+
+static inline int virt_get_core_id(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.core_id;
+}
+
+static inline int virt_get_thread_id(int cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+
+    assert(cpu_index >= 0 && cpu_index < ms->possible_cpus->len);
+
+    return ms->possible_cpus->cpus[cpu_index].props.thread_id;
+}
+
 #endif /* QEMU_ARM_VIRT_H */
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 0c9a2e7ea4..7e0d5b2ed8 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -2607,6 +2607,10 @@ static const Property arm_cpu_properties[] = {
     DEFINE_PROP_UINT64("mp-affinity", ARMCPU,
                         mp_affinity, ARM64_AFFINITY_INVALID),
     DEFINE_PROP_INT32("node-id", ARMCPU, node_id, CPU_UNSET_NUMA_NODE_ID),
+    DEFINE_PROP_INT32("socket-id", ARMCPU, socket_id, 0),
+    DEFINE_PROP_INT32("cluster-id", ARMCPU, cluster_id, 0),
+    DEFINE_PROP_INT32("core-id", ARMCPU, core_id, 0),
+    DEFINE_PROP_INT32("thread-id", ARMCPU, thread_id, 0),
     DEFINE_PROP_INT32("core-count", ARMCPU, core_count, -1),
     /* True to default to the backward-compat old CNTFRQ rather than 1Ghz */
     DEFINE_PROP_BOOL("backcompat-cntfrq", ARMCPU, backcompat_cntfrq, false),
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index
dc9b6dce4c..cd5982d362 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -1126,6 +1126,10 @@ struct ArchCPU {
     QLIST_HEAD(, ARMELChangeHook) el_change_hooks;
 
     int32_t node_id; /* NUMA node this CPU belongs to */
+    int32_t socket_id;
+    int32_t cluster_id;
+    int32_t core_id;
+    int32_t thread_id;
 
     /* Used to synchronize KVM and QEMU in-kernel device levels */
     uint8_t device_irq_level;
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com, Keqian Zhu
Subject: [PATCH RFC V6 05/24] arm/virt, kvm: Pre-create KVM vCPUs for 'disabled' QOM vCPUs at machine init
Date: Wed, 1 Oct 2025 01:01:08 +0000
Message-Id: <20251001010127.3092631-6-salil.mehta@opnsrc.net>
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
helo=lists.gnu.org;
Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x335.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1759280968031116600

From: Salil Mehta

The ARM CPU architecture does not allow CPUs to be plugged in after the
system has initialized; this is an architectural constraint. Hence, the
kernel must know all the CPUs being booted during its initialization. This
applies to the guest kernel as well, and therefore the number of KVM vCPU
descriptors in the host must be fixed at VM initialization time.

Also, the GIC must know all the CPUs it is connected to during its
initialization, and this cannot change afterward. The same must be ensured
during the initialization of the VGIC in KVM. This is necessary because:

1. The association between GICR and MPIDR must be fixed at VM initialization
   time. This is represented by the register
   `GICR_TYPER(mp_affinity, proc_num)`.
2. Memory regions associated with GICR, etc., cannot be changed (added,
   deleted, or modified) after the VM has been initialized. This is not an
   ARM architectural constraint; rather, lifting it would require difficult
   and invasive changes to the VGIC data structures.

To enable a hot-add-like model while preserving these constraints, the virt
machine may enumerate more CPUs than are enabled at boot using
`-smp disabledcpus=N`.
Such CPUs are present but start offline (i.e., administratively disabled at
init). The topology remains fixed at VM creation time; only the
online/offline status may change later.

Administratively disabled vCPUs are not realized in QOM until first enabled,
avoiding creation of unnecessary vCPU threads at boot. On large systems, this
reduces startup time in proportion to the number of disabled vCPUs. Once a
QOM vCPU is realized and its thread created, subsequent enable/disable
actions do not unrealize it. This behaviour was adopted following review
feedback and differs from earlier RFC versions.

Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
---
 accel/kvm/kvm-all.c    |  2 +-
 hw/arm/virt.c          | 77 ++++++++++++++++++++++++++++++++++++++----
 hw/core/qdev.c         | 17 ++++++++++
 include/hw/qdev-core.h | 19 +++++++++++
 include/system/kvm.h   |  8 +++++
 target/arm/cpu.c       |  2 ++
 target/arm/kvm.c       | 40 +++++++++++++++++++++-
 target/arm/kvm_arm.h   | 11 ++++++
 8 files changed, 168 insertions(+), 8 deletions(-)

diff --git a/accel/kvm/kvm-all.c b/accel/kvm/kvm-all.c
index 890d5ea9f8..0e7d9d5c3d 100644
--- a/accel/kvm/kvm-all.c
+++ b/accel/kvm/kvm-all.c
@@ -460,7 +460,7 @@ static void kvm_reset_parked_vcpus(KVMState *s)
  *
  * @returns: 0 when success, errno (<0) when failed.
  */
-static int kvm_create_vcpu(CPUState *cpu)
+int kvm_create_vcpu(CPUState *cpu)
 {
     unsigned long vcpu_id = kvm_arch_vcpu_id(cpu);
     KVMState *s = kvm_state;
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 4ded19dc69..f4eeeacf6c 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2152,6 +2152,49 @@ static void virt_post_cpus_gic_realized(VirtMachineState *vms,
     }
 }
 
+static void
+virt_setup_lazy_vcpu_realization(Object *cpuobj, VirtMachineState *vms)
+{
+    /*
+     * Present & administratively disabled vCPUs:
+     *
+     * These CPUs are marked offline at init via '-smp disabledcpus=N'. We
+     * intentionally do not realize them during the first boot, since it is
+     * not known if or when they will ever be enabled. The decision to enable
+     * such CPUs depends on policy (e.g. guided by SLAs or other deployment
+     * requirements).
+     *
+     * Realizing all disabled vCPUs up front would make boot time proportional
+     * to 'maxcpus', even if policy permits only a small subset to be enabled.
+     * This can lead to unacceptable boot delays in some scenarios.
+     *
+     * Instead, these CPUs remain administratively disabled and unrealized at
+     * boot, to be instantiated and brought online only if policy later allows
+     * it.
+     */
+
+    /* set this vCPU to be administratively 'disabled' in QOM */
+    qdev_disable(DEVICE(cpuobj), NULL, &error_fatal);
+
+    if (vms->psci_conduit != QEMU_PSCI_CONDUIT_DISABLED) {
+        object_property_set_int(cpuobj, "psci-conduit", vms->psci_conduit,
+                                NULL);
+    }
+
+    /*
+     * [!] Constraint: The ARM CPU architecture does not permit new CPUs
+     * to be added after system initialization.
+     *
+     * Workaround: Pre-create KVM vCPUs even for those that are not yet
+     * online, i.e. powered-off, keeping them `parked` and in an
+     * `unrealized` state (at least during boot time) within QEMU until
+     * they are powered-on and made online.
+     */
+    if (kvm_enabled()) {
+        kvm_arm_create_host_vcpu(ARM_CPU(cpuobj));
+    }
+}
+
 static void machvirt_init(MachineState *machine)
 {
     VirtMachineState *vms = VIRT_MACHINE(machine);
@@ -2319,10 +2362,6 @@ static void machvirt_init(MachineState *machine)
         Object *cpuobj;
         CPUState *cs;
 
-        if (n >= smp_cpus) {
-            break;
-        }
-
         cpuobj = object_new(possible_cpus->cpus[n].type);
         object_property_set_int(cpuobj, "mp-affinity",
                                 possible_cpus->cpus[n].arch_id, NULL);
@@ -2427,8 +2466,34 @@ static void machvirt_init(MachineState *machine)
             }
         }
 
-        qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
-        object_unref(cpuobj);
+        /* start secondary vCPUs in a powered-down state */
+        if (n && mc->has_online_capable_cpus) {
+            object_property_set_bool(cpuobj, "start-powered-off", true, NULL);
+        }
+
+        if (n < smp_cpus) {
+            /* 'Present' & 'Enabled' vCPUs */
+            qdev_realize(DEVICE(cpuobj), NULL, &error_fatal);
+            object_unref(cpuobj);
+        } else {
+            /* 'Present' & 'Disabled' vCPUs */
+            virt_setup_lazy_vcpu_realization(cpuobj, vms);
+        }
+
+        /*
+         * All possible vCPUs should have QOM vCPU Object pointer & arch-id.
+         * 'cpus_queue' (accessed via qemu_get_cpu()) contains only realized and
+         * enabled vCPUs. Hence, we must now populate the 'possible_cpus' list.
+         */
+        if (kvm_enabled()) {
+            /*
+             * Override the default architecture ID with the one retrieved
+             * from KVM, as they currently differ.
+             */
+            machine->possible_cpus->cpus[n].arch_id =
+                arm_cpu_mp_affinity(ARM_CPU(cs));
+        }
+        machine->possible_cpus->cpus[n].cpu = cs;
     }
 
     /* Now we've created the CPUs we can see if they have the hypvirt timer */
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 8502d6216f..5816abae39 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -309,6 +309,23 @@ void qdev_assert_realized_properly(void)
                                    qdev_assert_realized_properly_cb, NULL);
 }
 
+bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp)
+{
+    g_assert(dev);
+
+    if (bus) {
+        error_setg(errp, "Device %s 'disable' operation not supported",
+                   object_get_typename(OBJECT(dev)));
+        return false;
+    }
+
+    /* devices like cpu don't have bus */
+    g_assert(!DEVICE_GET_CLASS(dev)->bus_type);
+
+    return object_property_set_str(OBJECT(dev), "admin_power_state", "disabled",
+                                   errp);
+}
+
 bool qdev_machine_modified(void)
 {
     return qdev_hot_added || qdev_hot_removed;
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 3bc212ab3a..2c22b32a3f 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -570,6 +570,25 @@ bool qdev_realize(DeviceState *dev, BusState *bus, Error **errp);
  */
 bool qdev_realize_and_unref(DeviceState *dev, BusState *bus, Error **errp);
 
+/**
+ * qdev_disable - Initiate administrative disablement and power-off of device
+ * @dev: The device to be administratively powered off
+ * @bus: The bus on which the device resides (may be NULL for CPUs)
+ * @errp: Pointer to a location where an error can be reported
+ *
+ * This function initiates an administrative transition of the device into a
+ * DISABLED state. This may trigger a graceful shutdown process depending on
+ * platform capabilities. For ACPI platforms, this typically involves notifying
+ * the guest via events such as Notify(..., 0x03) and executing _EJx.
+ *
+ * Once completed, the device's operational power is turned off and it is
+ * marked as administratively DISABLED. Further guest usage is blocked until
+ * re-enabled by host-side policy.
+ *
+ * Returns true on success; false if an error occurs, with @errp populated.
+ */
+bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp);
+
 /**
  * qdev_unrealize: Unrealize a device
  * @dev: device to unrealize
diff --git a/include/system/kvm.h b/include/system/kvm.h
index 3c7d314736..4896a3c9c5 100644
--- a/include/system/kvm.h
+++ b/include/system/kvm.h
@@ -317,6 +317,14 @@ int kvm_create_device(KVMState *s, uint64_t type, bool test);
  */
 bool kvm_device_supported(int vmfd, uint64_t type);
 
+/**
+ * kvm_create_vcpu - Gets a parked KVM vCPU or creates a KVM vCPU
+ * @cpu: QOM CPUState object for which KVM vCPU has to be fetched/created.
+ *
+ * @returns: 0 when success, errno (<0) when failed.
+ */
+int kvm_create_vcpu(CPUState *cpu);
+
 /**
  * kvm_park_vcpu - Park QEMU KVM vCPU context
  * @cpu: QOM CPUState object for which QEMU KVM vCPU context has to be parked.
diff --git a/target/arm/cpu.c b/target/arm/cpu.c
index 7e0d5b2ed8..a5906d1672 100644
--- a/target/arm/cpu.c
+++ b/target/arm/cpu.c
@@ -1500,6 +1500,8 @@ static void arm_cpu_initfn(Object *obj)
         /* TCG and HVF implement PSCI 1.1 */
         cpu->psci_version = QEMU_PSCI_VERSION_1_1;
     }
+
+    CPU(obj)->thread_id = 0;
 }
 
 /*
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 6672344855..1962eb29b2 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -991,6 +991,38 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
     write_list_to_cpustate(cpu);
 }
 
+void kvm_arm_create_host_vcpu(ARMCPU *cpu)
+{
+    CPUState *cs = CPU(cpu);
+    unsigned long vcpu_id = cs->cpu_index;
+    int ret;
+
+    ret = kvm_create_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to create host vcpu %lu", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Initialize the vCPU in the host. This will reset the sys regs
+     * for this vCPU and related registers like MPIDR_EL1 etc. also
+     * get programmed during this call to host. These are referenced
+     * later while setting device attributes of the GICR during GICv3
+     * reset.
+     */
+    ret = kvm_arch_init_vcpu(cs);
+    if (ret < 0) {
+        error_report("Failed to initialize host vcpu %lu", vcpu_id);
+        abort();
+    }
+
+    /*
+     * Park the created vCPU. It is reused during kvm_get_vcpu() when
+     * threads are created during realization of ARM vCPUs.
+     */
+    kvm_park_vcpu(cs);
+}
+
 /*
  * Update KVM's MP_STATE based on what QEMU thinks it is
  */
@@ -1876,7 +1908,13 @@ int kvm_arch_init_vcpu(CPUState *cs)
         return -EINVAL;
     }
 
-    qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cpu);
+    /*
+     * Install VM change handler only when vCPU thread has been spawned
+     * i.e. vCPU is being realized
+     */
+    if (cs->thread_id) {
+        qemu_add_vm_change_state_handler(kvm_arm_vm_state_change, cpu);
+    }
 
     /* Determine init features for this CPU */
     memset(cpu->kvm_init_features, 0, sizeof(cpu->kvm_init_features));
diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h
index 6a9b6374a6..ec9dc95ee8 100644
--- a/target/arm/kvm_arm.h
+++ b/target/arm/kvm_arm.h
@@ -98,6 +98,17 @@ bool kvm_arm_cpu_post_load(ARMCPU *cpu);
 void kvm_arm_reset_vcpu(ARMCPU *cpu);
 
 struct kvm_vcpu_init;
+
+/**
+ * kvm_arm_create_host_vcpu:
+ * @cpu: ARMCPU
+ *
+ * Called to pre-create possible KVM vCPU within the host during the
+ * `virt_machine` initialization phase. This pre-created vCPU will be parked and
+ * will be reused when ARM QOM vCPU is actually hotplugged.
+ */ +void kvm_arm_create_host_vcpu(ARMCPU *cpu); + /** * kvm_arm_create_scratch_host_vcpu: * @fdarray: filled in with kvmfd, vmfd, cpufd file descriptors in that or= der --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 175928127840144.24030043667756; Tue, 30 Sep 2025 18:14:38 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lFO-0005K0-Gx; Tue, 30 Sep 2025 21:03:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lF5-0005GY-WA for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:02:53 -0400 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lEk-000816-F5 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:02:51 -0400 Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-46e4ad36541so41961845e9.0 for ; Tue, 30 Sep 2025 18:02:28 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280546; x=1759885346; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=UZitGaR6XZBmHkaFZhRFFCJMj6aa5mwQJceoZZyLOic=; 
b=PC8NP68KG9vJty2MItZPbsS2yTf3JUZafxUOOTa35w9965HnUyrMjlcvha9jp9TEfZ pweJCkR0m7RfgV4z+8O4XE3I79tGzYr/Mam+P39caT7El3cIhD08X+penlWhXsDlInQe Lgnoyyv7SlcFV9p7KCJaZW45+RJZ9kDAwtCRDhQZrxhjDqU+PPe/zNGQCUjnPaPape40 fE8WUycpFUFHftB2ZiYooK+S0x84gLAOA/MLCj8CQc7j3TaBfhNmmScD090paSWQ8dYi G9TRjqMhgl8Jla4hMn1aZFFyXsFCXe7PCFS9yEiTThiPq8owlIbFf4YKtVXwocM3i3cM lIwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280546; x=1759885346; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=UZitGaR6XZBmHkaFZhRFFCJMj6aa5mwQJceoZZyLOic=; b=Lx3zPZ68uf6+yfiy2O062LHjy/tZf69LhOBinq8ewjH6IQwD9gefOOu8UiqR1cw2um J455BbWzAMWT1fdxVs1ukdOy5pB/RefVfLMEa+INJ0mnhZQDUc7EldIjT8Yj7CX6RFDT +KIlowt3JEac++Ga446oeQC4s/vzdPG8yiiL3xJ4MVRm6OdRCbo+2OjG2HZrvNZ8hM/t yoXw/k3MFjdazBVNOZS1AxrHq0ZCg9xFMtSP5RL+Oy1ib7BJbK36A0NzbFFGDCEQI5QL 9pOVWxOnRsX+xN4SUwYe4ZytJPZaY0uxNWoQhzqLyBLYDPYfFdstmBJj4bukUdW9KGRM 0d0Q== X-Gm-Message-State: AOJu0YwpsnMT5gJgfgwiaDZqE6oI6BjEuiv6RCMgA1xezEDk4cJy1U5a zkFqsWdUZ60UpXlYriG00MUS7KT+VL6pXArniGdhJwtlKfUPwO2+vk3jneteYokG4JhmJEG2Bk2 VuipMHmIjaw== X-Gm-Gg: ASbGnctb7q5GlpIdMyMSu94GEiVxDjazh9Ca1uEob3JRIcjfDVD5O/Tmo8cRa0Onbsn kZfBtBvpXbEX4lWZhguZA1qSdUlzOGVYVINwgP8Nb3gKF1q5vTBJM8CXga0HAbK5x1AfG3nqfBu T/+g5n/TW5DGRMc9vKTa+w0FEl9rlEI9r0cKD5L4a37OpuxUEzEkA2QIvZiXiCviHiFpv9OIS43 n+Nn9kCoBGP/p3euspzgaW/dIUawCyqcsZ4lABKAiup7/8y2knpaIWpWjbgOBxI6yUlxbHyLqtu szNgCN1gEmpkl5lAIwlKN0rcuzpMzbc5P5CwunfYSw76oHgHHTzArHXLM6p+L/3BvekVIRZSdM4 IGcDPoCS3KPMzf2p0HWJXiLSa+N6MWZvboTr9+7wPIZUWzQ0MFfb4CFVMmD6uMMtdMc1Oz9XjHo Lcsz56pYjXzOzk4c9jw0rKD8kPi+NDUEBzQtA6HHojZFk= X-Google-Smtp-Source: AGHT+IGpdCaaDWJ2sGIDNgXnBYcsveSYLrNYveQujxSN3gIHiFHHDYnoSvo7DUOClcK2bBipzwxpew== X-Received: by 2002:a05:6000:2388:b0:3ea:bccc:2a2c with SMTP id ffacd0b85a97d-425577ed5d2mr1212744f8f.11.1759280546301; Tue, 30 Sep 2025 18:02:26 -0700 (PDT) From: salil.mehta@opnsrc.net To: 
qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 06/24] arm/virt, gicv3: Pre-size GIC with possible vCPUs at machine init Date: Wed, 1 Oct 2025 01:01:09 +0000 Message-Id: <20251001010127.3092631-7-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::32d; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x32d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, 
DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: fail (Header signature does not verify)
X-ZM-MESSAGEID: 1759281280884116600

From: Salil Mehta

Pre-size the GIC with the maximum possible vCPUs during machine
initialization instead of the currently enabled CPU count. This ensures that
the GIC is fully provisioned for any vCPUs that may be enabled later by
administrative or hot-add-like operations.

Pre-sizing must also include redistributors for administratively disabled
vCPUs, ensuring the GIC is fully provisioned at initialization for all
possible CPUs.

This is required because:

1. Memory regions and resources associated with GICC/GICR cannot be modified
   (added, deleted, or resized) after VM initialization.
2. The GICD_TYPER and related redistributor structures must be initialized
   with correct mp_affinity and CPU interface numbering at creation time, and
   cannot be altered later.
3. Dynamically resizing GIC CPU interfaces is unsupported and would break
   architectural guarantees; pre-sizing avoids the need for it.

This patch:
- Replaces use of `ms->smp.cpus` with `ms->smp.max_cpus` for GIC sizing,
  redistributor allocation, and interrupt wiring.
- Updates GICv3 realization to fetch CPU references via
  `machine_get_possible_cpu()` instead of `qemu_get_cpu()`, ensuring that
  CPUs not yet realized but part of the possible set are accounted for.
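
As a rough illustration of the two changes listed above (not code from this
patch), the sketch below uses simplified, hypothetical stand-in types
(`SketchCPU`, `SketchSlot`) and helper names (`sketch_get_possible_cpu`,
`sketch_gic_out_line`) rather than QEMU's real structures. It assumes only
what the diff shows: the lookup scans every possible slot by `cpu_index`, and
the GIC output lines are laid out in banks with a stride of `max_cpus`.

```c
/*
 * Illustrative sketch only. SketchCPU and SketchSlot are simplified,
 * hypothetical stand-ins, not QEMU's CPUState or CPUArchIdList.
 */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct SketchCPU {
    int64_t cpu_index;  /* logical index assigned when the object is created */
    int realized;       /* 0 while the vCPU is administratively disabled */
} SketchCPU;

typedef struct SketchSlot {
    SketchCPU *cpu;     /* NULL when no CPU object occupies this slot */
} SketchSlot;

/*
 * Scan all possible slots for a matching cpu_index, whether or not the
 * CPU has been realized. Returns NULL if no slot matches.
 */
static SketchCPU *sketch_get_possible_cpu(const SketchSlot *slots, size_t len,
                                          int64_t cpu_index)
{
    assert(slots != NULL);
    for (size_t i = 0; i < len; i++) {
        if (slots[i].cpu && slots[i].cpu->cpu_index == cpu_index) {
            return slots[i].cpu;
        }
    }
    return NULL;
}

/*
 * Output line number for signal bank 'bank' of CPU 'cpu', using the same
 * "cpu + bank * max_cpus" layout as the wiring loop in the diff
 * (GICv3 case: bank 0 = IRQ, 1 = FIQ, 2 = VIRQ, 3 = VFIQ, 4 = NMI,
 * 5 = VINMI).
 */
static unsigned sketch_gic_out_line(unsigned cpu, unsigned bank,
                                    unsigned max_cpus)
{
    assert(cpu < max_cpus);
    return cpu + bank * max_cpus;
}
```

With `max_cpus = 4`, the VIRQ line of CPU 1 is `1 + 2 * 4 = 9`. If the stride
were the enabled CPU count instead, the banks of a later-enabled CPU would
overlap those of other signals, which is why the sizing has to use the
possible-CPU count from the start.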
Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
---
 hw/arm/virt.c              | 24 ++++++++++++------------
 hw/core/machine.c          | 14 ++++++++++++++
 hw/intc/arm_gicv3_common.c |  4 ++--
 include/hw/arm/virt.h      |  2 +-
 include/hw/boards.h        | 12 ++++++++++++
 5 files changed, 41 insertions(+), 15 deletions(-)

diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index f4eeeacf6c..ee09aa19bd 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -793,7 +793,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     SysBusDevice *gicbusdev;
     const char *gictype;
     int i;
-    unsigned int smp_cpus = ms->smp.cpus;
+    unsigned int max_cpus = ms->smp.max_cpus;
     uint32_t nb_redist_regions = 0;
     int revision;
 
@@ -825,7 +825,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
 
     vms->gic = qdev_new(gictype);
     qdev_prop_set_uint32(vms->gic, "revision", revision);
-    qdev_prop_set_uint32(vms->gic, "num-cpu", smp_cpus);
+    qdev_prop_set_uint32(vms->gic, "num-cpu", max_cpus);
     /* Note that the num-irq property counts both internal and external
      * interrupts; there are always 32 of the former (mandated by GIC spec).
      */
@@ -837,7 +837,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
     if (vms->gic_version != VIRT_GIC_VERSION_2) {
         QList *redist_region_count;
         uint32_t redist0_capacity = virt_redist_capacity(vms, VIRT_GIC_REDIST);
-        uint32_t redist0_count = MIN(smp_cpus, redist0_capacity);
+        uint32_t redist0_count = MIN(max_cpus, redist0_capacity);
 
         nb_redist_regions = virt_gicv3_redist_region_count(vms);
 
@@ -848,7 +848,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                 virt_redist_capacity(vms, VIRT_HIGH_GIC_REDIST2);
 
             qlist_append_int(redist_region_count,
-                MIN(smp_cpus - redist0_count, redist1_capacity));
+                MIN(max_cpus - redist0_count, redist1_capacity));
         }
         qdev_prop_set_array(vms->gic, "redist-region-count",
                             redist_region_count);
@@ -896,8 +896,8 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
      * and the GIC's IRQ/FIQ/VIRQ/VFIQ/NMI/VINMI interrupt outputs to the
      * CPU's inputs.
      */
-    for (i = 0; i < smp_cpus; i++) {
-        DeviceState *cpudev = DEVICE(qemu_get_cpu(i));
+    for (i = 0; i < max_cpus; i++) {
+        DeviceState *cpudev = DEVICE(machine_get_possible_cpu(i));
         int intidbase = NUM_IRQS + i * GIC_INTERNAL;
         /* Mapping from the output timer irq lines from the CPU to the
          * GIC PPI inputs we use for the virt board.
@@ -926,7 +926,7 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
         } else if (vms->virt) {
             qemu_irq irq = qdev_get_gpio_in(vms->gic,
                                             intidbase + ARCH_GIC_MAINT_IRQ);
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus, irq);
+            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus, irq);
         }
 
         qdev_connect_gpio_out_named(cpudev, "pmu-interrupt", 0,
@@ -934,17 +934,17 @@ static void create_gic(VirtMachineState *vms, MemoryRegion *mem)
                                                      + VIRTUAL_PMU_IRQ));
 
         sysbus_connect_irq(gicbusdev, i, qdev_get_gpio_in(cpudev, ARM_CPU_IRQ));
-        sysbus_connect_irq(gicbusdev, i + smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_FIQ));
-        sysbus_connect_irq(gicbusdev, i + 2 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 2 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VIRQ));
-        sysbus_connect_irq(gicbusdev, i + 3 * smp_cpus,
+        sysbus_connect_irq(gicbusdev, i + 3 * max_cpus,
                            qdev_get_gpio_in(cpudev, ARM_CPU_VFIQ));
 
         if (vms->gic_version != VIRT_GIC_VERSION_2) {
-            sysbus_connect_irq(gicbusdev, i + 4 * smp_cpus,
+            sysbus_connect_irq(gicbusdev, i + 4 * max_cpus,
                                qdev_get_gpio_in(cpudev, ARM_CPU_NMI));
-            sysbus_connect_irq(gicbusdev, i + 5 * smp_cpus,
+            sysbus_connect_irq(gicbusdev, i + 5 * max_cpus,
                                qdev_get_gpio_in(cpudev, ARM_CPU_VINMI));
         }
     }
diff --git a/hw/core/machine.c b/hw/core/machine.c
index bd47527479..69d5632464 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1369,6 +1369,20 @@ bool machine_require_guest_memfd(MachineState *machine)
     return machine->cgs && machine->cgs->require_guest_memfd;
 }
 
+CPUState *machine_get_possible_cpu(int64_t cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    const CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    for (int i = 0; i < possible_cpus->len; i++) {
+        if (possible_cpus->cpus[i].cpu &&
+            possible_cpus->cpus[i].cpu->cpu_index == cpu_index) {
+            return possible_cpus->cpus[i].cpu;
+        }
+    }
+    return NULL;
+}
+
 static char *cpu_slot_to_string(const CPUArchId *cpu)
 {
     GString *s = g_string_new(NULL);
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index e438d8c042..f6a9f1c68b 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -32,7 +32,7 @@
 #include "gicv3_internal.h"
 #include "hw/arm/linux-boot-if.h"
 #include "system/kvm.h"
-
+#include "hw/boards.h"
 
 static void gicv3_gicd_no_migration_shift_bug_post_load(GICv3State *cs)
 {
@@ -436,7 +436,7 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
     s->cpu = g_new0(GICv3CPUState, s->num_cpu);
 
     for (i = 0; i < s->num_cpu; i++) {
-        CPUState *cpu = qemu_get_cpu(i);
+        CPUState *cpu = machine_get_possible_cpu(i);
         uint64_t cpu_affid;
 
         s->cpu[i].cpu = cpu;
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 683e4b965a..ace4154cc6 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -209,7 +209,7 @@ static inline int virt_gicv3_redist_region_count(VirtMachineState *vms)
 
     assert(vms->gic_version != VIRT_GIC_VERSION_2);
 
-    return (MACHINE(vms)->smp.cpus > redist0_capacity &&
+    return (MACHINE(vms)->smp.max_cpus > redist0_capacity &&
             vms->highmem_redists) ? 2 : 1;
 }
 
diff --git a/include/hw/boards.h b/include/hw/boards.h
index b27c2326a2..3ff77a8b3a 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -118,6 +118,18 @@ bool device_is_dynamic_sysbus(MachineClass *mc, DeviceState *dev);
 MemoryRegion *machine_consume_memdev(MachineState *machine,
                                      HostMemoryBackend *backend);
 
+/**
+ * machine_get_possible_cpu: Gets 'CPUState' for the CPU with the given logical
+ * cpu_index. The slot index in possible_cpus[] list is always sequential, but
+ * 'cpu_index' values may not be sequential depending on machine implementation
+ * (e.g. with hotplug/unplug). Therefore, this function must scan the list to
+ * find a match.
+ * @cpu_index: logical cpu index to search for 'CPUState' + * + * Returns: pointer to CPUState, or NULL if not found. + */ +CPUState *machine_get_possible_cpu(int64_t cpu_index); + /** * CPUArchId: * @arch_id - architecture-dependent CPU ID of present or possible CPU --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281298; cv=none; d=zohomail.com; s=zohoarc; b=L6giJjOmiTnWSTtjyaTb4jmCwgkXreDhnJ8fl/ToqKHe5b8H2bgSPtl5DogHRpeJQsz36drMkQf1A6WC0e7I79BLIfeZtqIlSu6vTzJwtt7Ea9xl7JrhlsZ6sWL5GmqTbUASFBF7EEokcwF3o8IYdvR4GN0LEtS/tD8nH9tcM5g= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281298; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=WtDeaYzA1MV6zzT9oYsDOCleQdedNz0SYja7ELSAFkY=; b=JUTwsJx8qs67KozLe5j2iNer7hr2J1IdjLGuN4yHwyHqHUN4P9JPWfBJF3bFek7qUG4oy4cwC3FQapr86aBupcsz2y0S6DBOcv66mO7QaIFFu/f5XQmgCuqZcNTXbyCq8wZE0w4MVtUItU5LObnO8tIz7m76duo5WhNpSaSTACE= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281298749826.0196584653827; Tue, 30 Sep 2025 18:14:58 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lFQ-0005Ks-56; Tue, 30 Sep 2025 21:03:12 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by 
lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFJ-0005In-MS for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:05 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lEp-000826-QH for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:05 -0400 Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-46e5b7dfeb0so7059165e9.1 for ; Tue, 30 Sep 2025 18:02:33 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280549; x=1759885349; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=WtDeaYzA1MV6zzT9oYsDOCleQdedNz0SYja7ELSAFkY=; b=d1Aib1xF9goxnwiMn5MC5gqn5ynB0+OU/wo1H+IbRZQfTQdR1lPUjdNiVzuAg5KKYy SqSk+pwunjI6f6CedgLGCxZqZibDgc/bcITkycScu5Jqer11UI79p22iHxIQM0xwlOZd GljJTy3iIAhA9PWX1bQnA+bi4lg+blZfpVjYGZmRKmFLuBh9nyAIKhePJBdrjdk68raA A29S4lumxkBXb3QFLhzh6TtJiO4HQ+ka4a/RokTDC+JYAMM1+fhtrQqgS5jtko6xTHOs 71H3hCp6NOR/tuyo/14Y8Ft4/FsVelDCIsJktqeo4gSxKJ0dATC5GGyvakPMa4iimDh/ ncwA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280549; x=1759885349; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=WtDeaYzA1MV6zzT9oYsDOCleQdedNz0SYja7ELSAFkY=; b=mBvGFGfW1/dKQIUTuruQ2PTWERc1XY/Krh4FH2XqP1Svuz7RTqxtoRYZFNH3mWbmU5 eT4spSffYYxNcnTluy/u1r9ZCf//D9mrzqDFIqeMqn0dqduV+7NqB6ux4dt4I76m0zrm 
KZIUYRtHO7znD41nvkj9Ljgg5/Hi1eqdQgJ7Wg4J5CfNlwz+EcHf3A5TmTglg2qr62zE 01KkiKGa461YcJvIfSrHZfwGILfMu0AP7XEQpm+7OnvEtKcUb0Uw0svcbqcDhoQSl4s2 sGD/JkX6aihMtpJdUzs+uNLQlpQfNCp2DEi0jgYr+ztjZfSV9NiAdKgMapMsQ77KGAbI vESA== X-Gm-Message-State: AOJu0Yx6oYK4JdQl+hNe7ZxTZuO4x82nOU6ytzpjKVbrVeOx3DsfJuFd 2wSrWex1+iMvnbmT/t2wYfkWl++KtiE/lK27t19/4suV7mPyD7JGihTnH1b+QtundaEWKYMINs9 8nQ5A06QUgw== X-Gm-Gg: ASbGncsWd/zxiR81NPNCcwj9dy8jeQKVnIpdGYd1L4xut/+dQlIyGvZwRjfgePh84EU +EkUtZ5oCTEf8AjxIqOKrd97ulFVtf3LjZyefFTx4NFZHRBWxhaCu/i1u+wMmgX+J6b8ewOJYJp qnZ3YVb1rynLA+6dudI5lVCXIthl6ZIVTFOyvfR0GzxZ8nIvLYQB1wFfN4JaA+YUFyfpTNSs0YK Pc7aBuSB3p2z9c/xLRuS0/oORaO72h+EOMHfz8YpH357EX1wiZSVac9NSICxBpzgWA/xrlpMwmh YgS5gpB/EBSHbBY3UHIb5idTB9i0q8vddM2RhD4xqpOMxy0kZiJ+pnJEbfMKl/QzkBuWqZJQF6x z4A0FL7mOmApLviprLsOgN2QJRKyR0FriFRWztnJnKSwHgYTiS01W3QqMyx2AVc9C9cxC60Usq7 9HMkeVNeRYDcWSLFPtuLRZdgFJHGt6t3CBmgctRsWI7x4= X-Google-Smtp-Source: AGHT+IGmImWBLExOHMlhhQZSBnRbv0NKvnMBuXTQlTH3hM2ZcVnond1I1jMy53iy5AXLeDb5OLHr7Q== X-Received: by 2002:a05:600c:3543:b0:46e:4921:9443 with SMTP id 5b1f17b1804b1-46e612e3f92mr13058985e9.37.1759280548909; Tue, 30 Sep 2025 18:02:28 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, 
zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 07/24] arm/gicv3: Refactor CPU interface init for shared TCG/KVM use
Date: Wed, 1 Oct 2025 01:01:10 +0000
Message-Id: <20251001010127.3092631-8-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable
Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org;
Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x329.google.com
X-Spam_score_int: -20
X-Spam_score: -2.1
X-Spam_bar: --
X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id:
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org
Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org
X-ZohoMail-DKIM: pass (identity @opnsrc.net)
X-ZM-MESSAGEID: 1759281301176116600

From: Salil Mehta

GICv3 CPU interface initialization currently has separate logic paths for
TCG and KVM accelerators, even though much of the flow, such as iterating
over vCPUs and applying common setup, should be identical.
This separation makes it harder to add new CPU interface features that
apply to both backends, as each needs to be updated individually.

To address this, the common CPU interface setup is now centralized in
'gicv3_init_cpuif()', called during GIC realization. Accelerator-specific
code is still handled via a class hook for register-level initialization,
but all iteration and shared setup are unified.

This refactoring is required to:
 - Ensure later patches can set 'gicc_accessible' for all vCPUs in a
   consistent manner.
 - Provide a single entry point for any future common initialization,
   avoiding duplication between TCG and KVM.
 - Maintain correct initialization for both enabled and administratively
   disabled but present vCPUs.

No functional change intended here.

Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
---
 hw/intc/arm_gicv3.c                |   1 +
 hw/intc/arm_gicv3_cpuif.c          | 262 ++++++++++++++---------------
 hw/intc/arm_gicv3_cpuif_common.c   |  11 ++
 hw/intc/arm_gicv3_kvm.c            |  12 +-
 hw/intc/gicv3_internal.h           |   1 +
 include/hw/intc/arm_gicv3_common.h |   1 +
 6 files changed, 150 insertions(+), 138 deletions(-)

diff --git a/hw/intc/arm_gicv3.c b/hw/intc/arm_gicv3.c
index 6059ce926a..8ca61413d2 100644
--- a/hw/intc/arm_gicv3.c
+++ b/hw/intc/arm_gicv3.c
@@ -459,6 +459,7 @@ static void arm_gicv3_class_init(ObjectClass *klass, const void *data)
     ARMGICv3Class *agc = ARM_GICV3_CLASS(klass);
 
     agcc->post_load = arm_gicv3_post_load;
+    agcc->init_cpu_reginfo = gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, arm_gic_realize, &agc->parent_realize);
 }
 
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index 4b4cf09157..a7904237ac 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -3016,154 +3016,150 @@ static void gicv3_cpuif_el_change_hook(ARMCPU *cpu, void *opaque)
     gicv3_cpuif_virt_irq_fiq_update(cs);
 }
 
-void gicv3_init_cpuif(GICv3State *s)
+void gicv3_init_cpu_reginfo(CPUState *cs)
 {
     /* Called from the GICv3 realize function; register our system
      * registers with the CPU
      */
-    int i;
-
-    for (i = 0; i < s->num_cpu; i++) {
-        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-        GICv3CPUState *cs = &s->cpu[i];
+    ARMCPU *cpu = ARM_CPU(cs);
+    GICv3CPUState *gcs = icc_cs_from_env(&cpu->env);
 
-        /*
-         * If the CPU doesn't define a GICv3 configuration, probably because
-         * in real hardware it doesn't have one, then we use default values
-         * matching the one used by most Arm CPUs. This applies to:
-         *  cpu->gic_num_lrs
-         *  cpu->gic_vpribits
-         *  cpu->gic_vprebits
-         *  cpu->gic_pribits
-         */
+    /*
+     * If the CPU doesn't define a GICv3 configuration, probably because
+     * in real hardware it doesn't have one, then we use default values
+     * matching the one used by most Arm CPUs. This applies to:
+     *  cpu->gic_num_lrs
+     *  cpu->gic_vpribits
+     *  cpu->gic_vprebits
+     *  cpu->gic_pribits
+     */
 
-        /* Note that we can't just use the GICv3CPUState as an opaque pointer
-         * in define_arm_cp_regs_with_opaque(), because when we're called back
-         * it might be with code translated by CPU 0 but run by CPU 1, in
-         * which case we'd get the wrong value.
-         * So instead we define the regs with no ri->opaque info, and
-         * get back to the GICv3CPUState from the CPUARMState.
-         *
-         * These CP regs callbacks can be called from either TCG or HVF code.
-         */
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
+    /* Note that we can't just use the GICv3CPUState as an opaque pointer
+     * in define_arm_cp_regs_with_opaque(), because when we're called back
+     * it might be with code translated by CPU 0 but run by CPU 1, in
+     * which case we'd get the wrong value.
+     * So instead we define the regs with no ri->opaque info, and
+     * get back to the GICv3CPUState from the CPUARMState.
+     *
+     * These CP regs callbacks can be called from either TCG or HVF code.
+     */
+    define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
 
-        /*
-         * If the CPU implements FEAT_NMI and FEAT_GICv3 it must also
-         * implement FEAT_GICv3_NMI, which is the CPU interface part
-         * of NMI support. This is distinct from whether the GIC proper
-         * (redistributors and distributor) have NMI support. In QEMU
-         * that is a property of the GIC device in s->nmi_support;
-         * cs->nmi_support indicates the CPU interface's support.
-         */
-        if (cpu_isar_feature(aa64_nmi, cpu)) {
-            cs->nmi_support = true;
-            define_arm_cp_regs(cpu, gicv3_cpuif_gicv3_nmi_reginfo);
-        }
+    /*
+     * If the CPU implements FEAT_NMI and FEAT_GICv3 it must also
+     * implement FEAT_GICv3_NMI, which is the CPU interface part
+     * of NMI support. This is distinct from whether the GIC proper
+     * (redistributors and distributor) have NMI support. In QEMU
+     * that is a property of the GIC device in s->nmi_support;
+     * gcs->nmi_support indicates the CPU interface's support.
+     */
+    if (cpu_isar_feature(aa64_nmi, cpu)) {
+        gcs->nmi_support = true;
+        define_arm_cp_regs(cpu, gicv3_cpuif_gicv3_nmi_reginfo);
+    }
 
-        /*
-         * The CPU implementation specifies the number of supported
-         * bits of physical priority. For backwards compatibility
-         * of migration, we have a compat property that forces use
-         * of 8 priority bits regardless of what the CPU really has.
-         */
-        if (s->force_8bit_prio) {
-            cs->pribits = 8;
-        } else {
-            cs->pribits = cpu->gic_pribits ?: 5;
-        }
+    /*
+     * The CPU implementation specifies the number of supported
+     * bits of physical priority. For backwards compatibility
+     * of migration, we have a compat property that forces use
+     * of 8 priority bits regardless of what the CPU really has.
+     */
+    if (gcs->gic->force_8bit_prio) {
+        gcs->pribits = 8;
+    } else {
+        gcs->pribits = cpu->gic_pribits ?: 5;
+    }
 
-        /*
-         * The GICv3 has separate ID register fields for virtual priority
-         * and preemption bit values, but only a single ID register field
-         * for the physical priority bits. The preemption bit count is
-         * always the same as the priority bit count, except that 8 bits
-         * of priority means 7 preemption bits. We precalculate the
-         * preemption bits because it simplifies the code and makes the
-         * parallels between the virtual and physical bits of the GIC
-         * a bit clearer.
-         */
-        cs->prebits = cs->pribits;
-        if (cs->prebits == 8) {
-            cs->prebits--;
-        }
-        /*
-         * Check that CPU code defining pribits didn't violate
-         * architectural constraints our implementation relies on.
-         */
-        g_assert(cs->pribits >= 4 && cs->pribits <= 8);
+    /*
+     * The GICv3 has separate ID register fields for virtual priority
+     * and preemption bit values, but only a single ID register field
+     * for the physical priority bits. The preemption bit count is
+     * always the same as the priority bit count, except that 8 bits
+     * of priority means 7 preemption bits. We precalculate the
+     * preemption bits because it simplifies the code and makes the
+     * parallels between the virtual and physical bits of the GIC
+     * a bit clearer.
+     */
+    gcs->prebits = gcs->pribits;
+    if (gcs->prebits == 8) {
+        gcs->prebits--;
+    }
+    /*
+     * Check that CPU code defining pribits didn't violate
+     * architectural constraints our implementation relies on.
+     */
+    g_assert(gcs->pribits >= 4 && gcs->pribits <= 8);
 
-        /*
-         * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
-         * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
-         */
-        if (cs->prebits >= 6) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
-        }
-        if (cs->prebits == 7) {
-            define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
-        }
+    /*
+     * gicv3_cpuif_reginfo[] defines ICC_AP*R0_EL1; add definitions
+     * for ICC_AP*R{1,2,3}_EL1 if the prebits value requires them.
+     */
+    if (gcs->prebits >= 6) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr1_reginfo);
+    }
+    if (gcs->prebits == 7) {
+        define_arm_cp_regs(cpu, gicv3_cpuif_icc_apxr23_reginfo);
+    }
 
-        if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
-            int j;
+    if (arm_feature(&cpu->env, ARM_FEATURE_EL2)) {
+        int j;
 
-            cs->num_list_regs = cpu->gic_num_lrs ?: 4;
-            cs->vpribits = cpu->gic_vpribits ?: 5;
-            cs->vprebits = cpu->gic_vprebits ?: 5;
+        gcs->num_list_regs = cpu->gic_num_lrs ?: 4;
+        gcs->vpribits = cpu->gic_vpribits ?: 5;
+        gcs->vprebits = cpu->gic_vprebits ?: 5;
 
-            /* Check against architectural constraints: getting these
-             * wrong would be a bug in the CPU code defining these,
-             * and the implementation relies on them holding.
-             */
-            g_assert(cs->vprebits <= cs->vpribits);
-            g_assert(cs->vprebits >= 5 && cs->vprebits <= 7);
-            g_assert(cs->vpribits >= 5 && cs->vpribits <= 8);
+        /* Check against architectural constraints: getting these
+         * wrong would be a bug in the CPU code defining these,
+         * and the implementation relies on them holding.
+         */
+        g_assert(gcs->vprebits <= gcs->vpribits);
+        g_assert(gcs->vprebits >= 5 && gcs->vprebits <= 7);
+        g_assert(gcs->vpribits >= 5 && gcs->vpribits <= 8);
 
-            define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
+        define_arm_cp_regs(cpu, gicv3_cpuif_hcr_reginfo);
 
-            for (j = 0; j < cs->num_list_regs; j++) {
-                /* Note that the AArch64 LRs are 64-bit; the AArch32 LRs
-                 * are split into two cp15 regs, LR (the low part, with the
-                 * same encoding as the AArch64 LR) and LRC (the high part).
-                 */
-                ARMCPRegInfo lr_regset[] = {
-                    { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
-                      .opc0 = 3, .opc1 = 4, .crn = 12,
-                      .crm = 12 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .nv2_redirect_offset = 0x400 + 8 * j,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                    { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
-                      .cp = 15, .opc1 = 4, .crn = 12,
-                      .crm = 14 + (j >> 3), .opc2 = j & 7,
-                      .type = ARM_CP_IO | ARM_CP_NO_RAW,
-                      .access = PL2_RW,
-                      .readfn = ich_lr_read,
-                      .writefn = ich_lr_write,
-                    },
-                };
-                define_arm_cp_regs(cpu, lr_regset);
-            }
-            if (cs->vprebits >= 6) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
-            }
-            if (cs->vprebits == 7) {
-                define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
-            }
-        }
-        if (tcg_enabled() || qtest_enabled()) {
-            /*
-             * We can only trap EL changes with TCG. However the GIC interrupt
-             * state only changes on EL changes involving EL2 or EL3, so for
-             * the non-TCG case this is OK, as EL2 and EL3 can't exist.
+        for (j = 0; j < gcs->num_list_regs; j++) {
+            /* Note that the AArch64 LRs are 64-bit; the AArch32 LRs
+             * are split into two cp15 regs, LR (the low part, with the
+             * same encoding as the AArch64 LR) and LRC (the high part).
              */
-            arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, cs);
-        } else {
-            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
-            assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
-        }
+            ARMCPRegInfo lr_regset[] = {
+                { .name = "ICH_LRn_EL2", .state = ARM_CP_STATE_BOTH,
+                  .opc0 = 3, .opc1 = 4, .crn = 12,
+                  .crm = 12 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .nv2_redirect_offset = 0x400 + 8 * j,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+                { .name = "ICH_LRCn_EL2", .state = ARM_CP_STATE_AA32,
+                  .cp = 15, .opc1 = 4, .crn = 12,
+                  .crm = 14 + (j >> 3), .opc2 = j & 7,
+                  .type = ARM_CP_IO | ARM_CP_NO_RAW,
+                  .access = PL2_RW,
+                  .readfn = ich_lr_read,
+                  .writefn = ich_lr_write,
+                },
+            };
+            define_arm_cp_regs(cpu, lr_regset);
+        }
+        if (gcs->vprebits >= 6) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr1_reginfo);
+        }
+        if (gcs->vprebits == 7) {
+            define_arm_cp_regs(cpu, gicv3_cpuif_ich_apxr23_reginfo);
+        }
+    }
+    if (tcg_enabled() || qtest_enabled()) {
+        /*
+         * We can only trap EL changes with TCG. However the GIC interrupt
+         * state only changes on EL changes involving EL2 or EL3, so for
+         * the non-TCG case this is OK, as EL2 and EL3 can't exist.
+         */
+        arm_register_el_change_hook(cpu, gicv3_cpuif_el_change_hook, gcs);
+    } else {
+        assert(!arm_feature(&cpu->env, ARM_FEATURE_EL2));
+        assert(!arm_feature(&cpu->env, ARM_FEATURE_EL3));
     }
 }
diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
index ff1239f65d..f9a9b2d8a3 100644
--- a/hw/intc/arm_gicv3_cpuif_common.c
+++ b/hw/intc/arm_gicv3_cpuif_common.c
@@ -20,3 +20,14 @@ void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
 
     env->gicv3state = (void *)s;
 };
+
+void gicv3_init_cpuif(GICv3State *s)
+{
+    ARMGICv3CommonClass *agcc = ARM_GICV3_COMMON_GET_CLASS(s);
+    int i;
+
+    /* define and register `system registers` with the vCPU */
+    for (i = 0; i < s->num_cpu; i++) {
+        agcc->init_cpu_reginfo(s->cpu[i].cpu);
+    }
+}
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 6166283cd1..4ca889da45 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -776,6 +776,10 @@ static void vm_change_state_handler(void *opaque, bool running,
     }
 }
 
+static void kvm_gicv3_init_cpu_reginfo(CPUState *cs)
+{
+    define_arm_cp_regs(ARM_CPU(cs), gicv3_cpuif_reginfo);
+}
 
 static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
 {
@@ -811,11 +815,8 @@ static void kvm_arm_gicv3_realize(DeviceState *dev, Error **errp)
 
     gicv3_init_irqs_and_mmio(s, kvm_arm_gicv3_set_irq, NULL);
 
-    for (i = 0; i < s->num_cpu; i++) {
-        ARMCPU *cpu = ARM_CPU(qemu_get_cpu(i));
-
-        define_arm_cp_regs(cpu, gicv3_cpuif_reginfo);
-    }
+    /* initialize vCPU interface */
+    gicv3_init_cpuif(s);
 
     /* Try to create the device via the device control API */
     s->dev_fd = kvm_create_device(kvm_state, KVM_DEV_TYPE_ARM_VGIC_V3, false);
@@ -929,6 +930,7 @@ static void kvm_arm_gicv3_class_init(ObjectClass *klass, const void *data)
 
     agcc->pre_save = kvm_arm_gicv3_get;
     agcc->post_load = kvm_arm_gicv3_put;
+    agcc->init_cpu_reginfo = kvm_gicv3_init_cpu_reginfo;
     device_class_set_parent_realize(dc, kvm_arm_gicv3_realize,
                                     &kgc->parent_realize);
     resettable_class_set_parent_phases(rc, NULL, kvm_arm_gicv3_reset_hold, NULL,
diff --git a/hw/intc/gicv3_internal.h b/hw/intc/gicv3_internal.h
index bc9f518fe8..cc8edc499b 100644
--- a/hw/intc/gicv3_internal.h
+++ b/hw/intc/gicv3_internal.h
@@ -722,6 +722,7 @@ void gicv3_redist_vinvall(GICv3CPUState *cs, uint64_t vptaddr);
 
 void gicv3_redist_send_sgi(GICv3CPUState *cs, int grp, int irq, bool ns);
 void gicv3_init_cpuif(GICv3State *s);
+void gicv3_init_cpu_reginfo(CPUState *cs);
 
 /**
  * gicv3_cpuif_update:
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index c18503869f..3720728227 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -313,6 +313,7 @@ struct ARMGICv3CommonClass {
 
     void (*pre_save)(GICv3State *s);
     void (*post_load)(GICv3State *s);
+    void (*init_cpu_reginfo)(CPUState *cs);
 };
 
 void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org,
 mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
 rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com,
 linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn,
 lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 08/24] arm/virt, gicv3: Guard CPU interface access for admin disabled vCPUs
Date: Wed, 1 Oct 2025 01:01:11 +0000
Message-Id: <20251001010127.3092631-9-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>

From: Salil Mehta

Per the Arm GIC Architecture Specification (IHI0069H_b, §11.1), the CPU
interface and its Processing Element (PE) share a power domain. If the PE
is powered down or administratively disabled, the CPU interface must be
quiescent or off, and any access is architecturally UNPREDICTABLE. Without
explicit checks, QEMU may issue GICC register operations for vCPUs that
are offline, removed, or otherwise unavailable, risking inconsistent state
or undefined behavior in both TCG and KVM accelerators.

To address this, introduce a per-vCPU gicc_accessible flag that reflects
the administrative enablement of the corresponding QOM vCPU, in accordance
with host policy. This is permissible when the GICC (GIC CPU Interface) is
online-capable, meaning vCPUs can be brought online in the guest kernel
after boot. The flag is set during GIC realization and is used to skip
VGIC register reads/writes, SGI generation, and CPU interface updates when
the GICC is not accessible. This prevents unsafe operations and keeps QEMU
within the architecture when managing administratively disabled but
present vCPUs.
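
For illustration only (not part of this patch): a minimal standalone
sketch of the guard pattern described above. The DemoVcpu and DemoCpuIf
types and the demo_* functions are hypothetical stand-ins, not QEMU
types or APIs; the sketch only assumes that accessibility is derived
once from the vCPU's administrative state and then checked before any
per-CPU-interface operation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a vCPU; not the QEMU CPUState. */
typedef struct {
    bool admin_enabled;      /* host policy: may the guest use this vCPU? */
} DemoVcpu;

/* Hypothetical stand-in for a CPU interface; not the QEMU GICv3CPUState. */
typedef struct {
    bool gicc_accessible;    /* mirrors the per-vCPU flag in the text */
    unsigned sgi_pending;    /* number of SGIs delivered to this interface */
} DemoCpuIf;

/* "Realize" step: derive accessibility from each vCPU's admin state. */
static void demo_realize(DemoCpuIf *cpuif, const DemoVcpu *vcpus, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        cpuif[i].gicc_accessible = vcpus[i].admin_enabled;
        cpuif[i].sgi_pending = 0;
    }
}

/*
 * Broadcast an SGI to every CPU interface except the sender, skipping
 * interfaces that are marked inaccessible. Returns the number of
 * interfaces that actually received the SGI.
 */
static size_t demo_send_sgi_all(DemoCpuIf *cpuif, size_t n, size_t self)
{
    size_t delivered = 0;

    assert(self < n);
    for (size_t i = 0; i < n; i++) {
        if (i == self || !cpuif[i].gicc_accessible) {
            continue;        /* never touch a disabled vCPU's interface */
        }
        cpuif[i].sgi_pending++;
        delivered++;
    }
    return delivered;
}
```

With four vCPUs of which only vCPU 0 and vCPU 2 are enabled, a broadcast
from vCPU 0 in this sketch would reach vCPU 2 alone; the interfaces of
the disabled vCPUs are left untouched.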

Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
---
 hw/core/qdev.c                     | 26 +++++++++++++++++
 hw/intc/arm_gicv3_common.c         | 23 +++++++++++++++
 hw/intc/arm_gicv3_cpuif.c          |  8 +++++
 hw/intc/arm_gicv3_cpuif_common.c   | 47 ++++++++++++++++++++++++++++++
 hw/intc/arm_gicv3_kvm.c            | 18 ++++++++++++
 include/hw/intc/arm_gicv3_common.h | 24 +++++++++++++++
 include/hw/qdev-core.h             | 24 +++++++++++++++
 7 files changed, 170 insertions(+)

diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 5816abae39..8e9a4da6b5 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -326,6 +326,32 @@ bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp)
                                    errp);
 }
 
+int qdev_get_admin_power_state(DeviceState *dev)
+{
+    DeviceClass *dc;
+
+    if (!dev) {
+        return DEVICE_ADMIN_POWER_STATE_REMOVED;
+    }
+
+    dc = DEVICE_GET_CLASS(dev);
+    if (dc->admin_power_state_supported) {
+        return object_property_get_enum(OBJECT(dev), "admin_power_state",
+                                        "DeviceAdminPowerState", NULL);
+    }
+
+    return DEVICE_ADMIN_POWER_STATE_ENABLED;
+}
+
+bool qdev_check_enabled(DeviceState *dev)
+{
+    /*
+     * if device supports power state transitions, check if it is not in
+     * 'disabled' state.
+     */
+    return qdev_get_admin_power_state(dev) == DEVICE_ADMIN_POWER_STATE_ENABLED;
+}
+
 bool qdev_machine_modified(void)
 {
     return qdev_hot_added || qdev_hot_removed;
diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c
index f6a9f1c68b..f4428ad165 100644
--- a/hw/intc/arm_gicv3_common.c
+++ b/hw/intc/arm_gicv3_common.c
@@ -439,6 +439,29 @@ static void arm_gicv3_common_realize(DeviceState *dev, Error **errp)
         CPUState *cpu = machine_get_possible_cpu(i);
         uint64_t cpu_affid;
 
+        /*
+         * Ref: Arm Generic Interrupt Controller Architecture Specification
+         * (GIC Architecture version 3 and version 4), IHI0069H_b,
+         * Section 11.1: Power Management
+         * https://developer.arm.com/documentation/ihi0069
+         *
+         * According to this specification, the CPU interface and the
+         * Processing Element (PE) must reside in the same power domain.
+         * Therefore, when a CPU/PE is powered off, its corresponding CPU
+         * interface must also be in the off state or in a quiescent state,
+         * depending on the state of the associated Redistributor.
+         *
+         * The Redistributor may reside in a separate power domain and may
+         * remain powered even when the associated PE is turned off.
+         *
+         * Accessing the GIC CPU interface while the PE is powered down can
+         * lead to UNPREDICTABLE behavior.
+         *
+         * Accordingly, the QOM object `GICv3CPUState` should be marked as
+         * either accessible or inaccessible based on the power state of the
+         * associated `CPUState` vCPU.
+         */
+        s->cpu[i].gicc_accessible = qdev_check_enabled(DEVICE(cpu));
         s->cpu[i].cpu = cpu;
         s->cpu[i].gic = s;
         /* Store GICv3CPUState in CPUARMState gicv3state pointer */
diff --git a/hw/intc/arm_gicv3_cpuif.c b/hw/intc/arm_gicv3_cpuif.c
index a7904237ac..6430b2c649 100644
--- a/hw/intc/arm_gicv3_cpuif.c
+++ b/hw/intc/arm_gicv3_cpuif.c
@@ -1052,6 +1052,10 @@ void gicv3_cpuif_update(GICv3CPUState *cs)
     ARMCPU *cpu = ARM_CPU(cs->cpu);
     CPUARMState *env = &cpu->env;
 
+    if (!gicv3_gicc_accessible(OBJECT(cs->gic), CPU(cpu)->cpu_index)) {
+        return;
+    }
+
     g_assert(bql_locked());
 
     trace_gicv3_cpuif_update(gicv3_redist_affid(cs), cs->hppi.irq,
@@ -2036,6 +2040,10 @@ static void icc_generate_sgi(CPUARMState *env, GICv3CPUState *cs,
     for (i = 0; i < s->num_cpu; i++) {
         GICv3CPUState *ocs = &s->cpu[i];
 
+        if (!gicv3_gicc_accessible(OBJECT(s), i)) {
+            continue;
+        }
+
         if (irm) {
             /* IRM == 1 : route to all CPUs except self */
             if (cs == ocs) {
diff --git a/hw/intc/arm_gicv3_cpuif_common.c b/hw/intc/arm_gicv3_cpuif_common.c
index f9a9b2d8a3..8f9a5b6fa2 100644
--- a/hw/intc/arm_gicv3_cpuif_common.c
+++ b/hw/intc/arm_gicv3_cpuif_common.c
@@ -12,6 +12,9 @@
 #include "qemu/osdep.h"
 #include "gicv3_internal.h"
 #include "cpu.h"
+#include "qemu/log.h"
+#include "monitor/monitor.h"
+#include "qapi/visitor.h"
 
 void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
 {
@@ -21,6 +24,41 @@ void gicv3_set_gicv3state(CPUState *cpu, GICv3CPUState *s)
     env->gicv3state = (void *)s;
 };
 
+static void
+gicv3_get_gicc_accessibility(Object *obj, Visitor *v, const char *name,
+                             void *opaque, Error **errp)
+{
+    GICv3CPUState *cs = (GICv3CPUState *)opaque;
+    bool value = cs->gicc_accessible;
+
+    visit_type_bool(v, name, &value, errp);
+}
+
+static void
+gicv3_set_gicc_accessibility(Object *obj, Visitor *v, const char *name,
+                             void *opaque, Error **errp)
+{
+    GICv3CPUState *gcs = opaque;
+    CPUState *cs = gcs->cpu;
+    bool value;
+
+    visit_type_bool(v, name, &value, errp);
+
+    /* Block external attempts to set */
+    if (monitor_cur_is_qmp()) {
+        error_setg(errp, "Property 'gicc-accessible' is read-only externally");
+        return;
+    }
+
+    if (gcs->gicc_accessible != value) {
+        gcs->gicc_accessible = value;
+
+        qemu_log_mask(LOG_UNIMP,
+                      "GICC accessibility changed: vCPU %d = %s\n",
+                      cs->cpu_index, value ? "accessible" : "inaccessible");
+    }
+}
+
 void gicv3_init_cpuif(GICv3State *s)
 {
     ARMGICv3CommonClass *agcc = ARM_GICV3_COMMON_GET_CLASS(s);
@@ -28,6 +66,15 @@ void gicv3_init_cpuif(GICv3State *s)
 
     /* define and register `system registers` with the vCPU */
     for (i = 0; i < s->num_cpu; i++) {
+        g_autofree char *propname = g_strdup_printf("gicc-accessible[%d]", i);
+        object_property_add(OBJECT(s), propname, "bool",
+                            gicv3_get_gicc_accessibility,
+                            gicv3_set_gicc_accessibility,
+                            NULL, &s->cpu[i]);
+
+        object_property_set_description(OBJECT(s), propname,
+                "Per-vCPU GICC interface accessibility (internal set only)");
+
         agcc->init_cpu_reginfo(s->cpu[i].cpu);
     }
 }
diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index 4ca889da45..e97578f59a 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -457,6 +457,16 @@ static void kvm_arm_gicv3_put(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /*
+         * We must ensure that we do not attempt to access or update KVM GICC
+         * registers if their corresponding QOM `GICv3CPUState` is marked as
+         * 'inaccessible', because their corresponding QOM vCPU objects
+         * are in administratively 'disabled' state.
+         */
+        if (!gicv3_gicc_accessible(OBJECT(s), ncpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, true);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], true);
@@ -615,6 +625,14 @@ static void kvm_arm_gicv3_get(GICv3State *s)
         GICv3CPUState *c = &s->cpu[ncpu];
         int num_pri_bits;
 
+        /*
+         * don't attempt to access KVM VGIC for the disabled vCPUs where
+         * GICv3CPUState is inaccessible.
+         */
+        if (!gicv3_gicc_accessible(OBJECT(s), ncpu)) {
+            continue;
+        }
+
         kvm_gicc_access(s, ICC_SRE_EL1, ncpu, &c->icc_sre_el1, false);
         kvm_gicc_access(s, ICC_CTLR_EL1, ncpu,
                         &c->icc_ctlr_el1[GICV3_NS], false);
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index 3720728227..bbf899184e 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -27,6 +27,7 @@
 #include "hw/sysbus.h"
 #include "hw/intc/arm_gic_common.h"
 #include "qom/object.h"
+#include "qapi/error.h"
 
 /*
  * Maximum number of possible interrupts, determined by the GIC architecture.
@@ -164,6 +165,7 @@ struct GICv3CPUState {
     uint64_t icc_apr[3][4];
     uint64_t icc_igrpen[3];
     uint64_t icc_ctlr_el3;
+    bool gicc_accessible;
 
     /* Virtualization control interface */
     uint64_t ich_apr[3][4]; /* ich_apr[GICV3_G1][x] never used */
@@ -329,4 +331,26 @@ void gicv3_init_irqs_and_mmio(GICv3State *s, qemu_irq_handler handler,
  */
 const char *gicv3_class_name(void);
 
+/**
+ * gicv3_gicc_accessible:
+ * @obj: QOM object implementing the GICv3 device
+ * @cpu: Index of the vCPU whose GICC accessibility is being queried
+ *
+ * Returns: true if the GICC interface for vCPU @cpu is accessible.
+ * Uses QOM property lookup for "gicc-accessible[%d]".
+ */
+static inline bool gicv3_gicc_accessible(Object *obj, int cpu)
+{
+    g_autofree gchar *propname = g_strdup_printf("gicc-accessible[%d]", cpu);
+    Error *local_err = NULL;
+    bool value;
+
+    value = object_property_get_bool(obj, propname, &local_err);
+    if (local_err) {
+        error_report_err(local_err);
+        return false;
+    }
+
+    return value;
+}
 #endif
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 2c22b32a3f..b1d3fa4a25 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -589,6 +589,30 @@ bool qdev_realize_and_unref(DeviceState *dev, BusState *bus, Error **errp);
  */
 bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp);
 
+/**
+ * qdev_check_enabled - Check if a device is administratively enabled
+ * @dev: The device to check
+ *
+ * This function returns whether the device is currently in administrative
+ * ENABLED state. It does not reflect runtime operational power state, but
+ * rather the host policy on whether the guest may interact with the device.
+ *
+ * Returns true if the device is administratively enabled; false otherwise.
+ */
+bool qdev_check_enabled(DeviceState *dev);
+
+/**
+ * qdev_get_admin_power_state - Query administrative power state of a device
+ * @dev: The device whose state is being queried
+ *
+ * Returns the current administrative power state (ENABLED or DISABLED),
+ * as stored in the device's internal admin state field. This reflects
+ * host-level policy, not the operational runtime state seen by the guest.
+ *
+ * Returns an integer from the DeviceAdminPowerState enum.
+ */ +int qdev_get_admin_power_state(DeviceState *dev); + /** * qdev_unrealize: Unrealize a device * @dev: device to unrealize --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281380; cv=none; d=zohomail.com; s=zohoarc; b=gKWvnO4OW7huvv22bupV9NXWBVU630QRQyDSF+AuUcrTVkURxZ9hbz/U8MTiQzWyeuVj6AAK5NzLdPDkyw7AVi3mdk6KgPattDcFHu0W197n3UEqRcegrF/WNU8XvxR5yMZAcME/ecFNOanWEHrQFS1F+BqHA0hu58jepOSDS1I= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281380; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=o21SQbmHJ5c7GKJfo1w8RsDWnLTVYYuhCdCc7+7iYhg=; b=A3eUgKuqZHhBJAPtN2OBN+NOc6S0NSF/RkukUNJ7+nAy7P1Rkzzi/MrdOMkfqb/Lr28IR5LKy/l0fuDSWHNV9zuzTrzBrGYutc4IP7jcxuW8alrW4NqyZcM4GXkk9RhxT441zm3tuD06+6hVntxOSzbmBRIXIifawhUlRXF2sgM= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281380290864.7925999611604; Tue, 30 Sep 2025 18:16:20 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lFa-0005MD-Fg; Tue, 30 Sep 2025 21:03:22 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFM-0005JH-DP for qemu-devel@nongnu.org; Tue, 30 Sep 2025 
21:03:08 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lEr-00083F-JN for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:07 -0400 Received: by mail-wr1-x42b.google.com with SMTP id ffacd0b85a97d-3ee15505cdeso338479f8f.0 for ; Tue, 30 Sep 2025 18:02:37 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280554; x=1759885354; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=o21SQbmHJ5c7GKJfo1w8RsDWnLTVYYuhCdCc7+7iYhg=; b=YIO9J+KrmL96QSgo3783pq/q893m/yJarvyCp7D1uMFCdy2Fd152BAvVu2TECNUtOr apScjBxT/JDtCQCaJUSWzdORZDvyQNQ0L5O5EiXD8s8J1guV1d0s/IS97xJ42SY5VoNJ K/8QT0vu75CxGkP77AWoahKwvGHBb7pFyYPAMpArYvs3Lr2w1VwxRtcxS0Z8XMCUilEF JE4zSt4bd+c1x/EE1IQ99Zh6veF6GuhtEQEdMTHnz4s9C2jy5vyjpVEzHIgRQW4+iLqq bo8URDm3zUBeLxBmlHOuN9GIbtPnL/7+kFWCurCGcfOvYqq2bBXLUfSL+zXZ6Q2K0o5H tkUg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280554; x=1759885354; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=o21SQbmHJ5c7GKJfo1w8RsDWnLTVYYuhCdCc7+7iYhg=; b=sfyoeOQg7kXCGIQ6h+wpGvaVMqDDPoiTooyfs5qUKbs6D2HMZCvbq9p2YYTkncv/Si UPYb62oItDx/JhKsEL1GGDYF5zfjH2r1M0brlqi87To9ZQ/lDdDMkoHW/yghY3/oTxRn QY5DKVcTIJJ65/zhR39woz5oBqi1FKD21YpVcrCiXOmpqGvjTNNquNaJIsp+aMYYmf8r PzpKP4d9+Hdy18llvhXxfcaVgc7fH7DTxJJCC0izX2PRhKPqTgftHjZTJyN799H5GZr3 /np0dY+RQnLj2f9dc1v96pGeaBCYv6XDhazgyhkiHJW/mAWUVgD70TvvewcEGUQYZddS 8E4Q== 
X-Gm-Message-State: AOJu0YwDnnJ7DCul4hl5oGhKVgsn5AfRVfidSN2566sHPwRYEAN6Ui0A c4pUeJMcN8uwVemIjCh1gU5zxEKuHioAI50inxrMmDobSQ///vlWP0JbDTsSuhqjIxdb1BdYISk DxdqCDdN5xg== X-Gm-Gg: ASbGnctL97/nEQ2EZ6PJwWpIVK1TYQzSVUCR+zEfojVQH3nw7DzkjI3Nzzu32o99kvo EWhH+8m75XQOiD1KIaIQPsKdN/PstkOlzXmUJ3Lg5DE0/TsbgP5tuXUm3iH00se7vZlTaxAXFyW p+pgGIKobMs1qB8yIM8F8YIa3YuBhYSQscEq3Db7zfl2OSC7HVeC4yphY2LZICCWbhI7Q/He3ZQ n1VEW/dJngy/OvZhC3O38AwygmKC27Yc5KYqXKRyqtTLshWBxjacTJl3dkZKaof/9NZnKgm/qPi 1REAVPFOIw8ITis2sAY3oMDXHAOzAU3xfzFA5Yci17J7sVq0NkRrHcCax6qA3C8PycQFoBlM5M4 xs7UYW5jhggk0zWjqYCGyRgF3ft1X8w9Gzvbwl2VGV7cE8h6GOC0s1S+4af1zpznvgSNb6bpH5U mPIiY2+ODJRh/FsFVKugZ6u7z4yZIoGjVAA9eMcS1a414= X-Google-Smtp-Source: AGHT+IFKH6S+0qB6r4/RQVzP5vbIJk7BN/G8gLBzoKRJmf/V1TTep5dRu4XD49WOztrcdBIL24geCg== X-Received: by 2002:a5d:5d87:0:b0:3fd:271d:e2a5 with SMTP id ffacd0b85a97d-4240f823772mr5992988f8f.11.1759280553959; Tue, 30 Sep 2025 18:02:33 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, 
zhao1.liu@intel.com Subject: [PATCH RFC V6 09/24] hw/intc/arm_gicv3_common: Migrate & check 'GICv3CPUState' accessibility mismatch Date: Wed, 1 Oct 2025 01:01:12 +0000 Message-Id: <20251001010127.3092631-10-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42b; envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42b.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281382084116600 Content-Type: text/plain; charset="utf-8" From: Salil Mehta At the source, administratively disabled vCPUs may lack a CPU VMSD: either = they were never realized (never enabled once), or they were realized and later disabled, causing the VMSD to be unregistered. Such vCPUs are not migrated = as CPU devices. However, the GICv3CPUState for all vCPUs is still migrated to = the destination VM and must be checked for mismatches in their CPU interface accessibility. 
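As a rough illustration of the mismatch check described above, here is a minimal sketch in plain C. It deliberately avoids QEMU types: the helper name `first_admin_state_mismatch` and the two bool arrays are assumptions made for this example only, standing in for the migrated per-vCPU `gicc_accessible` bits and the destination's administrative state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of QEMU: compare the per-vCPU accessibility
 * bits received from the source with the destination's admin-enabled state.
 * Returns the index of the first mismatching vCPU, or -1 if all of them match.
 */
static int first_admin_state_mismatch(const bool *src_accessible,
                                      const bool *dst_enabled,
                                      size_t num_cpus)
{
    assert(src_accessible && dst_enabled);

    for (size_t i = 0; i < num_cpus; i++) {
        if (src_accessible[i] != dst_enabled[i]) {
            return (int)i; /* a real loader would fail the migration here */
        }
    }

    return -1;
}
```

In this sketch a non-negative return value corresponds to the case where the incoming migration has to be rejected, because source and destination disagree on which vCPUs are administratively enabled.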
To preserve correctness, migrate the per-vCPU `gicc_accessible` bit as part= of the GICv3 device state, and fail migration on load if a mismatch is detecte= d. Administrators must ensure that the number of possible vCPUs and the number= of administratively disabled vCPUs remain consistent across hosts. Changes: - Add `VMSTATE_BOOL(gicc_accessible)` to the per-vCPU GICv3 state. - Add a `post_load` hook that checks for mismatches in disabled vCPUs by verif= ying GIC CPU interface accessibility. Signed-off-by: Salil Mehta --- hw/core/qdev.c | 17 +++++++++++++++++ hw/intc/arm_gicv3_common.c | 37 +++++++++++++++++++++++++++++++++++++ include/hw/qdev-core.h | 15 +++++++++++++++ 3 files changed, 69 insertions(+) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 8e9a4da6b5..23b84a7756 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -326,6 +326,23 @@ bool qdev_disable(DeviceState *dev, BusState *bus, Err= or **errp) errp); } =20 +bool qdev_enable(DeviceState *dev, BusState *bus, Error **errp) +{ + g_assert(dev); + + if (bus) { + error_setg(errp, "Device %s does not support 'enable' operation", + object_get_typename(OBJECT(dev))); + return false; + } + + /* devices like CPUs don't have a bus */ + g_assert(!DEVICE_GET_CLASS(dev)->bus_type); + + return object_property_set_str(OBJECT(dev), "admin_power_state", "enab= led", + errp); +} + int qdev_get_admin_power_state(DeviceState *dev) { DeviceClass *dc; diff --git a/hw/intc/arm_gicv3_common.c b/hw/intc/arm_gicv3_common.c index f4428ad165..9139352330 100644 --- a/hw/intc/arm_gicv3_common.c +++ b/hw/intc/arm_gicv3_common.c @@ -84,6 +84,15 @@ static int gicv3_post_load(void *opaque, int version_id) { GICv3State *s =3D (GICv3State *)opaque; ARMGICv3CommonClass *c =3D ARM_GICV3_COMMON_GET_CLASS(s); + MachineState *ms =3D MACHINE(qdev_get_machine()); + + /* ensure the source and destination VM 'maxcpus' counts match */ + if (s->num_cpu !=3D ms->smp.max_cpus) { + error_report("GICv3: source num_cpu(%u) !=3D dest maxcpus(%u). 
" + "Launch dest with -smp maxcpus=3D%u", + s->num_cpu, ms->smp.max_cpus, s->num_cpu); + return -1; + } =20 gicv3_gicd_no_migration_shift_bug_post_load(s); =20 @@ -127,6 +136,32 @@ static int vmstate_gicv3_cpu_pre_load(void *opaque) return 0; } =20 +static int vmstate_gicv3_cpu_post_load(void *opaque, int version_id) +{ + bool src_enabled, dst_enabled; + GICv3CPUState *gcs =3D opaque; + CPUState *cs =3D gcs->cpu; + + if (!cs) { + return 0; + } + + /* we derive the source vCPU admin state via GIC CPU Interface */ + src_enabled =3D gicv3_gicc_accessible(OBJECT(gcs->gic), cs->cpu_index); + dst_enabled =3D qdev_check_enabled(DEVICE(cs)); + + if (dst_enabled !=3D src_enabled) { + error_report("GICv3: CPU %d admin-state mismatch: dst=3D%s, src=3D= %s;" + " Aborting!", cs->cpu_index, + dst_enabled ? "enabled" : "disabled", + src_enabled ? "enabled" : "disabled"); + + return -1; + } + + return 0; +} + static bool icc_sre_el1_reg_needed(void *opaque) { GICv3CPUState *cs =3D opaque; @@ -187,6 +222,7 @@ static const VMStateDescription vmstate_gicv3_cpu =3D { .version_id =3D 1, .minimum_version_id =3D 1, .pre_load =3D vmstate_gicv3_cpu_pre_load, + .post_load =3D vmstate_gicv3_cpu_post_load, .fields =3D (const VMStateField[]) { VMSTATE_UINT32(level, GICv3CPUState), VMSTATE_UINT32(gicr_ctlr, GICv3CPUState), @@ -208,6 +244,7 @@ static const VMStateDescription vmstate_gicv3_cpu =3D { VMSTATE_UINT64_2DARRAY(icc_apr, GICv3CPUState, 3, 4), VMSTATE_UINT64_ARRAY(icc_igrpen, GICv3CPUState, 3), VMSTATE_UINT64(icc_ctlr_el3, GICv3CPUState), + VMSTATE_BOOL(gicc_accessible, GICv3CPUState), VMSTATE_END_OF_LIST() }, .subsections =3D (const VMStateDescription * const []) { diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h index b1d3fa4a25..855ff865ba 100644 --- a/include/hw/qdev-core.h +++ b/include/hw/qdev-core.h @@ -589,6 +589,21 @@ bool qdev_realize_and_unref(DeviceState *dev, BusState= *bus, Error **errp); */ bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp); =20 
+/** + * qdev_enable - Power on and administratively enable a device + * @dev: The device to be powered on and administratively enabled + * @bus: The bus on which the device is connected (may be NULL for CPUs) + * @errp: Pointer to a location where an error can be reported + * + * This function performs both administrative and operational power-on of + * the specified device. It transitions the device into ENABLED state and + * restores runtime availability. If applicable, the device is also re-add= ed + * to the migration stream. + * + * Returns true if the operation succeeds; false otherwise, with @errp set. + */ +bool qdev_enable(DeviceState *dev, BusState *bus, Error **errp); + /** * qdev_check_enabled - Check if a device is administratively enabled * @dev: The device to check --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281353; cv=none; d=zohomail.com; s=zohoarc; b=WHFCKX1k56zxWLXObKwIxGlBkG9rm5psvOL5B3sP6M2qBNI9J9SFJXnMt8K2AIublw6C0J0ES2tDfwe76Dtf+dffdNc8nu+D5Q2yJ1LBSSCm0H46La6w5eHgQD9A4ev0434ltoi7tim1kb6s8Sv6bB1uxFxZJpMy5b910OawbeI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281353; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=a+LKVy+uEs1gyTpMUsvOiaFbfXh40S7aK551rrhdKmI=; b=FV0MljAXdvF7bolz4SlKHaVJP0D5j8GN4oVgF377szbVLLojR6UrQQdUEnNhc/GLMSheca8aMYR0ADG2WYNf1Rvl5/WhaJNxXABsc0eNQJogQPB3Z5VVsTKjT9Ou9R/YIznlVfXC+nsjeUe53qxf+eX9p/O0M7sMBankuZSu1FI= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of 
gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281353072252.55873100699307; Tue, 30 Sep 2025 18:15:53 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lFR-0005Kv-Ge; Tue, 30 Sep 2025 21:03:13 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFM-0005JI-D9 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:08 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lEv-00083i-CR for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:07 -0400 Received: by mail-wr1-x42e.google.com with SMTP id ffacd0b85a97d-3f44000626bso3995492f8f.3 for ; Tue, 30 Sep 2025 18:02:38 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280556; x=1759885356; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=a+LKVy+uEs1gyTpMUsvOiaFbfXh40S7aK551rrhdKmI=; b=hbhbNi2PG29T2NPRYvevo83wjbDO1Hz663MrMOE9MausRRdXvMNWrW+PySIhEeMybs EpeFLu9g0GpZPQtyn0WcM0WDkwu+3II8ecFyq9rKnJqxPdDacVFAruiqLqFDIaEfhA9N ITy1udYdaP0k6BaRf8puJHuWGUwryvcRUOtjskcR3RbaFJ0T49/qr04geZ4OGd+1/Bj3 /uRjU//bkpwUoBZAP+hQvwWnVsCmSIhQXIuoGi/3m5RDm1u/7wpbeaKPo/+BoKsUMTtb dZ1TqyWA6hg5xIAiZdOxVj1oM8mYEAr1rEZPo3yUPD5k7glkB4kLxxZ5zNW/iABu1BcU SGmQ== 
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280556; x=1759885356; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=a+LKVy+uEs1gyTpMUsvOiaFbfXh40S7aK551rrhdKmI=; b=NtekZKZFa3cNMo0kDic8jmZbvVd3LapW/rLqzoVVaQfjnnlsnFZv/jrz+rKmHqixH9 XwyB+RSekYxRmYeKLMTkg1WXGVQ+T28sGPTigsim6VyBIcIyGwTL8n08ztfNgNkcdZWV ereayy9yL7mqaKEcBTH9P4UzdEl/apwYnKGBwIgvxQJJq4BwtHr7zb1tFpiE/HWMPcwo MRj1XKR6rjGmc/BTSTDGRsO3n6+n0Fg034McwV0x/RcbDSeml8gJSmSfwsVLnOufWOcb i6SbB4z3W0HGSOSFxFTe0WtjsV08tjZkfjEuluKx3eHREqwyi3+2xpAGOMJQVoMDh0m2 U3Aw== X-Gm-Message-State: AOJu0YzIBs2NpyEB7UkdlShT1/Gty2A5rifDW9llNM19MCajSWwK6/hG hsugMpDn5P9u3VJIa7q68FHttaehZCioQvS/cedm/tsl9fwNkkCUiqjtJDuPhR4SGWYjRNSv/g/ e4iniT4RslA== X-Gm-Gg: ASbGncsLvOBLor/6ZfcfPRfF0fOUfqoNx5xTUyqhcyGPmYjyiB7FMJtkqwYu/NACRiv WzmXm0ey1lCDeGF+nMOmm0BaeJGqpdshGDS+M6FBSs/VS5kTYiB3zmc6Z62neJP3KiEBz/13SvK Ehiob034fKjRDu6i0o0trb0UeaqEMFyKGgB4FpT5LpFTCn7uqxV5QSDq3aiKarqPv4KrudyOYqE EmbNeAZ8DjmEXzwDeB/vdGrU4lFkwTyT0CbivAS5bzIbGUjI6dVFresuiARHeXhmg0GZLXO8thp lxskY1zTyGm9QPK0JOQS1FIMe1WtcBtkl5fXT1EitkK2dJW4sDSu4lpZtqToTDv0lvuPxBCUUuU dD+Yorkv1VpMg10rSe/LN1KNiDVbf12l3xAwfPCwSO+WetD4g6rdKxqPxA8fWVboPbNJ/Pqol7g BszgMB6USTtwb4PqJz0MrkLZ6hRPetBA5KgUKiSzeKhc0= X-Google-Smtp-Source: AGHT+IGIOt/MFONJUAOMgOm5H0GxVTM3YTOfkE8m8l8GQdsR9nUtBneNaBYLZrxSvSmST42T3XP8Aw== X-Received: by 2002:a05:6000:430a:b0:3eb:f3de:1a87 with SMTP id ffacd0b85a97d-42557820cbdmr1157560f8f.56.1759280555752; Tue, 30 Sep 2025 18:02:35 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, 
eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 10/24] arm/virt: Init PMU at host for all present vCPUs Date: Wed, 1 Oct 2025 01:01:13 +0000 Message-Id: <20251001010127.3092631-11-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: 
qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281353763116600 From: Salil Mehta ARM architecture requires that all CPUs which form part of the VM must expose identical feature sets and consistent system components at creation time. This includes the Performance Monitoring Unit (PMU). If only the boot CPUs had their PMU state initialized, the remaining CPUs defined by `smp.disabled_cpus` would not match this architectural requirement, leading to inconsistencies and guest misbehavior. To comply with this constraint, PMU initialization must cover the entire set of present vCPUs: present =3D smp.cpus + smp.disabled_cpus CPUs outside this set (`smp.max_cpus - present`) are not considered part of the machine at creation and are therefore not initialized. Co-developed-by: Keqian Zhu Signed-off-by: Keqian Zhu Signed-off-by: Salil Mehta --- hw/arm/virt.c | 13 +++++++--- include/hw/arm/virt.h | 1 + include/hw/core/cpu.h | 57 +++++++++++++++++++++++++++++++++++++++++++ 3 files changed, 67 insertions(+), 4 deletions(-) diff --git a/hw/arm/virt.c b/hw/arm/virt.c index ee09aa19bd..3980f553db 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -2087,12 +2087,13 @@ static void finalize_gic_version(VirtMachineState *= vms) static void virt_post_cpus_gic_realized(VirtMachineState *vms, MemoryRegion *sysmem) { + CPUArchIdList *possible_cpus =3D vms->parent.possible_cpus; int max_cpus =3D MACHINE(vms)->smp.max_cpus; - bool aarch64, pmu, steal_time; + bool aarch64, steal_time; CPUState *cpu; =20 aarch64 =3D object_property_get_bool(OBJECT(first_cpu), "aarch64", NUL= L); - pmu =3D object_property_get_bool(OBJECT(first_cpu), "pmu", NULL); + vms->pmu =3D object_property_get_bool(OBJECT(first_cpu), "pmu", NULL); steal_time =3D object_property_get_bool(OBJECT(first_cpu), "kvm-steal-time", NULL); =20 @@ -2123,8 +2124,12 @@ static void 
virt_post_cpus_gic_realized(VirtMachineS= tate *vms, exit(1); } =20 - CPU_FOREACH(cpu) { - if (pmu) { + CPU_FOREACH_POSSIBLE(cpu, possible_cpus) { + if (!cpu) { + continue; + } + + if (vms->pmu) { assert(arm_feature(&ARM_CPU(cpu)->env, ARM_FEATURE_PMU)); if (kvm_irqchip_in_kernel()) { kvm_arm_pmu_set_irq(ARM_CPU(cpu), VIRTUAL_PMU_IRQ); diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h index ace4154cc6..02cc311452 100644 --- a/include/hw/arm/virt.h +++ b/include/hw/arm/virt.h @@ -154,6 +154,7 @@ struct VirtMachineState { bool mte; bool dtb_randomness; bool second_ns_uart_present; + bool pmu; OnOffAuto acpi; VirtGICType gic_version; VirtIOMMUType iommu; diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index 5eaf41a566..2ee202a8a5 100644 --- a/include/hw/core/cpu.h +++ b/include/hw/core/cpu.h @@ -602,6 +602,63 @@ extern CPUTailQ cpus_queue; #define CPU_FOREACH_SAFE(cpu, next_cpu) \ QTAILQ_FOREACH_SAFE_RCU(cpu, &cpus_queue, node, next_cpu) =20 + +/** + * CPU_FOREACH_POSSIBLE(cpu_, archid_list_) + * + * Iterate over all entries in a CPUArchIdList, assigning each entry=E2=80= =99s + * CPUState* to @cpu_. This hides the loop index and reads like a normal + * C for-loop. + * + * A CPUArchIdList represents the set of *possible* CPUs for a machine. + * Each entry contains: + * - @cpu: CPUState pointer, or NULL if not realized yet + * - @arch_id: architecture-specific identifier (e.g. MPIDR) + * - @vcpus_count: number of vCPUs represented (usually 1) + * + * The list models *possible* CPUs: it includes (a) currently plugged vCPUs + * made available through hotplug, (b) present (and perhaps visible to OSP= M) + * but kept ACPI-disabled vCPUs, and (c) reserved slots for CPUs that may = be + * created in the future. This supports co-existence of hotpluggable and + * admin-disabled vCPUs if architectures permit. 
+ * + * Example: + * + * CPUArchIdList *alist =3D machine_possible_cpus(ms); + * CPUState *cpu; + * + * CPU_FOREACH_POSSIBLE(cpu, alist) { + * if (!cpu) { + * continue; // reserved slot for hotplug case + * } + * + * < Do Something > + * } + * + * Expanded equivalent: + * + * for (int __cpu_idx =3D 0; alist && __cpu_idx < alist->len; __cpu_idx+= +) { + * if ((cpu =3D alist->cpus[__cpu_idx].cpu, 1)) { + * if (!cpu) { + * continue; + * } + * + * < Do Something > + * } + * } + * + * Notes: + * - Callers must check @cpu for NULL when filtering unplugged CPUs. + * - Mirrors the style of CPU_FOREACH(), but iterates all *possible* CPUs + * (plugged, ACPI-disabled, and reserved slots) rather than only prese= nt + * and enabled vCPUs. + */ +#define CPU_FOREACH_POSSIBLE(cpu_, archid_list_) \ + for (int __cpu_idx =3D 0; \ + (archid_list_) && __cpu_idx < (archid_list_)->len; \ + __cpu_idx++) \ + if (((cpu_) =3D (archid_list_)->cpus[__cpu_idx].cpu, 1)) + extern __thread CPUState *current_cpu; =20 /** --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759280803; cv=none; d=zohomail.com; s=zohoarc; b=TqnXs8s9o8XSd1GNdmE8/Fhg7GtBbfjCVdcFWO+5puXLU/K1WInRnRkkNA9kvHE62PMrysa6J1bCBMHzk4f7QIvsVxV6uqKCIpU69X+/VyilxnLswK8PTy3BZCuhEyKaTO+egCij+peBp4rBNb6tnhR9QOMqrO7cGppiSljTSUQ= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759280803; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=AIp1fTDpYUODV+o3H7vcfIGfQr/6T5pDk40yx4w9U+c=; 
b=QKHEufyXqDTaYbGnGLR3FZf0ekuRnD8WxO3pLl+zBJKNR32iUJ8ttYtw7oVJCXnEAP+P0/ZLWh5fW52po3wAa8YZtmPH7Ej3PIxmh7lhzWMxso7eikBg4zyhJDLHchDuq53dl+joJuqibvmDn24kjNNW5dp6ByyyvuWp7xMAtQY= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759280802976696.5694429519989; Tue, 30 Sep 2025 18:06:42 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lFi-0005Pk-Lj; Tue, 30 Sep 2025 21:03:33 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFY-0005Me-B1 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:20 -0400 Received: from mail-wr1-x42c.google.com ([2a00:1450:4864:20::42c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lEy-00084U-UV for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:20 -0400 Received: by mail-wr1-x42c.google.com with SMTP id ffacd0b85a97d-3ece0e4c5faso5385960f8f.1 for ; Tue, 30 Sep 2025 18:02:42 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280558; x=1759885358; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=AIp1fTDpYUODV+o3H7vcfIGfQr/6T5pDk40yx4w9U+c=; b=QZno1tdt4viyyeratTyfGlK0oYCqKDssUHCNjoA88f80xrm2HA4Y/zXw1mjqFaCIch 
2zg/wBOAU+29e5A8FR/0MpyvkSSHkovTV2POe8xQfWYkLbxG2oY8rdb9D016+aTy9h9m ywdhyeeg5BS4utJ3ZmFS69xaiEsOpTCWhXflCqo9f61ViXRg2kvrT+S0DwymhgpcBm/p 2161sUt8KwwQDCQserGUqN3fWjDtYtMyFBm0WxTc9JU3BpX9W1XQshP7Aq6UeKPlj/Cy AXkNozTA6+5Vh2m4t22SDR7Z/EZsnPkzmQntjFmK/te6SuVuJ7lvIXH8oME15nLkTc1B GOdQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280558; x=1759885358; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=AIp1fTDpYUODV+o3H7vcfIGfQr/6T5pDk40yx4w9U+c=; b=BHqnv3S6MCyRoEvC5xBETr/PfXf0uiN1Uc4utFVosnCxZuknWdELWo4dk25Dco6Z16 PDV65CQZ2oddC99Az4YpIpcmlRNjzsROj1/RaBgtdaoFZIPepu7WZkxV6CQJD3mspACg 7hPW762rcf+15IrO4ze+pvJjR8EXBjmOAdwVvlPPaOlHXM0wkCzsvmSg07E/tNnHeral SIt3hgoZ3s5eXk37FDR0S2mGUBkpsAMCo2lKASHHEn94lqgYL2TgVnCONYLWeAI5gmxS ngAMCTHVEbzQvnup2xEIxfKGE7zlCqXQ60WpoWsN6HgI2GwBxs+oWK9iMG2YXnIgJjkt SDmg== X-Gm-Message-State: AOJu0YwmnACFDTnaX8oGJASGiIpsfg9DWoF1kqxkWPwK7cLao+FurEj4 84cIOPxRTkEvQSPjwBQ3BBhEK7hg8cCP5Qa2tKB6kdYM/9QqyJOMzeclOv7a6piIwTacf4zua1j QNFHbsRVRhA== X-Gm-Gg: ASbGncvNBiqZAEaUWsWR3qwN3thtRVt5B3sdSetjWgWDnAdD17GbyQs/dokalhoy8gf bQyb8kHVQydePF29M+xaifwlYVjaYNjEA7PR196nDb4YUU18m8xlS3U4gKIutnt3ugAweR65Nhs bSgWL/3tyJ6Dmpl4ckoPCBGVKszMqgIRxu9WstgVYU638xx0NiDvJqRUA+EUFLAoUfnmS0L+QJQ Bw2cE1a+0rop2ijgcz95oY7kodVNTJSmiOUpXJrPmf/U8Jn893EQQAdqQiUX9gh9k5JoWZEGyBr oiG+zNQgS8r9RE2BszOVBvXCbDWc8/w2srx/8oICYDgbNG9J/8JV/KrfJkKu0tximsMOJGwNIag SjgvPq6JZaCFDJPHv9P57vt/WZeAwNqiQlgRkzorU9xP2GG6gdFr8as4KBq8dMXp2ZYIrUOu0My VO9++w7PMCN7+YN12JomCU63lVhcv7RJvyb1giEVRix84= X-Google-Smtp-Source: AGHT+IHFSLI7/UdL6eC6mrr3Se5gLK2/8fPG8GzP/SKGovwybfxzCRv6ENVVmMVAr4Fr3OU7PXKjdQ== X-Received: by 2002:a05:6000:22c3:b0:3ee:114f:f88f with SMTP id ffacd0b85a97d-4255782044bmr1069193f8f.59.1759280558171; Tue, 30 Sep 2025 18:02:38 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: 
salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 11/24] hw/arm/acpi: MADT change to size the guest with possible vCPUs Date: Wed, 1 Oct 2025 01:01:14 +0000 Message-Id: <20251001010127.3092631-12-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42c; envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=unavailable 
Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

When QEMU builds the MADT, it must also describe possible vCPUs that are
exposed as ACPI-disabled (i.e., `_STA.Enabled=0`). This information lets the
guest kernel pre-size its resources at boot time, which facilitates later
hot-plugging of the currently disabled vCPUs.

Additionally, this change handles the updates to the ACPI MADT GIC CPU
interface flags introduced in the UEFI ACPI 6.5 specification [1]. These
updates enable deferred virtual CPU onlining in the guest kernel.

Reference:
[1] 5.2.12.14. GIC CPU Interface (GICC) Structure
    (Table 5.37 GICC CPU Interface Flags)
    Link: https://uefi.org/specs/ACPI/6.5/05_ACPI_Software_Programming_Model.html#gic-cpu-interface-gicc-structure

Co-developed-by: Keqian Zhu
Signed-off-by: Keqian Zhu
Signed-off-by: Salil Mehta
---
 hw/arm/virt-acpi-build.c | 40 ++++++++++++++++++++++++++++++++++------
 hw/core/machine.c        | 14 ++++++++++++++
 include/hw/boards.h      | 20 ++++++++++++++++++++
 3 files changed, 68 insertions(+), 6 deletions(-)

diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c
index b01fc4f8ef..7c24dd6369 100644
--- a/hw/arm/virt-acpi-build.c
+++ b/hw/arm/virt-acpi-build.c
@@ -760,6 +760,32 @@ static void build_append_gicr(GArray *table_data, uint64_t base, uint32_t size)
     build_append_int_noprefix(table_data, size, 4); /* Discovery Range Length */
 }
 
+static uint32_t virt_acpi_get_gicc_flags(CPUState *cpu)
+{
+    MachineClass *mc = MACHINE_GET_CLASS(qdev_get_machine());
+    const uint32_t GICC_FLAG_ENABLED = BIT(0);
+    const uint32_t GICC_FLAG_ONLINE_CAPABLE = BIT(3);
+
+    /* ARM architecture does not support vCPU hotplug yet */
+    if (!cpu) {
+        return 0;
+    }
+
+    /*
+     * If the machine does not support online-capable CPUs, report the GICC as
+     * 'enabled' only.
+     */
+    if (!mc->has_online_capable_cpus) {
+        return GICC_FLAG_ENABLED;
+    }
+
+    /*
+     * ACPI 6.5, 5.2.12.14 (GICC): mark the boot CPU 'enabled' and all others
+     * 'online-capable'.
+     */
+    return (cpu == first_cpu) ? GICC_FLAG_ENABLED : GICC_FLAG_ONLINE_CAPABLE;
+}
+
 static void build_madt(GArray *table_data, BIOSLinker *linker,
                        VirtMachineState *vms)
 {
@@ -785,12 +811,14 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
     build_append_int_noprefix(table_data, vms->gic_version, 1);
     build_append_int_noprefix(table_data, 0, 3); /* Reserved */
 
-    for (i = 0; i < MACHINE(vms)->smp.cpus; i++) {
-        ARMCPU *armcpu = ARM_CPU(qemu_get_cpu(i));
+    for (i = 0; i < MACHINE(vms)->smp.max_cpus; i++) {
+        CPUState *cpu = machine_get_possible_cpu(i);
         uint64_t physical_base_address = 0, gich = 0, gicv = 0;
         uint32_t vgic_interrupt = vms->virt ? ARCH_GIC_MAINT_IRQ : 0;
-        uint32_t pmu_interrupt = arm_feature(&armcpu->env, ARM_FEATURE_PMU) ?
-                                             VIRTUAL_PMU_IRQ : 0;
+        uint32_t pmu_interrupt = vms->pmu ? VIRTUAL_PMU_IRQ : 0;
+        CPUArchId *archid = machine_get_possible_cpu_arch_id(i);
+        uint32_t flags = virt_acpi_get_gicc_flags(cpu);
+        uint64_t mpidr = archid->arch_id;
 
         if (vms->gic_version == VIRT_GIC_VERSION_2) {
             physical_base_address = memmap[VIRT_GIC_CPU].base;
@@ -805,7 +833,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, i, 4); /* GIC ID */
         build_append_int_noprefix(table_data, i, 4); /* ACPI Processor UID */
         /* Flags */
-        build_append_int_noprefix(table_data, 1, 4); /* Enabled */
+        build_append_int_noprefix(table_data, flags, 4);
         /* Parking Protocol Version */
         build_append_int_noprefix(table_data, 0, 4);
         /* Performance Interrupt GSIV */
@@ -819,7 +847,7 @@ build_madt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms)
         build_append_int_noprefix(table_data, vgic_interrupt, 4);
         build_append_int_noprefix(table_data, 0, 8); /* GICR Base Address*/
         /* MPIDR */
-        build_append_int_noprefix(table_data, arm_cpu_mp_affinity(armcpu), 8);
+        build_append_int_noprefix(table_data, mpidr, 8);
         /* Processor Power Efficiency Class */
         build_append_int_noprefix(table_data, 0, 1);
         /* Reserved */
diff --git a/hw/core/machine.c b/hw/core/machine.c
index 69d5632464..65388d859a 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -1383,6 +1383,20 @@ CPUState *machine_get_possible_cpu(int64_t cpu_index)
     return NULL;
 }
 
+CPUArchId *machine_get_possible_cpu_arch_id(int64_t cpu_index)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    CPUArchIdList *possible_cpus = ms->possible_cpus;
+
+    for (int i = 0; i < possible_cpus->len; i++) {
+        if (possible_cpus->cpus[i].cpu &&
+            possible_cpus->cpus[i].cpu->cpu_index == cpu_index) {
+            return &possible_cpus->cpus[i];
+        }
+    }
+    return NULL;
+}
+
 static char *cpu_slot_to_string(const CPUArchId *cpu)
 {
     GString *s = g_string_new(NULL);
diff --git a/include/hw/boards.h b/include/hw/boards.h
index 3ff77a8b3a..fe51ca58bf 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -461,6 +461,26 @@ struct MachineState {
     bool acpi_spcr_enabled;
 };
 
+/*
+ * machine_get_possible_cpu_arch_id:
+ * @cpu_index: logical cpu_index to search for
+ *
+ * Return a pointer to the CPUArchId entry matching the given @cpu_index
+ * in the current machine's MachineState. The possible_cpus array holds
+ * the full set of CPUs that the machine could support, including those
+ * that may be created as disabled or taken offline.
+ *
+ * The slot index in ms->possible_cpus[] is always sequential, but the
+ * logical cpu_index values are assigned by QEMU and may or may not be
+ * sequential depending on the implementation of a particular machine.
+ * Direct indexing by cpu_index is therefore unsafe in general. This
+ * helper performs a linear search of the possible_cpus array to find
+ * the matching entry.
+ *
+ * Returns: pointer to the matching CPUArchId, or NULL if not found.
+ */
+CPUArchId *machine_get_possible_cpu_arch_id(int64_t cpu_index);
+
 /*
  * The macros which follow are intended to facilitate the
  * definition of versioned machine types, using a somewhat
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
 rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com,
 wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com,
 jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn,
 shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 12/24] hw/core: Introduce generic device power-state handler interface
Date: Wed, 1 Oct 2025 01:01:15 +0000
Message-Id: <20251001010127.3092631-13-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

From: Salil Mehta

Device power-state transitions such as powering on, powering off, or entering
standby may be triggered by administrative state changes (enable to disable or
disable to enable), guest OSPM requests in response to workload or policy, or
platform-specific control flows (e.g. ACPI, firmware, or machine hooks).
These varied triggers require coordinated handling to ensure consistent
behavior across devices. Without a common interface, each device type must
implement ad-hoc logic, making it harder to manage and extend power-state
control in QEMU.

This patch introduces a generic PowerStateHandler QOM interface that allows
devices to expose callbacks for operational power-state transitions. The model
distinguishes between administrative state (enable/disable, host-driven) and
operational state (on/off/standby, runtime). An administrative transition may
trigger an operational change, with QEMU signaling the guest through platform
interfaces and OSPM coordinating the transition. Some platforms may enforce
transitions directly, without OSPM involvement.

Key features:
 - New TYPE_POWERSTATE_HANDLER QOM interface.
 - PowerStateHandlerClass with optional callbacks for operational transitions:
     device_request_poweroff() - notify the guest or internal logic to begin
                                 a graceful shutdown sequence.
     device_post_poweroff()    - complete disable after OSPM has powered off
                                 operationally; device is inactive and freed.
     device_pre_poweron()      - prepare for activation on administrative
                                 enable; reinit state and notify guest/OSPM.
     device_request_standby()  - request a standby state without full
                                 poweroff, retaining sufficient state for
                                 resume.
 - Helper functions in hw/core/powerstate.c to:
     - Retrieve a device's PowerStateHandler from the machine.
     - Invoke the registered callbacks if present.
 - Intended for use by any device type (CPU or non-CPU) that supports
   controlled power transitions, regardless of whether it supports
   architectural hotplug.
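
For illustration only, the "invoke the registered callbacks if present"
behaviour listed above can be sketched in standalone C. Every name below
(SketchDevice, SketchHandlerOps, sketch_*) is a hypothetical stand-in and not a
QEMU definition from this series; the sketch assumes only that each hook is
optional and that a missing handler or a missing hook results in a no-op.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the QEMU types in this patch. */
typedef struct SketchDevice {
    int poweroff_requests;  /* how many times the poweroff hook ran */
} SketchDevice;

typedef void (*sketch_transition_fn)(SketchDevice *dev, int *err);

typedef struct SketchHandlerOps {
    sketch_transition_fn request_poweroff;  /* optional, may be NULL */
    sketch_transition_fn pre_poweron;       /* optional, may be NULL */
} SketchHandlerOps;

/* Run a hook only if it is registered. Returns 1 if it ran, 0 if skipped. */
static int sketch_dispatch(sketch_transition_fn fn, SketchDevice *dev, int *err)
{
    assert(dev != NULL);
    if (fn == NULL) {
        return 0;
    }
    fn(dev, err);
    return 1;
}

/* Wrappers tolerate both a missing handler and a missing hook. */
static int sketch_request_poweroff(const SketchHandlerOps *ops,
                                   SketchDevice *dev, int *err)
{
    return sketch_dispatch(ops ? ops->request_poweroff : NULL, dev, err);
}

static int sketch_pre_poweron(const SketchHandlerOps *ops,
                              SketchDevice *dev, int *err)
{
    return sketch_dispatch(ops ? ops->pre_poweron : NULL, dev, err);
}

/* Example hook: only records that a poweroff was requested. */
static void sketch_count_poweroff(SketchDevice *dev, int *err)
{
    (void)err;
    dev->poweroff_requests++;
}
```

A caller that registers only sketch_count_poweroff sees the poweroff wrapper
run the hook, while the poweron wrapper returns 0 without touching the device.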

High-level flow:

  QMP/HMP
    |  user issues: {"execute":"device-set", ...} (in later patches)
    v
  QDEV (Prop: admin-power-state)
  (Administrative State Handling)
    |  invokes PowerStateHandler callbacks via interface
    v
  Machine (PowerStateHandler)
  (Operational State Handling)
    |  coordinates platform policy and may call firmware handler
    v
  ACPI GED (PowerStateHandler, firmware)
    |  signals events/notifications to the guest
    v
  ACPI SCI (System Control Interrupt) to guest OS
    |  SCI is delivered on GSI N (GED Interrupt() _CRS = N, with FADT
    |  designating N as SCI)
    |  OSPM receives SCI/GSI IRQ
    v
  OSPM (in-guest housekeeping)
  evaluates ACPI methods from firmware tables (e.g. _EJ0, _STA, _OST)
  and completes the transition

Integration model:
Both Machine and ACPI GED implement the PowerStateHandler interface. QDEV
calls the handler hooks; Machine applies platform policy and can invoke GED
to coordinate with OSPM. This keeps QDEV generic while arch-specific logic
resides in Machine and firmware.

This interface will be used in later patches to coordinate CPU administrative
enable/disable operations on architectures that lack native CPU hotplug, and
can also be adopted by other device classes requiring similar control.
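
The split between administrative and operational state described above can be
pictured as a small standalone state model. This is a sketch under
assumptions, not the implementation in this series: the SketchDev type, its
fields and the sketch_* functions are hypothetical. It assumes that an
administrative disable only requests a poweroff, that the disable completes
once the guest has powered the device off, and that after an administrative
enable the guest decides whether the device becomes operationally active.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical state model; names are not taken from QEMU. */
typedef enum { SKETCH_ADMIN_ENABLED, SKETCH_ADMIN_DISABLED } SketchAdminState;
typedef enum { SKETCH_OPER_ON, SKETCH_OPER_OFF } SketchOperState;

typedef struct SketchDev {
    SketchAdminState admin;
    SketchOperState oper;
    bool poweroff_pending;  /* guest was asked to power off, no ack yet */
} SketchDev;

/* Host-side disable: only asks the guest (models request_poweroff). */
static void sketch_admin_disable(SketchDev *d)
{
    assert(d != NULL);
    if (d->admin == SKETCH_ADMIN_ENABLED) {
        d->poweroff_pending = true;
    }
}

/* Guest reports the device is off (models post_poweroff). */
static void sketch_guest_ack_poweroff(SketchDev *d)
{
    assert(d != NULL);
    if (!d->poweroff_pending) {
        return;  /* nothing was requested, ignore the ack */
    }
    d->oper = SKETCH_OPER_OFF;
    d->admin = SKETCH_ADMIN_DISABLED;
    d->poweroff_pending = false;
}

/* Host-side enable: device becomes available again (models pre_poweron). */
static void sketch_admin_enable(SketchDev *d)
{
    assert(d != NULL);
    d->admin = SKETCH_ADMIN_ENABLED;
}

/* Guest chooses to bring an administratively enabled device online. */
static void sketch_guest_poweron(SketchDev *d)
{
    assert(d != NULL);
    if (d->admin == SKETCH_ADMIN_ENABLED && !d->poweroff_pending) {
        d->oper = SKETCH_OPER_ON;
    }
}
```

In this model an administrative enable alone leaves the device operationally
off until the guest acts, which mirrors the note that OSPM may or may not make
the device operationally active.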

Signed-off-by: Salil Mehta
---
 hw/core/meson.build      |   1 +
 hw/core/powerstate.c     | 100 +++++++++++++++++++++++
 include/hw/boards.h      |   2 +
 include/hw/powerstate.h  | 171 +++++++++++++++++++++++++++++++++++++++
 stubs/meson.build        |   1 +
 stubs/powerstate-stubs.c |  47 +++++++++++
 6 files changed, 322 insertions(+)
 create mode 100644 hw/core/powerstate.c
 create mode 100644 include/hw/powerstate.h
 create mode 100644 stubs/powerstate-stubs.c

diff --git a/hw/core/meson.build b/hw/core/meson.build
index b5a545a0ed..d9d716ce55 100644
--- a/hw/core/meson.build
+++ b/hw/core/meson.build
@@ -40,6 +40,7 @@ system_ss.add(files(
   'numa.c',
   'qdev-fw.c',
   'qdev-hotplug.c',
+  'powerstate.c',
   'qdev-properties-system.c',
   'reset.c',
   'sysbus.c',
diff --git a/hw/core/powerstate.c b/hw/core/powerstate.c
new file mode 100644
index 0000000000..0e1d12b3f6
--- /dev/null
+++ b/hw/core/powerstate.c
@@ -0,0 +1,100 @@
+/*
+ * Device Power State transition handler interface
+ *
+ * An administrative request to 'enable' or 'disable' a device results in a
+ * change of its operational status. The transition may be performed either
+ * synchronously or asynchronously, with OSPM assistance where required.
+ *
+ * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd.
+ *
+ * Author: Salil Mehta
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include "qemu/osdep.h"
+#include "hw/powerstate.h"
+#include "qemu/module.h"
+#include "qapi/error.h"
+#include "hw/boards.h"
+
+PowerStateHandler *powerstate_handler(DeviceState *dev)
+{
+    MachineState *machine = MACHINE(qdev_get_machine());
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+
+    if (mc->get_powerstate_handler) {
+        return (PowerStateHandler *)mc->get_powerstate_handler(machine, dev);
+    }
+
+    return NULL;
+}
+
+DeviceOperPowerState qdev_get_oper_power_state(DeviceState *dev)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    PowerStateHandlerClass *pshc = h ? POWERSTATE_HANDLER_GET_CLASS(h) : NULL;
+
+    if (pshc && pshc->get_oper_state) {
+        return pshc->get_oper_state(dev, &error_warn);
+    }
+
+    return DEVICE_OPER_POWER_STATE_UNKNOWN;
+}
+
+void device_request_poweroff(DeviceState *dev, Error **errp)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    PowerStateHandlerClass *pshc = h ? POWERSTATE_HANDLER_GET_CLASS(h) : NULL;
+
+    if (pshc && pshc->request_poweroff) {
+        pshc->request_poweroff(h, dev, errp);
+    }
+}
+
+void device_post_poweroff(DeviceState *dev, Error **errp)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    PowerStateHandlerClass *pshc = h ? POWERSTATE_HANDLER_GET_CLASS(h) : NULL;
+
+    if (pshc && pshc->post_poweroff) {
+        pshc->post_poweroff(h, dev, errp);
+    }
+}
+
+void device_pre_poweron(DeviceState *dev, Error **errp)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    PowerStateHandlerClass *pshc = h ? POWERSTATE_HANDLER_GET_CLASS(h) : NULL;
+
+    if (pshc && pshc->pre_poweron) {
+        pshc->pre_poweron(h, dev, errp);
+    }
+}
+
+void device_request_standby(DeviceState *dev, Error **errp)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    PowerStateHandlerClass *pshc = h ? POWERSTATE_HANDLER_GET_CLASS(h) : NULL;
+
+    if (pshc && pshc->request_standby) {
+        pshc->request_standby(h, dev, errp);
+    }
+}
+
+static const TypeInfo powerstate_handler_info = {
+    .name = TYPE_POWERSTATE_HANDLER,
+    .parent = TYPE_INTERFACE,
+    .class_size = sizeof(PowerStateHandlerClass),
+};
+
+static void powerstate_handler_register_types(void)
+{
+    type_register_static(&powerstate_handler_info);
+}
+
+type_init(powerstate_handler_register_types)
diff --git a/include/hw/boards.h b/include/hw/boards.h
index fe51ca58bf..161505911f 100644
--- a/include/hw/boards.h
+++ b/include/hw/boards.h
@@ -332,6 +332,8 @@ struct MachineClass {
 
     HotplugHandler *(*get_hotplug_handler)(MachineState *machine,
                                            DeviceState *dev);
+    void *(*get_powerstate_handler)(MachineState *machine,
+                                    DeviceState *dev);
     bool (*hotplug_allowed)(MachineState *state, DeviceState *dev,
                             Error **errp);
     CpuInstanceProperties (*cpu_index_to_instance_props)(MachineState *machine,
diff --git a/include/hw/powerstate.h b/include/hw/powerstate.h
new file mode 100644
index 0000000000..c16da0f24d
--- /dev/null
+++ b/include/hw/powerstate.h
@@ -0,0 +1,171 @@
+/*
+ * Device Power State transition handler interface
+ *
+ * An administrative request to 'enable' or 'disable' a device results in a
+ * change of its operational status. The transition may be performed either
+ * synchronously or asynchronously, with OSPM assistance where required.
+ *
+ * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd.
+ *
+ * Author: Salil Mehta
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#ifndef POWERSTATE_H
+#define POWERSTATE_H
+
+#include "qom/object.h"
+
+#define TYPE_POWERSTATE_HANDLER "powerstate-handler"
+
+typedef struct PowerStateHandlerClass PowerStateHandlerClass;
+DECLARE_CLASS_CHECKERS(PowerStateHandlerClass, POWERSTATE_HANDLER,
+                       TYPE_POWERSTATE_HANDLER)
+#define POWERSTATE_HANDLER(obj) \
+    INTERFACE_CHECK(PowerStateHandler, (obj), TYPE_POWERSTATE_HANDLER)
+
+typedef struct PowerStateHandler PowerStateHandler;
+
+/**
+ * DeviceOperPowerState:
+ *
+ * Enumeration of operational power states for devices. These represent runtime
+ * states controlled through platform interfaces (e.g. ACPI, PSCI, or other
+ * OSPM mechanisms), and are distinct from administrative presence or enable/
+ * disable state.
+ *
+ * Transitions may be initiated by the guest OSPM in response to workload or
+ * policy, or triggered by administrative actions due to policy change. Please
+ * check PowerStateHandlerClass for more details on these.
+ *
+ * Platforms may optionally implement a callback to fetch the current state.
+ * That callback must map internal platform state to one of the values here.
+ *
+ * @DEVICE_OPER_POWER_STATE_UNKNOWN: State reporting unsupported, or state
+ *                                   could not be determined. If @errp is set,
+ *                                   this indicates an error. Platform firmware
+ *                                   may also enforce state changes directly;
+ *                                   the callback must return the resulting
+ *                                   state.
+ *
+ * @DEVICE_OPER_POWER_STATE_ON: Device is powered on and fully active.
+ *
+ * @DEVICE_OPER_POWER_STATE_OFF: Device is powered off and inactive. It
+ *                               should not consume resources and may
+ *                               require reinitialization on power on.
+ *
+ * @DEVICE_OPER_POWER_STATE_STANDBY: Device is in a low-power standby state.
+ *                                   It retains enough state to allow fast
+ *                                   resume without full reinitialization.
+ *
+ * See also: PowerStateHandlerClass, powerstate_get_fn
+ */
+typedef enum DeviceOperPowerState {
+    DEVICE_OPER_POWER_STATE_UNKNOWN = -1,
+    DEVICE_OPER_POWER_STATE_ON = 0,
+    DEVICE_OPER_POWER_STATE_OFF,
+    DEVICE_OPER_POWER_STATE_STANDBY,
+    DEVICE_OPER_POWER_STATE_MAX
+} DeviceOperPowerState;
+
+/**
+ * powerstate_fn:
+ * @handler: Power state handler for the device performing the transition.
+ * @dev: The device being transitioned as a result of an administrative
+ *       state change (e.g. enable-to-disable or disable-to-enable), which
+ *       in turn affects its operational state (on, off, standby).
+ * @errp: Pointer to return an error if the function fails.
+ *
+ * Generic function signature for device power state transitions. An
+ * administrative state change triggers the corresponding operational
+ * transition, which may be implemented synchronously or asynchronously.
+ */
+typedef void (*powerstate_fn)(PowerStateHandler *handler, DeviceState *dev,
+                              Error **errp);
+
+/**
+ * powerstate_get_fn:
+ * @dev: The device whose operational state is being queried.
+ * @errp: Pointer to an error object, set on failure.
+ *
+ * Callback type to query the current operational power state of a device.
+ * Platforms may optionally implement this to expose their internal power
+ * management status. When present, the callback must map the platform’s
+ * internal state into one of the DeviceOperPowerState values.
+ *
+ * Returns: A DeviceOperPowerState value on success. If the platform does not
+ * support state reporting, returns DEVICE_OPER_POWER_STATE_UNKNOWN without
+ * setting @errp. If the state could not be determined due to an error, sets
+ * @errp and also returns DEVICE_OPER_POWER_STATE_UNKNOWN. In this case, the
+ * return value must be ignored when @errp is set.
+ */
+typedef DeviceOperPowerState (*powerstate_get_fn)(DeviceState *dev,
+                                                  Error **errp);
+
+/**
+ * PowerStateHandlerClass:
+ *
+ * Interface for devices that support transitions of their operational power
+ * state (on, off, standby). These transitions may be driven by changes in the
+ * device’s administrative state (enable to/from disable), or initiated by the
+ * guest OSPM based on runtime policy.
+ *
+ * Administrative changes are host-driven (e.g. 'device_set') and can trigger
+ * corresponding operational transitions. QEMU may signal the guest via platform
+ * interfaces (such as ACPI) so that OSPM coordinates the change. Some platforms
+ * may also enforce transitions directly, without OSPM involvement.
+ *
+ * @parent: Opaque parent interface.
+ *
+ * @get_oper_state: Optional callback to query the current operational state.
+ *                  Implementations must map the internal state to the
+ *                  'DeviceOperPowerState' enum.
+ *
+ * @request_poweroff: Optional callback to notify the guest or internal logic
+ *                    that the device is about to be disabled. Used to initiate
+ *                    graceful shutdown or cleanup within OSPM.
+ *
+ * @post_poweroff: Callback invoked after OSPM has powered off the device
+ *                 operationally. Completes the administrative transition to
+ *                 'disabled', ensuring the device is fully inactive and not
+ *                 consuming resources.
+ *
+ * @pre_poweron: Callback to prepare a device for re-activation after an
+ *               administrative 'enable'. May reinitialize state and notify the
+ *               guest that the device is available. Guest or internal OSPM may
+ *               or may not make the device become operationally active.
+ *
+ * @request_standby: Optional callback to place the device into a standby state
+ *                   without full power-off. The device is expected to retain
+ *                   sufficient state for efficient resume, e.g. CPU_SUSPEND.
+ */
+struct PowerStateHandlerClass {
+    /* */
+    InterfaceClass parent;
+
+    /* */
+    powerstate_get_fn get_oper_state;
+
+    powerstate_fn request_poweroff;
+    powerstate_fn post_poweroff;
+    powerstate_fn pre_poweron;
+    powerstate_fn request_standby;
+};
+
+PowerStateHandler *powerstate_handler(DeviceState *dev);
+
+DeviceOperPowerState qdev_get_oper_power_state(DeviceState *dev);
+
+void device_request_poweroff(DeviceState *dev, Error **errp);
+
+void device_post_poweroff(DeviceState *dev, Error **errp);
+
+void device_pre_poweron(DeviceState *dev, Error **errp);
+
+void device_request_standby(DeviceState *dev, Error **errp);
+#endif /* POWERSTATE_H */
diff --git a/stubs/meson.build b/stubs/meson.build
index cef046e685..f38cdd1947 100644
--- a/stubs/meson.build
+++ b/stubs/meson.build
@@ -95,5 +95,6 @@ if have_system or have_user
 
   # Also included in have_system for tests/unit/test-qdev-global-props
   stub_ss.add(files('hotplug-stubs.c'))
+  stub_ss.add(files('powerstate-stubs.c'))
   stub_ss.add(files('sysbus.c'))
 endif
diff --git a/stubs/powerstate-stubs.c b/stubs/powerstate-stubs.c
new file mode 100644
index 0000000000..01c615cda2
--- /dev/null
+++ b/stubs/powerstate-stubs.c
@@ -0,0 +1,47 @@
+/*
+ * Device Power State handler interface Stubs.
+ *
+ * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd.
+ *
+ * Author: Salil Mehta
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+#include "qemu/osdep.h"
+#include "hw/powerstate.h"
+#include "hw/qdev-core.h"
+
+PowerStateHandler *powerstate_handler(DeviceState *dev)
+{
+    return NULL;
+}
+
+DeviceOperPowerState qdev_get_oper_power_state(DeviceState *dev)
+{
+    return DEVICE_OPER_POWER_STATE_UNKNOWN;
+}
+
+void device_request_poweroff(DeviceState *dev, Error **errp)
+{
+    g_assert_not_reached();
+}
+
+void device_post_poweroff(DeviceState *dev, Error **errp)
+{
+    g_assert_not_reached();
+}
+
+void device_pre_poweron(DeviceState *dev, Error **errp)
+{
+    g_assert_not_reached();
+}
+
+void device_request_standby(DeviceState *dev, Error **errp)
+{
+    g_assert_not_reached();
+}
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 13/24] qdev: make admin power state changes trigger platform transitions via ACPI Date: Wed, 1 Oct 2025 01:01:16 +0000 Message-Id: <20251001010127.3092631-14-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::32a; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x32a.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 
1759280809507116600 Content-Type: text/plain; charset="utf-8" From: Salil Mehta Changing a device's administrative power state must trigger a concrete operational transition at the platform layer via ACPI coordination with OSP= M. The platform is responsible for actually powering devices off or on and for notifying the guest when required. Some machines can coordinate transitions asynchronously with OSPM using ACPI methods and events (e.g. _EJx, device-check, _OST), while others cannot or may not be ready when policy flips. Without a defined linkage, admin policy can drift from runtime reality, leaving devices active while 'disabled', or disappearing without guest notification, and migration metadata out of sync. This change establishes that linkage: administrative DISABLED/ENABLED reque= sts first drive the platform's operational transition via ACPI (prefer OSPM coordination; otherwise fall back to a synchronous in-QEMU path) and only then update QOM state and migration registration. This provides uniform semantics and a reliable contract for management and tests. Signed-off-by: Salil Mehta --- hw/core/qdev.c | 68 ++++++++++++++++++++++++++++++++++++----- include/hw/powerstate.h | 6 ++++ include/hw/qdev-core.h | 17 +++++++++++ 3 files changed, 84 insertions(+), 7 deletions(-) diff --git a/hw/core/qdev.c b/hw/core/qdev.c index 23b84a7756..3aba99b912 100644 --- a/hw/core/qdev.c +++ b/hw/core/qdev.c @@ -326,6 +326,30 @@ bool qdev_disable(DeviceState *dev, BusState *bus, Err= or **errp) errp); } =20 +void qdev_sync_disable(DeviceState *dev, Error **errp) +{ + g_assert(dev); + g_assert(powerstate_handler(dev)); + + /* + * Administrative disable triggered either after OSPM completes _EJx + * (post Notify(..., 0x03)), or due to lack of async shutdown support. + * + * Device may still appear in ACPI namespace but remains disabled at + * the platform level. Guest cannot re-enable it until host allows. 
+     */
+
+    /* Perform operational shutdown */
+    device_post_poweroff(dev, errp);
+    if (*errp) {
+        return;
+    }
+
+    /* Mark the device administratively disabled */
+    qatomic_set(&dev->admin_power_state, DEVICE_ADMIN_POWER_STATE_DISABLED);
+    smp_wmb();
+}
+
 bool qdev_enable(DeviceState *dev, BusState *bus, Error **errp)
 {
     g_assert(dev);
@@ -705,6 +729,7 @@ device_set_admin_power_state(Object *obj, int new_state, Error **errp)
 {
     DeviceState *dev = DEVICE(obj);
     DeviceClass *dc = DEVICE_GET_CLASS(dev);
+    DeviceAdminPowerState old_state;
 
     if (!dc->admin_power_state_supported) {
         error_setg(errp, "Device '%s' admin power state change not supported",
@@ -712,25 +737,54 @@ device_set_admin_power_state(Object *obj, int new_state, Error **errp)
         return;
     }
 
+    g_assert(powerstate_handler(dev));
+    old_state = qatomic_read(&dev->admin_power_state);
+
     switch (new_state) {
     case DEVICE_ADMIN_POWER_STATE_DISABLED: {
+        if (old_state == DEVICE_ADMIN_POWER_STATE_DISABLED) {
+            break;
+        }
+
         /*
-         * TODO: Operational state transition triggered by administrative action
+         * Operational state transition triggered by administrative action
          * Powering off the realized device either synchronously or via OSPM.
          */
+        if (device_graceful_poweroff_supported(dev)) {
+            /* Graceful shutdown via guest coordination */
+            device_request_poweroff(dev, errp);
+            if (*errp) {
+                return;
+            }
 
-        qatomic_set(&dev->admin_power_state, DEVICE_ADMIN_POWER_STATE_DISABLED);
-        smp_wmb();
+            qatomic_set(&dev->admin_power_state,
+                        DEVICE_ADMIN_POWER_STATE_DISABLED);
+            smp_wmb();
+        } else {
+            /* Immediate shutdown within QEMU synchronously */
+            qdev_sync_disable(dev, errp);
+            if (*errp) {
+                return;
+            }
+        }
         break;
     }
     case DEVICE_ADMIN_POWER_STATE_ENABLED: {
-        /*
-         * TODO: Operational state transition triggered by administrative action
-         * Powering on the device and restoring migration registration.
-         */
+        if (old_state == DEVICE_ADMIN_POWER_STATE_ENABLED) {
+            break;
+        }
 
         qatomic_set(&dev->admin_power_state, DEVICE_ADMIN_POWER_STATE_ENABLED);
         smp_wmb();
+
+        /*
+         * Operational state transition triggered by administrative action
+         * Powering on the device and restoring migration registration.
+         */
+        device_pre_poweron(dev, errp);
+        if (*errp) {
+            return;
+        }
         break;
     }
     default:
diff --git a/include/hw/powerstate.h b/include/hw/powerstate.h
index c16da0f24d..b35650bac4 100644
--- a/include/hw/powerstate.h
+++ b/include/hw/powerstate.h
@@ -168,4 +168,10 @@ void device_post_poweroff(DeviceState *dev, Error **errp);
 void device_pre_poweron(DeviceState *dev, Error **errp);
 
 void device_request_standby(DeviceState *dev, Error **errp);
+
+static inline bool device_graceful_poweroff_supported(DeviceState *dev)
+{
+    PowerStateHandler *h = powerstate_handler(dev);
+    return h && POWERSTATE_HANDLER_GET_CLASS(h)->request_poweroff;
+}
 #endif /* POWERSTATE_H */
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 855ff865ba..3e08cfb59f 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -8,6 +8,7 @@
 #include "qemu/rcu_queue.h"
 #include "qom/object.h"
 #include "hw/hotplug.h"
+#include "hw/powerstate.h"
 #include "hw/resettable.h"
 
 /**
@@ -589,6 +590,22 @@ bool qdev_realize_and_unref(DeviceState *dev, BusState *bus, Error **errp);
  */
 bool qdev_disable(DeviceState *dev, BusState *bus, Error **errp);
 
+/**
+ * qdev_sync_disable - Force immediate power-off and administrative disable
+ * @dev: The device to be powered off and administratively disabled
+ * @errp: Pointer to a location where an error can be reported
+ *
+ * This function performs a synchronous power-off of the device and marks it
+ * as administratively DISABLED. It assumes that prior graceful handling (e.g.,
+ * ACPI _EJx) has already been completed, or that asynchronous mechanisms are
+ * unsupported.
+ *
+ * After execution, the device remains visible to the guest (e.g. via ACPI),
+ * but cannot be brought back online unless explicitly re-enabled via admin
+ * policy. This function also removes the device from the migration stream.
+ */
+void qdev_sync_disable(DeviceState *dev, Error **errp);
+
 /**
  * qdev_enable - Power on and administratively enable a device
  * @dev: The device to be powered on and administratively enabled
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org,
 richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com,
 andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org,
 eric.auger@redhat.com, will@kernel.org, ardb@kernel.org,
 oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com,
 rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com,
 linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn,
 lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 14/24] arm/acpi: Introduce dedicated CPU OSPM interface
 for ARM-like platforms
Date: Wed, 1 Oct 2025 01:01:17 +0000
Message-Id: <20251001010127.3092631-15-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

The existing ACPI CPU hotplug interface is built for x86 platforms where CPUs
can be inserted or removed and resources are allocated dynamically.
On ARM, CPUs are never hotpluggable: resources are allocated at boot and QOM
vCPU objects always exist. Instead, CPUs are administratively managed by
toggling ACPI _STA to enable or disable them, which gives a hotplug-like
effect but does not match the x86 model.

Reusing the x86 hotplug AML code would complicate maintenance since much of
its logic relies on toggling the _STA.Present bit to notify OSPM about CPU
insertion or removal. Such usage is not architecturally valid on ARM, where
CPUs cannot appear or disappear at runtime. Mixing both models in one
interface would increase complexity and make the AML harder to extend. A
separate path is therefore required. The new design is heavily inspired by
the CPU hotplug interface but avoids its unsuitable semantics.

This patch adds a dedicated CPU OSPM (Operating System-directed configuration
and Power Management) interface. It provides a memory-mapped control region
with selector, flags, command, and data fields, and AML methods for
device-check, eject request, and _OST reporting. OSPM is notified through GED
events and can coordinate CPU events directly with QEMU. Other ARM-like
architectures may also use this interface.
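
For orientation, the control region described above can be sketched in
standalone C. This is only an illustration and not part of the patch: every
SKETCH_* and Sketch* name is hypothetical, and only the offsets, the flag bits
and the wrap-around event search are taken from the layout comment and the
MMIO handlers in cpu_ospm_interface.c below.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names; offsets follow the 16-byte layout in this patch. */
enum {
    SKETCH_SELECTOR_OFF = 0x00, /* 4 bytes: CPU selector */
    SKETCH_FLAGS_OFF    = 0x04, /* 1 byte: status/event flags */
    SKETCH_CMD_OFF      = 0x05, /* 1 byte: command */
    SKETCH_DATA_OFF     = 0x08, /* 8 bytes: command data */
    SKETCH_REG_LEN      = 0x10,
};

enum {
    SKETCH_FLAG_ENABLED = 1u << 0, /* CPU is enabled */
    SKETCH_FLAG_DEVCHK  = 1u << 1, /* device-check event pending */
    SKETCH_FLAG_EJECTRQ = 1u << 2, /* eject-request event pending */
    SKETCH_FLAG_EJECT   = 1u << 3, /* ejection trigger */
};

/* Each field sits on its natural alignment boundary. */
_Static_assert(SKETCH_SELECTOR_OFF % sizeof(uint32_t) == 0, "selector");
_Static_assert(SKETCH_DATA_OFF % sizeof(uint64_t) == 0, "data");
_Static_assert(SKETCH_DATA_OFF + sizeof(uint64_t) == SKETCH_REG_LEN, "len");

/* Hypothetical per-CPU state, reduced to what a flags read reports. */
typedef struct SketchCpu {
    bool enabled;
    bool devchk_pending;
    bool ejrqst_pending;
} SketchCpu;

/* Compose the flags byte the way a read at offset 0x04 would. */
static inline uint8_t sketch_read_flags(const SketchCpu *cpu)
{
    uint8_t val = 0;

    val |= cpu->enabled ? SKETCH_FLAG_ENABLED : 0;
    val |= cpu->devchk_pending ? SKETCH_FLAG_DEVCHK : 0;
    val |= cpu->ejrqst_pending ? SKETCH_FLAG_EJECTRQ : 0;
    return val;
}

/*
 * Starting at 'start', find the next CPU with a pending event, wrapping
 * around once. Returns 'start' unchanged when nothing is pending.
 * Assumes count > 0 and start < count.
 */
static inline uint32_t sketch_next_cpu_with_event(const SketchCpu *cpus,
                                                  uint32_t count,
                                                  uint32_t start)
{
    uint32_t iter = start;

    do {
        if (cpus[iter].devchk_pending || cpus[iter].ejrqst_pending) {
            return iter;
        }
        iter = iter + 1 < count ? iter + 1 : 0;
    } while (iter != start);
    return start;
}
```

A guest-side scan would write the "get next CPU with event" command, read the
selected index back from the data field, then read the flags byte to learn
which event is pending for that CPU.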

Signed-off-by: Salil Mehta
---
 hw/acpi/Kconfig                        |   3 +
 hw/acpi/acpi-cpu-ospm-interface-stub.c |  41 ++
 hw/acpi/cpu_ospm_interface.c           | 747 +++++++++++++++++++++++++
 hw/acpi/meson.build                    |   2 +
 hw/acpi/trace-events                   |  17 +
 hw/arm/Kconfig                         |   1 +
 include/hw/acpi/cpu_ospm_interface.h   |  78 +++
 7 files changed, 889 insertions(+)
 create mode 100644 hw/acpi/acpi-cpu-ospm-interface-stub.c
 create mode 100644 hw/acpi/cpu_ospm_interface.c
 create mode 100644 include/hw/acpi/cpu_ospm_interface.h

diff --git a/hw/acpi/Kconfig b/hw/acpi/Kconfig
index 1d4e9f0845..aa52f0468f 100644
--- a/hw/acpi/Kconfig
+++ b/hw/acpi/Kconfig
@@ -21,6 +21,9 @@ config ACPI_ICH9
 config ACPI_CPU_HOTPLUG
     bool
 
+config ACPI_CPU_OSPM_INTERFACE
+    bool
+
 config ACPI_MEMORY_HOTPLUG
     bool
     select MEM_DEVICE
diff --git a/hw/acpi/acpi-cpu-ospm-interface-stub.c b/hw/acpi/acpi-cpu-ospm-interface-stub.c
new file mode 100644
index 0000000000..f6f333f641
--- /dev/null
+++ b/hw/acpi/acpi-cpu-ospm-interface-stub.c
@@ -0,0 +1,41 @@
+/*
+ * ACPI CPU OSPM Interface Handling.
+ *
+ * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd.
+ *
+ * Author: Salil Mehta
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "hw/acpi/cpu_ospm_interface.h"
+
+void acpi_cpu_device_check_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev,
+                              uint32_t event_st, Error **errp)
+{
+}
+
+void acpi_cpu_eject_request_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev,
+                               uint32_t event_st, Error **errp)
+{
+}
+
+void acpi_cpu_eject_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, Error **errp)
+{
+}
+
+void acpi_cpu_ospm_state_interface_init(MemoryRegion *as, Object *owner,
+                                        AcpiCpuOspmState *state,
+                                        hwaddr base_addr)
+{
+}
+
+void acpi_cpus_ospm_status(AcpiCpuOspmState *cpu_st, ACPIOSTInfoList ***list)
+{
+}
diff --git a/hw/acpi/cpu_ospm_interface.c b/hw/acpi/cpu_ospm_interface.c
new file mode 100644
index 0000000000..61aab8a793
--- /dev/null
+++ b/hw/acpi/cpu_ospm_interface.c
@@ -0,0 +1,747 @@
+/*
+ * ACPI CPU OSPM Interface Handling.
+ *
+ * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd.
+ *
+ * Author: Salil Mehta
+ *
+ * SPDX-License-Identifier: GPL-2.0-or-later
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ */
+
+#include "qemu/osdep.h"
+#include "migration/vmstate.h"
+#include "hw/core/cpu.h"
+#include "qapi/error.h"
+#include "trace.h"
+#include "qapi/qapi-events-acpi.h"
+#include "hw/acpi/cpu_ospm_interface.h"
+
+/* CPU identifier and resource device */
+#define CPU_NAME_FMT      "C%.03X" /* CPU name format (e.g., C001) */
+#define CPU_RES_DEVICE    "CPUR"   /* CPU resource device name */
+#define CPU_DEVICE        "CPUS"   /* CPUs device name */
+#define CPU_LOCK          "CPLK"   /* CPU lock object */
+/* ACPI method(_STA, _EJ0, etc.) handlers */
+#define CPU_STS_METHOD    "CSTA"   /* CPU status method (_STA.Enabled) */
+#define CPU_SCAN_METHOD   "CSCN"   /* CPU scan method for enumeration */
+#define CPU_NOTIFY_METHOD "CTFY"   /* Notify method for CPU events */
+#define CPU_EJECT_METHOD  "CEJ0"   /* CPU eject method (_EJ0) */
+#define CPU_OST_METHOD    "COST"   /* OSPM status reporting (_OST) */
+/* CPU MMIO region fields (in PRST region) */
+#define CPU_SELECTOR      "CSEL"   /* CPU selector index (WO) */
+#define CPU_ENABLED_F     "CPEN"   /* Flag: CPU enabled status(_STA) (RO) */
+#define CPU_DEVCHK_F      "CDCK"   /* Flag: Device-check event (RW) */
+#define CPU_EJECTRQ_F     "CEJR"   /* Flag: Eject-request event (RW)*/
+#define CPU_EJECT_F       "CEJ0"   /* Flag: Ejection trigger (WO) */
+#define CPU_COMMAND       "CCMD"   /* Command register (RW) */
+#define CPU_DATA          "CDAT"   /* Data register (RW) */
+
+ /*
+  * CPU OSPM Interface MMIO Layout (Total: 16 bytes)
+  *
+  * +--------+--------+--------+--------+--------+--------+--------+--------+
+  * |  0x00  |  0x01  |  0x02  |  0x03  |  0x04  |  0x05  |  0x06  |  0x07  |
+  * +--------+--------+--------+--------+--------+--------+--------+--------+
+  * |   Selector (DWord, write-only)    | Flags  |Command |    Reserved     |
+  * |                                   | (RO/RW)|  (WO)  |    (2B pad)     |
+  * |         4 bytes (32 bits)         |   1B   |   1B   |       2B        |
+  * +-----------------------------------------------------------------------+
+  * |  0x08  |  0x09  |  0x0A  |  0x0B  |  0x0C  |  0x0D  |  0x0E  |  0x0F  |
+  * +--------+--------+--------+--------+--------+--------+--------+--------+
+  * |                       Data (QWord, read/write)                        |
+  * |              Used by CPU scan and _OST methods (64 bits)              |
+  * +-----------------------------------------------------------------------+
+  *
+  * Field Overview:
+  *
+  * - Selector: 4 bytes @0x00 (DWord, WO)
+  *    - Selects target CPU index for the current operation.
+  * - Flags: 1 byte @0x04 (RO/RW)
+  *    - Bit 0: ENABLED – CPU is powered on (RO)
+  *    - Bit 1: DEVCHK – Device-check completed (RW)
+  *    - Bit 2: EJECTRQ – Guest requests CPU eject (RW)
+  *    - Bit 3: EJECT – Trigger CPU ejection (WO)
+  *    - Bits 4–7: Reserved (write 0)
+  * - Command: 1 byte @0x05 (WO)
+  *    - Specifies control operation (e.g., scan, _OST, eject).
+  * - Reserved: 2 bytes @0x06–0x07
+  *    - Alignment padding; must be zero on write.
+  * - Data: 8 bytes @0x08 (QWord, RW)
+  *    - Input/output for command-specific data.
+  *    - Used by CPU scan or _OST.
+  */
+
+/*
+ * Macros defining the CPU MMIO region layout. Change field sizes here to
+ * alter the overall MMIO region size.
+ */
+/* Sub-Field sizes (in bytes) */
+#define ACPI_CPU_MR_SELECTOR_SIZE  4 /* Write-only (DWord access) */
+#define ACPI_CPU_MR_FLAGS_SIZE     1 /* Read-write (Byte access) */
+#define ACPI_CPU_MR_RES_FLAGS_SIZE 0 /* Reserved padding */
+#define ACPI_CPU_MR_CMD_SIZE       1 /* Write-only (Byte access) */
+#define ACPI_CPU_MR_RES_CMD_SIZE   2 /* Reserved padding */
+#define ACPI_CPU_MR_CMD_DATA_SIZE  8 /* Read-write (QWord access) */
+
+#define ACPI_CPU_OSPM_IF_MAX_FIELD_SIZE \
+    MAX_CONST(ACPI_CPU_MR_CMD_DATA_SIZE, \
+    MAX_CONST(ACPI_CPU_MR_SELECTOR_SIZE, \
+    MAX_CONST(ACPI_CPU_MR_CMD_SIZE, ACPI_CPU_MR_FLAGS_SIZE)))
+
+/* Validate layout against exported total length */
+_Static_assert(ACPI_CPU_OSPM_IF_REG_LEN ==
+               (ACPI_CPU_MR_SELECTOR_SIZE +
+                ACPI_CPU_MR_FLAGS_SIZE +
+                ACPI_CPU_MR_RES_FLAGS_SIZE +
+                ACPI_CPU_MR_CMD_SIZE +
+                ACPI_CPU_MR_RES_CMD_SIZE +
+                ACPI_CPU_MR_CMD_DATA_SIZE),
+               "ACPI_CPU_OSPM_IF_REG_LEN mismatch with internal MMIO layout");
+
+/* Sub-Field sizes (in bits) */
+#define ACPI_CPU_MR_SELECTOR_SIZE_BITS \
+    (ACPI_CPU_MR_SELECTOR_SIZE * BITS_PER_BYTE)  /* Write-only (DWord Acc) */
+#define ACPI_CPU_MR_FLAGS_SIZE_BITS \
+    (ACPI_CPU_MR_FLAGS_SIZE * BITS_PER_BYTE)     /* Read-write (Byte Acc) */
+#define ACPI_CPU_MR_RES_FLAGS_SIZE_BITS \
+    (ACPI_CPU_MR_RES_FLAGS_SIZE * BITS_PER_BYTE) /* Reserved padding */
+#define ACPI_CPU_MR_CMD_SIZE_BITS \
+    (ACPI_CPU_MR_CMD_SIZE * BITS_PER_BYTE)       /* Write-only (Byte Acc) */
+#define ACPI_CPU_MR_RES_CMD_SIZE_BITS \
+    (ACPI_CPU_MR_RES_CMD_SIZE * BITS_PER_BYTE)   /* Reserved padding */
+#define ACPI_CPU_MR_CMD_DATA_SIZE_BITS \
+    (ACPI_CPU_MR_CMD_DATA_SIZE * BITS_PER_BYTE)  /* Read-write (QWord Acc) */
+
+/* Field offsets (in bytes) */
+#define ACPI_CPU_MR_SELECTOR_OFFSET_WO  0
+#define ACPI_CPU_MR_FLAGS_OFFSET_RW \
+    (ACPI_CPU_MR_SELECTOR_OFFSET_WO + \
+     ACPI_CPU_MR_SELECTOR_SIZE)
+#define ACPI_CPU_MR_CMD_OFFSET_WO \
+    (ACPI_CPU_MR_FLAGS_OFFSET_RW + \
+     ACPI_CPU_MR_FLAGS_SIZE + \
+     ACPI_CPU_MR_RES_FLAGS_SIZE)
+#define ACPI_CPU_MR_CMD_DATA_OFFSET_RW \
+    (ACPI_CPU_MR_CMD_OFFSET_WO + \
+     ACPI_CPU_MR_CMD_SIZE + \
+     ACPI_CPU_MR_RES_CMD_SIZE)
+
+/* ensure all offsets are at their natural size alignment boundaries */
+#define STATIC_ASSERT_FIELD_ALIGNMENT(offset, type, field_name)         \
+    _Static_assert((offset) % sizeof(type) == 0,                        \
+                   field_name " is not aligned to its natural boundary")
+
+STATIC_ASSERT_FIELD_ALIGNMENT(ACPI_CPU_MR_SELECTOR_OFFSET_WO,
+                              uint32_t, "Selector");
+STATIC_ASSERT_FIELD_ALIGNMENT(ACPI_CPU_MR_FLAGS_OFFSET_RW,
+                              uint8_t, "Flags");
+STATIC_ASSERT_FIELD_ALIGNMENT(ACPI_CPU_MR_CMD_OFFSET_WO,
+                              uint8_t, "Command");
+STATIC_ASSERT_FIELD_ALIGNMENT(ACPI_CPU_MR_CMD_DATA_OFFSET_RW,
+                              uint64_t, "Command Data");
+
+/* Flag bit positions (used within 'flags' subfield) */
+#define ACPI_CPU_FLAGS_USED_BITS 4
+#define ACPI_CPU_MR_FLAGS_BIT_ENABLED BIT(0)
+#define ACPI_CPU_MR_FLAGS_BIT_DEVCHK  BIT(1)
+#define ACPI_CPU_MR_FLAGS_BIT_EJECTRQ BIT(2)
+#define ACPI_CPU_MR_FLAGS_BIT_EJECT   BIT(ACPI_CPU_FLAGS_USED_BITS - 1)
+
+#define ACPI_CPU_MR_RES_FLAG_BITS (BITS_PER_BYTE - ACPI_CPU_FLAGS_USED_BITS)
+
+enum {
+    ACPI_GET_NEXT_CPU_WITH_EVENT_CMD = 0,
+    ACPI_OST_EVENT_CMD = 1,
+    ACPI_OST_STATUS_CMD = 2,
+    ACPI_CMD_MAX
+};
+
+#define AML_APPEND_MR_RESVD_FIELD(mr_field, size_bits)                  \
+    do {                                                                \
+        if ((size_bits) != 0) {                                         \
+            aml_append((mr_field), aml_reserved_field(size_bits));      \
+        }                                                               \
+    } while (0)
+
+#define AML_APPEND_MR_NAMED_FIELD(mr_field, name, size_bits)            \
+    do {                                                                \
+        if ((size_bits) != 0) {                                         \
+            aml_append((mr_field), aml_named_field((name), (size_bits))); \
+        }                                                               \
+    } while (0)
+
+#define AML_CPU_RES_DEV(base, field) \
+    aml_name("%s.%s.%s", (base), CPU_RES_DEVICE, (field))
+
+static ACPIOSTInfo *
+acpi_cpu_ospm_ost_status(int idx, AcpiCpuOspmStateStatus *cdev)
+{
+    ACPIOSTInfo *info = g_new0(ACPIOSTInfo, 1);
+
+    info->source = cdev->ost_event;
+    info->status = cdev->ost_status;
+    if (cdev->cpu) {
+        DeviceState *dev = DEVICE(cdev->cpu);
+        if (dev->id) {
+            info->device = g_strdup(dev->id);
+        }
+    }
+    return info;
+}
+
+void acpi_cpus_ospm_status(AcpiCpuOspmState *cpu_st, ACPIOSTInfoList ***list)
+{
+    ACPIOSTInfoList ***tail = list;
+    int i;
+
+    for (i = 0; i < cpu_st->dev_count; i++) {
+        QAPI_LIST_APPEND(*tail, acpi_cpu_ospm_ost_status(i, &cpu_st->devs[i]));
+    }
+}
+
+static uint64_t
+acpi_cpu_ospm_intf_mr_read(void *opaque, hwaddr addr, unsigned size)
+{
+    AcpiCpuOspmState *cpu_st = opaque;
+    AcpiCpuOspmStateStatus *cdev;
+    uint64_t val = 0;
+
+    if (cpu_st->selector >= cpu_st->dev_count) {
+        return val;
+    }
+    cdev = &cpu_st->devs[cpu_st->selector];
+    switch (addr) {
+    case ACPI_CPU_MR_FLAGS_OFFSET_RW:
+        val |= qdev_check_enabled(DEVICE(cdev->cpu)) ?
+                                  ACPI_CPU_MR_FLAGS_BIT_ENABLED : 0;
+        val |= cdev->devchk_pending ? ACPI_CPU_MR_FLAGS_BIT_DEVCHK : 0;
+        val |= cdev->ejrqst_pending ? ACPI_CPU_MR_FLAGS_BIT_EJECTRQ : 0;
+        trace_acpi_cpuos_if_read_flags(cpu_st->selector, val);
+        break;
+    case ACPI_CPU_MR_CMD_DATA_OFFSET_RW:
+        switch (cpu_st->command) {
+        case ACPI_GET_NEXT_CPU_WITH_EVENT_CMD:
+            val = cpu_st->selector;
+            break;
+        default:
+            trace_acpi_cpuos_if_read_invalid_cmd_data(cpu_st->selector,
+                                                      cpu_st->command);
+            break;
+        }
+        trace_acpi_cpuos_if_read_cmd_data(cpu_st->selector, val);
+        break;
+    default:
+        break;
+    }
+    return val;
+}
+
+static void
+acpi_cpu_ospm_intf_mr_write(void *opaque, hwaddr addr, uint64_t data,
+                            unsigned int size)
+{
+    AcpiCpuOspmState *cpu_st = opaque;
+    AcpiCpuOspmStateStatus *cdev;
+    ACPIOSTInfo *info;
+
+    assert(cpu_st->dev_count);
+    if (addr) {
+        if (cpu_st->selector >= cpu_st->dev_count) {
+            trace_acpi_cpuos_if_invalid_idx_selected(cpu_st->selector);
+            return;
+        }
+    }
+
+    switch (addr) {
+    case ACPI_CPU_MR_SELECTOR_OFFSET_WO: /* current CPU selector */
+        cpu_st->selector = data;
+        trace_acpi_cpuos_if_write_idx(cpu_st->selector);
+        break;
+    case ACPI_CPU_MR_FLAGS_OFFSET_RW: /* set is_* fields */
+        cdev = &cpu_st->devs[cpu_st->selector];
+        if (data & ACPI_CPU_MR_FLAGS_BIT_DEVCHK) {
+            /* clear device-check pending event */
+            cdev->devchk_pending = false;
+            trace_acpi_cpuos_if_clear_devchk_evt(cpu_st->selector);
+        } else if (data & ACPI_CPU_MR_FLAGS_BIT_EJECTRQ) {
+            /* clear eject-request pending event */
+            cdev->ejrqst_pending = false;
+            trace_acpi_cpuos_if_clear_ejrqst_evt(cpu_st->selector);
+        } else if (data & ACPI_CPU_MR_FLAGS_BIT_EJECT) {
+            DeviceState *dev = NULL;
+            if (!cdev->cpu || cdev->cpu == first_cpu) {
+                trace_acpi_cpuos_if_ejecting_invalid_cpu(cpu_st->selector);
+                break;
+            }
+            /*
+             * OSPM has returned with eject. Hence, it is now safe to put the
+             * cpu device on powered-off state.
+             */
+            trace_acpi_cpuos_if_ejecting_cpu(cpu_st->selector);
+            dev = DEVICE(cdev->cpu);
+            qdev_sync_disable(dev, &error_fatal);
+        }
+        break;
+    case ACPI_CPU_MR_CMD_OFFSET_WO:
+        trace_acpi_cpuos_if_write_cmd(cpu_st->selector, data);
+        if (data < ACPI_CMD_MAX) {
+            cpu_st->command = data;
+            if (cpu_st->command == ACPI_GET_NEXT_CPU_WITH_EVENT_CMD) {
+                uint32_t iter = cpu_st->selector;
+
+                do {
+                    cdev = &cpu_st->devs[iter];
+                    if (cdev->devchk_pending || cdev->ejrqst_pending) {
+                        cpu_st->selector = iter;
+                        trace_acpi_cpuos_if_cpu_has_events(cpu_st->selector,
+                            cdev->devchk_pending, cdev->ejrqst_pending);
+                        break;
+                    }
+                    iter = iter + 1 < cpu_st->dev_count ? iter + 1 : 0;
+                } while (iter != cpu_st->selector);
+            }
+        }
+        break;
+    case ACPI_CPU_MR_CMD_DATA_OFFSET_RW:
+        switch (cpu_st->command) {
+        case ACPI_OST_EVENT_CMD: {
+            cdev = &cpu_st->devs[cpu_st->selector];
+            cdev->ost_event = data;
+            trace_acpi_cpuos_if_write_ost_ev(cpu_st->selector, cdev->ost_event);
+            break;
+        }
+        case ACPI_OST_STATUS_CMD: {
+            cdev = &cpu_st->devs[cpu_st->selector];
+            cdev->ost_status = data;
+            info = acpi_cpu_ospm_ost_status(cpu_st->selector, cdev);
+            qapi_event_send_acpi_device_ost(info);
+            qapi_free_ACPIOSTInfo(info);
+            trace_acpi_cpuos_if_write_ost_status(cpu_st->selector,
+                                                 cdev->ost_status);
+            break;
+        }
+        default:
+            trace_acpi_cpuos_if_write_invalid_cmd(cpu_st->selector,
+                                                  cpu_st->command);
+            break;
+        }
+        break;
+    default:
+        trace_acpi_cpuos_if_write_invalid_offset(cpu_st->selector, addr);
+        break;
+    }
+}
+
+static const MemoryRegionOps cpu_common_mr_ops = {
+    .read = acpi_cpu_ospm_intf_mr_read,
+    .write = acpi_cpu_ospm_intf_mr_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 1,
+        .max_access_size = ACPI_CPU_OSPM_IF_MAX_FIELD_SIZE,
+    },
+    .impl = {
+        .min_access_size = 1,
+        .max_access_size = ACPI_CPU_OSPM_IF_MAX_FIELD_SIZE,
+        .unaligned = false,
+    },
+};
+
+void acpi_cpu_ospm_state_interface_init(MemoryRegion *as, Object *owner,
+                                        AcpiCpuOspmState *state,
+                                        hwaddr base_addr)
+{
+    MachineState *machine = MACHINE(qdev_get_machine());
+    MachineClass *mc = MACHINE_GET_CLASS(machine);
+    const CPUArchIdList *id_list;
+    int i;
+
+    assert(mc->possible_cpu_arch_ids);
+    id_list = mc->possible_cpu_arch_ids(machine);
+    state->dev_count = id_list->len;
+    state->devs = g_new0(typeof(*state->devs), state->dev_count);
+    for (i = 0; i < id_list->len; i++) {
+        state->devs[i].cpu = CPU(id_list->cpus[i].cpu);
+        state->devs[i].arch_id = id_list->cpus[i].arch_id;
+    }
+    memory_region_init_io(&state->ctrl_reg, owner, &cpu_common_mr_ops, state,
+                          "ACPI CPU OSPM State Interface Memory Region",
+                          ACPI_CPU_OSPM_IF_REG_LEN);
+    memory_region_add_subregion(as, base_addr, &state->ctrl_reg);
+}
+
+static AcpiCpuOspmStateStatus *
+acpi_get_cpu_status(AcpiCpuOspmState *cpu_st, DeviceState *dev)
+{
+    CPUClass *k = CPU_GET_CLASS(dev);
+    uint64_t cpu_arch_id = k->get_arch_id(CPU(dev));
+    int i;
+
+    for (i = 0; i < cpu_st->dev_count; i++) {
+        if (cpu_arch_id == cpu_st->devs[i].arch_id) {
+            return &cpu_st->devs[i];
+        }
+    }
+    return NULL;
+}
+
+void acpi_cpu_device_check_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev,
+                              uint32_t event_st, Error **errp)
+{
+    AcpiCpuOspmStateStatus *cdev;
+    cdev = acpi_get_cpu_status(cpu_st, dev);
+    if (!cdev) {
+        return;
+    }
+    assert(cdev->cpu);
+
+    /*
+     * Tell OSPM via GED IRQ(GSI) that a powered-off cpu is being powered-on.
+     * Also, mark 'device-check' event pending for this cpu. This will
This will + * eventually result in OSPM evaluating the ACPI _EVT method and a scan of + * the cpus. + */ + cdev->devchk_pending = true; + acpi_send_event(cpu_st->acpi_dev, event_st); +} + +void acpi_cpu_eject_request_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, + uint32_t event_st, Error **errp) +{ + AcpiCpuOspmStateStatus *cdev; + cdev = acpi_get_cpu_status(cpu_st, dev); + if (!cdev) { + return; + } + assert(cdev->cpu); + + /* + * Tell OSPM via GED IRQ(GSI) that a cpu wants to power-off or go on + * standby. Also, mark 'eject-request' event pending for this cpu + * (graceful shutdown). + */ + cdev->ejrqst_pending = true; + acpi_send_event(cpu_st->acpi_dev, event_st); +} + +void +acpi_cpu_eject_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, Error **errp) +{ + /* TODO: possible handling here */ +} + +static const VMStateDescription vmstate_cpu_ospm_state_sts = { + .name = "CPU OSPM state status", + .version_id = 1, + .minimum_version_id = 1, + .fields = (const VMStateField[]) { + VMSTATE_BOOL(devchk_pending, AcpiCpuOspmStateStatus), + VMSTATE_BOOL(ejrqst_pending, AcpiCpuOspmStateStatus), + VMSTATE_UINT32(ost_event, AcpiCpuOspmStateStatus), + VMSTATE_UINT32(ost_status, AcpiCpuOspmStateStatus), + VMSTATE_END_OF_LIST() + } +}; + +const VMStateDescription vmstate_cpu_ospm_state = { + .name = "CPU OSPM state", + .version_id = 1, + .minimum_version_id = 1, + .fields = (const VMStateField[]) { + VMSTATE_UINT32(selector, AcpiCpuOspmState), + VMSTATE_UINT8(command, AcpiCpuOspmState), + VMSTATE_STRUCT_VARRAY_POINTER_UINT32(devs, AcpiCpuOspmState, + dev_count, + vmstate_cpu_ospm_state_sts, + AcpiCpuOspmStateStatus), + VMSTATE_END_OF_LIST() + } +}; + +void acpi_build_cpus_aml(Aml *table, hwaddr base_addr, const char *root, + const char *event_handler_method) +{ + MachineState *machine = MACHINE(qdev_get_machine()); + MachineClass *mc = MACHINE_GET_CLASS(machine); + const CPUArchIdList *arch_ids = mc->possible_cpu_arch_ids(machine); + Aml *sb_scope =
aml_scope("_SB"); /* System Bus Scope */ + Aml *ifctx, *field, *method, *cpu_res_dev, *cpus_dev; + Aml *zero =3D aml_int(0); + Aml *one =3D aml_int(1); + + cpu_res_dev =3D aml_device("%s.%s", root, CPU_RES_DEVICE); + { + Aml *crs; + + aml_append(cpu_res_dev, + aml_name_decl("_HID", aml_eisaid("PNP0A06"))); + aml_append(cpu_res_dev, + aml_name_decl("_UID", aml_string("CPU OSPM Interface resources= "))); + aml_append(cpu_res_dev, aml_mutex(CPU_LOCK, 0)); + + crs =3D aml_resource_template(); + aml_append(crs, aml_memory32_fixed(base_addr, ACPI_CPU_OSPM_IF_REG= _LEN, + AML_READ_WRITE)); + + aml_append(cpu_res_dev, aml_name_decl("_CRS", crs)); + + /* declare CPU OSPM Interface MMIO region related access fields */ + aml_append(cpu_res_dev, + aml_operation_region("PRST", AML_SYSTEM_MEMORY, + aml_int(base_addr), + ACPI_CPU_OSPM_IF_REG_LEN)); + + /* + * define named fields within PRST region with 'Byte' access widths + * and reserve fields with other access width + */ + field =3D aml_field("PRST", AML_BYTE_ACC, AML_NOLOCK, AML_PRESERVE= ); + /* reserve CPU 'selector' field (size in bits) */ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_SELECTOR_SIZE_BITS); + /* Flag::Enabled Bit(RO) - Read '1' if enabled */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_ENABLED_F, 1); + /* Flag::Devchk Bit(RW) - Read '1', has a event. Write '1', to cle= ar */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_DEVCHK_F, 1); + /* Flag::Ejectrq Bit(RW) - Read 1, has event. 
Write 1 to clear */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_EJECTRQ_F, 1); + /* Flag::Eject Bit(WO) - OSPM evals _EJx, initiates CPU Eject in Q= emu*/ + AML_APPEND_MR_NAMED_FIELD(field, CPU_EJECT_F, 1); + /* Flag::Bit(ACPI_CPU_FLAGS_USED_BITS)-Bit(7) - Reserve left over = bits*/ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_RES_FLAG_BITS); + /* Reserved space: padding after flags */ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_RES_FLAGS_SIZE_BITS); + /* Command field written by OSPM */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_COMMAND, + ACPI_CPU_MR_CMD_SIZE_BITS); + /* Reserved space: padding after command field */ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_RES_CMD_SIZE_BITS); + /* Command data: 64-bit payload associated with command */ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_CMD_DATA_SIZE_BITS); + aml_append(cpu_res_dev, field); + + /* + * define named fields with 'Dword' access widths and reserve fiel= ds + * with other access width + */ + field =3D aml_field("PRST", AML_DWORD_ACC, AML_NOLOCK, AML_PRESERV= E); + /* CPU selector, write only */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_SELECTOR, + ACPI_CPU_MR_SELECTOR_SIZE_BITS); + aml_append(cpu_res_dev, field); + + /* + * define named fields with 'Qword' access widths and reserve fiel= ds + * with other access width + */ + field =3D aml_field("PRST", AML_QWORD_ACC, AML_NOLOCK, AML_PRESERV= E); + /* + * Reserve space: selector, flags, reserved flags, command, reserv= ed + * command for Qword alignment. 
+ */ + AML_APPEND_MR_RESVD_FIELD(field, ACPI_CPU_MR_SELECTOR_SIZE_BITS + + ACPI_CPU_MR_FLAGS_SIZE_BITS + + ACPI_CPU_MR_RES_FLAGS_SIZE_BIT= S + + ACPI_CPU_MR_CMD_SIZE_BITS + + ACPI_CPU_MR_RES_CMD_SIZE_BITS); + /* Command data accessible via Qword */ + AML_APPEND_MR_NAMED_FIELD(field, CPU_DATA, + ACPI_CPU_MR_CMD_DATA_SIZE_BITS); + aml_append(cpu_res_dev, field); + } + aml_append(sb_scope, cpu_res_dev); + + cpus_dev =3D aml_device("%s.%s", root, CPU_DEVICE); + { + Aml *ctrl_lock =3D AML_CPU_RES_DEV(root, CPU_LOCK); + Aml *cpu_selector =3D AML_CPU_RES_DEV(root, CPU_SELECTOR); + Aml *is_enabled =3D AML_CPU_RES_DEV(root, CPU_ENABLED_F); + Aml *dvchk_evt =3D AML_CPU_RES_DEV(root, CPU_DEVCHK_F); + Aml *ejrq_evt =3D AML_CPU_RES_DEV(root, CPU_EJECTRQ_F); + Aml *ej_evt =3D AML_CPU_RES_DEV(root, CPU_EJECT_F); + Aml *cpu_cmd =3D AML_CPU_RES_DEV(root, CPU_COMMAND); + Aml *cpu_data =3D AML_CPU_RES_DEV(root, CPU_DATA); + int i; + + aml_append(cpus_dev, aml_name_decl("_HID", aml_string("ACPI0010"))= ); + aml_append(cpus_dev, aml_name_decl("_CID", aml_eisaid("PNP0A05"))); + + method =3D aml_method(CPU_NOTIFY_METHOD, 2, AML_NOTSERIALIZED); + for (i =3D 0; i < arch_ids->len; i++) { + Aml *cpu =3D aml_name(CPU_NAME_FMT, i); + Aml *uid =3D aml_arg(0); + Aml *event =3D aml_arg(1); + + ifctx =3D aml_if(aml_equal(uid, aml_int(i))); + { + aml_append(ifctx, aml_notify(cpu, event)); + } + aml_append(method, ifctx); + } + aml_append(cpus_dev, method); + + method =3D aml_method(CPU_STS_METHOD, 1, AML_SERIALIZED); + { + Aml *idx =3D aml_arg(0); + Aml *sta =3D aml_local(0); + Aml *else_ctx; + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(idx, cpu_selector)); + aml_append(method, aml_store(zero, sta)); + ifctx =3D aml_if(aml_equal(is_enabled, one)); + { + /* cpu is present and enabled */ + aml_append(ifctx, aml_store(aml_int(0xF), sta)); + } + aml_append(method, ifctx); + else_ctx =3D aml_else(); + { + /* cpu is present but disabled */ + aml_append(else_ctx, 
aml_store(aml_int(0xD), sta)); + } + aml_append(method, else_ctx); + aml_append(method, aml_release(ctrl_lock)); + aml_append(method, aml_return(sta)); + } + aml_append(cpus_dev, method); + + method =3D aml_method(CPU_EJECT_METHOD, 1, AML_SERIALIZED); + { + Aml *idx =3D aml_arg(0); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(idx, cpu_selector)); + aml_append(method, aml_store(one, ej_evt)); + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + method =3D aml_method(CPU_SCAN_METHOD, 0, AML_SERIALIZED); + { + Aml *has_event =3D aml_local(0); /* Local0: Loop control flag = */ + Aml *uid =3D aml_local(1); /* Local1: Current CPU UID */ + /* Constants */ + Aml *dev_chk =3D aml_int(1); /* Notify: device check to enable= */ + Aml *eject_req =3D aml_int(3); /* Notify: eject for removal */ + Aml *next_cpu_cmd =3D aml_int(ACPI_GET_NEXT_CPU_WITH_EVENT_CMD= ); + + /* Acquire CPU lock */ + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + + /* Initialize loop */ + aml_append(method, aml_store(zero, uid)); + aml_append(method, aml_store(one, has_event)); + + Aml *while_ctx =3D aml_while(aml_land( + aml_equal(has_event, one), + aml_lless(uid, aml_int(arch_ids->len)) + )); + { + aml_append(while_ctx, aml_store(zero, has_event)); + /* + * Issue scan cmd: QEMU will return next CPU with event in + * cpu_data + */ + aml_append(while_ctx, aml_store(uid, cpu_selector)); + aml_append(while_ctx, aml_store(next_cpu_cmd, cpu_cmd)); + + /* If scan wrapped around to an earlier UID, exit loop */ + Aml *wrap_check =3D aml_if(aml_lless(cpu_data, uid)); + aml_append(wrap_check, aml_break()); + aml_append(while_ctx, wrap_check); + + /* Set UID to scanned result */ + aml_append(while_ctx, aml_store(cpu_data, uid)); + + /* send CPU device-check(resume) event to OSPM */ + Aml *if_devchk =3D aml_if(aml_equal(dvchk_evt, one)); + { + aml_append(if_devchk, + aml_call2(CPU_NOTIFY_METHOD, uid, dev_chk)); + /* clear local 
device-check event sent flag */ + aml_append(if_devchk, aml_store(one, dvchk_evt)); + aml_append(if_devchk, aml_store(one, has_event)); + } + aml_append(while_ctx, if_devchk); + + /* + * send CPU eject-request event to OSPM to gracefully hand= le + * OSPM related tasks running on this CPU + */ + Aml *else_ctx =3D aml_else(); + Aml *if_ejrq =3D aml_if(aml_equal(ejrq_evt, one)); + { + aml_append(if_ejrq, + aml_call2(CPU_NOTIFY_METHOD, uid, eject_req)); + /* clear local eject-request event sent flag */ + aml_append(if_ejrq, aml_store(one, ejrq_evt)); + aml_append(if_ejrq, aml_store(one, has_event)); + } + aml_append(else_ctx, if_ejrq); + aml_append(while_ctx, else_ctx); + + /* Increment UID */ + aml_append(while_ctx, aml_increment(uid)); + } + aml_append(method, while_ctx); + + /* Release cpu lock */ + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + method =3D aml_method(CPU_OST_METHOD, 4, AML_SERIALIZED); + { + Aml *uid =3D aml_arg(0); + Aml *ev_cmd =3D aml_int(ACPI_OST_EVENT_CMD); + Aml *st_cmd =3D aml_int(ACPI_OST_STATUS_CMD); + + aml_append(method, aml_acquire(ctrl_lock, 0xFFFF)); + aml_append(method, aml_store(uid, cpu_selector)); + aml_append(method, aml_store(ev_cmd, cpu_cmd)); + aml_append(method, aml_store(aml_arg(1), cpu_data)); + aml_append(method, aml_store(st_cmd, cpu_cmd)); + aml_append(method, aml_store(aml_arg(2), cpu_data)); + aml_append(method, aml_release(ctrl_lock)); + } + aml_append(cpus_dev, method); + + /* build Processor object for each processor */ + for (i =3D 0; i < arch_ids->len; i++) { + Aml *dev; + Aml *uid =3D aml_int(i); + + dev =3D aml_device(CPU_NAME_FMT, i); + aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0007"))); + aml_append(dev, aml_name_decl("_UID", uid)); + + method =3D aml_method("_STA", 0, AML_SERIALIZED); + aml_append(method, aml_return(aml_call1(CPU_STS_METHOD, uid))); + aml_append(dev, method); + + if (CPU(arch_ids->cpus[i].cpu) !=3D first_cpu) { + method =3D aml_method("_EJ0", 
1, AML_NOTSERIALIZED); + aml_append(method, aml_call1(CPU_EJECT_METHOD, uid)); + aml_append(dev, method); + } + + method =3D aml_method("_OST", 3, AML_SERIALIZED); + aml_append(method, + aml_call4(CPU_OST_METHOD, uid, aml_arg(0), + aml_arg(1), aml_arg(2)) + ); + aml_append(dev, method); + aml_append(cpus_dev, dev); + } + } + aml_append(sb_scope, cpus_dev); + aml_append(table, sb_scope); + + method =3D aml_method(event_handler_method, 0, AML_NOTSERIALIZED); + aml_append(method, aml_call0("\\_SB.CPUS." CPU_SCAN_METHOD)); + aml_append(table, method); +} diff --git a/hw/acpi/meson.build b/hw/acpi/meson.build index 73f02b9691..6d83396ab4 100644 --- a/hw/acpi/meson.build +++ b/hw/acpi/meson.build @@ -8,6 +8,8 @@ acpi_ss.add(files( )) acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_true: files('cpu.c', 'cpu_= hotplug.c')) acpi_ss.add(when: 'CONFIG_ACPI_CPU_HOTPLUG', if_false: files('acpi-cpu-hot= plug-stub.c')) +acpi_ss.add(when: 'CONFIG_ACPI_CPU_OSPM_INTERFACE', if_true: files('cpu_os= pm_interface.c')) +acpi_ss.add(when: 'CONFIG_ACPI_CPU_OSPM_INTERFACE', if_false: files('acpi-= cpu-ospm-interface-stub.c')) acpi_ss.add(when: 'CONFIG_ACPI_MEMORY_HOTPLUG', if_true: files('memory_hot= plug.c')) acpi_ss.add(when: 'CONFIG_ACPI_MEMORY_HOTPLUG', if_false: files('acpi-mem-= hotplug-stub.c')) acpi_ss.add(when: 'CONFIG_ACPI_NVDIMM', if_true: files('nvdimm.c')) diff --git a/hw/acpi/trace-events b/hw/acpi/trace-events index edc93e703c..c0ecbdd48f 100644 --- a/hw/acpi/trace-events +++ b/hw/acpi/trace-events @@ -40,6 +40,23 @@ cpuhp_acpi_fw_remove_cpu(uint32_t idx) "0x%"PRIx32 cpuhp_acpi_write_ost_ev(uint32_t slot, uint32_t ev) "idx[0x%"PRIx32"] OST = EVENT: 0x%"PRIx32 cpuhp_acpi_write_ost_status(uint32_t slot, uint32_t st) "idx[0x%"PRIx32"] = OST STATUS: 0x%"PRIx32 =20 +#cpu_ospm_interface.c +acpi_cpuos_if_invalid_idx_selected(uint32_t idx) "selector idx[0x%"PRIx32"= ]" +acpi_cpuos_if_read_flags(uint32_t idx, uint8_t flags) "cpu idx[0x%"PRIx32"= ] flags: 0x%"PRIx8 
+acpi_cpuos_if_write_idx(uint32_t idx) "set active cpu idx: 0x%"PRIx32 +acpi_cpuos_if_write_cmd(uint32_t idx, uint8_t cmd) "cpu idx[0x%"PRIx32"] c= md: 0x%"PRIx8 +acpi_cpuos_if_write_invalid_cmd(uint32_t idx, uint8_t cmd) "cpu idx[0x%"PR= Ix32"] invalid cmd: 0x%"PRIx8 +acpi_cpuos_if_write_invalid_offset(uint32_t idx, uint64_t addr) "cpu idx[0= x%"PRIx32"] invalid offset: 0x%"PRIx64 +acpi_cpuos_if_read_cmd_data(uint32_t idx, uint32_t data) "cpu idx[0x%"PRIx= 32"] data: 0x%"PRIx32 +acpi_cpuos_if_read_invalid_cmd_data(uint32_t idx, uint8_t cmd) "cpu idx[0x= %"PRIx32"] invalid cmd: 0x%"PRIx8 +acpi_cpuos_if_cpu_has_events(uint32_t idx, bool devchk, bool ejrqst) "cpu = idx[0x%"PRIx32"] device-check pending: %d, eject-request pending: %d" +acpi_cpuos_if_clear_devchk_evt(uint32_t idx) "cpu idx[0x%"PRIx32"]" +acpi_cpuos_if_clear_ejrqst_evt(uint32_t idx) "cpu idx[0x%"PRIx32"]" +acpi_cpuos_if_ejecting_invalid_cpu(uint32_t idx) "invalid cpu idx[0x%"PRIx= 32"]" +acpi_cpuos_if_ejecting_cpu(uint32_t idx) "cpu idx[0x%"PRIx32"]" +acpi_cpuos_if_write_ost_ev(uint32_t idx, uint32_t ev) "cpu idx[0x%"PRIx32"= ] OST Event: 0x%"PRIx32 +acpi_cpuos_if_write_ost_status(uint32_t idx, uint32_t st) "cpu idx[0x%"PRI= x32"] OST Status: 0x%"PRIx32 + # pcihp.c acpi_pci_eject_slot(unsigned bsel, unsigned slot) "bsel: %u slot: %u" acpi_pci_unplug(int bsel, int slot) "bsel: %d slot: %d" diff --git a/hw/arm/Kconfig b/hw/arm/Kconfig index 2aa4b5d778..c9991e00c7 100644 --- a/hw/arm/Kconfig +++ b/hw/arm/Kconfig @@ -39,6 +39,7 @@ config ARM_VIRT select VIRTIO_MEM_SUPPORTED select ACPI_CXL select ACPI_HMAT + select ACPI_CPU_OSPM_INTERFACE =20 config CUBIEBOARD bool diff --git a/include/hw/acpi/cpu_ospm_interface.h b/include/hw/acpi/cpu_osp= m_interface.h new file mode 100644 index 0000000000..5dda327a34 --- /dev/null +++ b/include/hw/acpi/cpu_ospm_interface.h @@ -0,0 +1,78 @@ +/* + * ACPI CPU OSPM Interface Handling. + * + * Copyright (c) 2025 Huawei Technologies R&D (UK) Ltd. 
+ * + * Author: Salil Mehta + * + * SPDX-License-Identifier: GPL-2.0-or-later + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation; either version 2 of the License, or + * (at your option) any later version. + */ +#ifndef CPU_OSPM_INTERFACE_H +#define CPU_OSPM_INTERFACE_H + +#include "qapi/qapi-types-acpi.h" +#include "hw/qdev-core.h" +#include "hw/acpi/acpi.h" +#include "hw/acpi/aml-build.h" +#include "hw/boards.h" + +/** + * Total size (in bytes) of the ACPI CPU OSPM Interface MMIO region. + * + * This region contains control and status fields such as CPU selector, + * flags, command register, and data register. It must exactly match the + * layout defined in the AML code and the memory region implementation. + * + * Any mismatch between this definition and the AML layout may result in + * runtime errors or build-time assertion failures (e.g., _Static_assert), + * breaking correct device emulation and guest OS coordination.
+ */ +#define ACPI_CPU_OSPM_IF_REG_LEN 16 + +typedef struct { + CPUState *cpu; + uint64_t arch_id; + bool devchk_pending; /* device-check pending */ + bool ejrqst_pending; /* eject-request pending */ + uint32_t ost_event; + uint32_t ost_status; +} AcpiCpuOspmStateStatus; + +typedef struct AcpiCpuOspmState { + DeviceState *acpi_dev; + MemoryRegion ctrl_reg; + uint32_t selector; + uint8_t command; + uint32_t dev_count; + AcpiCpuOspmStateStatus *devs; +} AcpiCpuOspmState; + +void acpi_cpu_device_check_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, + uint32_t event_st, Error **errp); + +void acpi_cpu_eject_request_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, + uint32_t event_st, Error **errp); + +void acpi_cpu_eject_cb(AcpiCpuOspmState *cpu_st, DeviceState *dev, + Error **errp); + +void acpi_cpu_ospm_state_interface_init(MemoryRegion *as, Object *owner, + AcpiCpuOspmState *state, + hwaddr base_addr); + +void acpi_build_cpus_aml(Aml *table, hwaddr base_addr, const char *root, + const char *event_handler_method); + +void acpi_cpus_ospm_status(AcpiCpuOspmState *cpu_st, + ACPIOSTInfoList ***list); + +extern const VMStateDescription vmstate_cpu_ospm_state; +#define VMSTATE_CPU_OSPM_STATE(cpuospm, state) \ + VMSTATE_STRUCT(cpuospm, state, 1, \ + vmstate_cpu_ospm_state, AcpiCpuOspmState) +#endif /* CPU_OSPM_INTERFACE_H */ --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281129; cv=none; d=zohomail.com; s=zohoarc; b=RlG50gk1RDroNUug7NoUE28LeojdbW3ZmvpCrOVm31EcvVwohDK/YyOs2EANucYJv4TeuwtP1RRPsj4Dvg52yObaAbon2+57kZLD+JwHTJVmlV06Gfj6nxfnbfQSW1Dv+oX9E2xOxmr3AC4OcKVXMXJZqOwWcso6tMVYyMKV8wk= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281129; 
h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=pxcfPHc8rOInaK/rBPQBZ9/nywW3P+jwOuWG6ZmnnIw=; b=fl1WecNx3xrm5ZuXIOrzqB/lYh65sbHj0Fk3aT96tNoUmzdHbQJMgGmOvAr81wP8s11ZE5Znv4AUk9rrFRCC2nehN4lRA6q5iaWlnwJnpkl2AIMN/ykswQ60Yxv2GeAi4p9okk3mwhPhiwVqR277l6UXZ61UYKQR+vBjDhmKRus= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281129443835.6516007353122; Tue, 30 Sep 2025 18:12:09 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGM-0005f2-Df; Tue, 30 Sep 2025 21:04:10 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFk-0005QN-Ik for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:32 -0400 Received: from mail-wm1-x32c.google.com ([2a00:1450:4864:20::32c]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lF8-00087z-FC for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:32 -0400 Received: by mail-wm1-x32c.google.com with SMTP id 5b1f17b1804b1-46e3a50bc0fso45689485e9.3 for ; Tue, 30 Sep 2025 18:02:50 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:46 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280568; x=1759885368; darn=nongnu.org; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=pxcfPHc8rOInaK/rBPQBZ9/nywW3P+jwOuWG6ZmnnIw=; b=LrUfQu/CRRqn+JPcbzo/gDpz7F3WiOkWEz0jnPSuKO06Pcl2yPND/qZPGmeqlJE6AB VQxbdXFbiAXsmAQM4KDfS2HY6qQxfu6NnpKoV/VXj6yIgYnNKsF0sD4Hj4M/H2aQYfBs nHJWuDet2quDVL9yQJEGTTDpHutflFmZkF542W1m01w1NIY9bFGBaID8u0TcBuZk2jwh 7vo95N7gXqhaGwOmh5JnJ0GePkwgl941RL47AfGxM+r8QcVHDzCKOufYOf7UeMvL1nWA xKufYxQ4FpmXx2KEExm/XQ0a2I7ENQBGOaA/cOm4ZbNf0Kerv3Qg99Eu3bdkNpnE3sVW Pn5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280568; x=1759885368; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=pxcfPHc8rOInaK/rBPQBZ9/nywW3P+jwOuWG6ZmnnIw=; b=WsvQwqrq/hutaH4YbkDl3r1sbpIOon4eG05OUIFjhWSPkjQXxGI4vIp9GyP6cfLR6E BJ9iYpkgzxhPZqzhbKFJ8GPCoP5Gb22JRBkllmiZbSnbt38T4D+T8GMzJZnp0n3caxkm kCvJ3nbesd3dHsq5NmgzZh2yUDGF/oZKrEsVbP/mn1wCzkhP7GDJmPwczwg/S8Y+jjco Ic3Vdy08otEitvABlvaIL4rbZ6Z6/NOa2yVtkcoM9iGnOklKDtNCYKYBNhv9nvBfb/kj QPkP+VyO7bS6wj727sCL8JVD0fiBzbP3QQLPydvrvROUMf0CHbCLSC3lBQukje9L2Ktc hSfw== X-Gm-Message-State: AOJu0Yzh2yJsN/2X/yUOZx45grAnFFPzhO3mQCQEWN5ctTUdoRpSlHgx /skeLEBgQpBW17zIrZmpa514nSxibiTgFu8fZ4i0YmwAVM41WmoTrciI+7J5Ax41iaZkzxrFYde T6klV1w3jsQ== X-Gm-Gg: ASbGncvugNuix9ugo68al6IXuRbZ6ZUAk53p7Rg4360Su5iXQZpqd6p0Gj+wFqTbvHF MEhh/mxJXtdf47JzojvRHhB34DBbwAcIuGAUrhlcAgcfGssTiEZRXt7nmDr61/zLW4k+r1azghJ 3lfjr09IaJQOpvbuWcDPDW4TSoTXnw52oFglSuCgxY9VadE47c2pq3azE9RaXl5YEzF7rgmsqXR Hvj4VZ3YDntD8i7/KOVWje2GFcosWpuUBHt4Nzoe1G84K0MZzsxRoH8DTwJTgpfNf9k140e2IkM t0fjulyMOuldYY/6nWvpd8m3lhdzyuW7pUw3saO953pKBrhfQrEN1G10dRYk4ChwVbZRO7xHZT0 VPAbnbxcZYHhUWcIKIm+pUFQgn0PSIzhO0bnuDEn7v89Tjd1zjBVXKBLheXUNZyPlfxHOr1GNWJ gnJuuSGO5r5PphgQ/dcedz+GZ4G+lVJfTd5PhOjcuWZYQ= X-Google-Smtp-Source: 
AGHT+IGj/RZIjav8CCq0VUnatmGH+py6HOaa0NVdgzVsL0LeYJoCeBhWny/xhFaB2PnT0N0oVITBkw== X-Received: by 2002:a5d:5f52:0:b0:3ff:d5c5:6b03 with SMTP id ffacd0b85a97d-4255781466fmr1134196f8f.35.1759280567800; Tue, 30 Sep 2025 18:02:47 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 15/24] acpi/ged: Notify OSPM of CPU administrative state changes via GED Date: Wed, 1 Oct 2025 01:01:18 +0000 Message-Id: <20251001010127.3092631-16-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::32c; 
envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x32c.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281130561116600 Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

When vCPUs are administratively enabled or disabled, the guest OSPM must
be notified so it can coordinate the corresponding operational
transitions and preserve system stability.

When a CPU is administratively enabled, GED raises a Device Check event.
OSPM then uses the ACPI _EVT handler to identify the CPU device and
evaluates its _STA, ensuring the CPU is identified, registered with the
Linux device model, enabled in the guest kernel, and made available to
the scheduler.

When a CPU is administratively disabled, GED raises an Eject Request
event. OSPM again uses the ACPI _EVT handler to identify the CPU device
and evaluates its _STA, marking the CPU absent. This allows OSPM to
invoke the _EJ0 path, gracefully offload tasks, and shut down state
before removal. Without this coordination, CPUs may be forcefully
removed, risking state loss or kernel instability.

Platform code (e.g. Arm virt machine) calls PowerStateHandler hooks,
which in turn drive the GED callbacks. Those callbacks use ACPI events to
reflect the administrative change and let OSPM orchestrate the
operational transition.
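For illustration only, the pending-event bookkeeping behind this flow can be sketched in standalone C. This is a simplified model, not code from the series: the Sketch* types and sketch_* functions are hypothetical stand-ins for AcpiCpuOspmState and its callbacks, GED IRQ delivery is left out, and clearing both flags in one call is a simplification of the per-flag write-1-to-clear bits. The search loop follows the ACPI_GET_NEXT_CPU_WITH_EVENT_CMD handling in cpu_ospm_interface.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for AcpiCpuOspmStateStatus. */
typedef struct {
    bool devchk_pending;   /* set when a CPU is administratively enabled */
    bool ejrqst_pending;   /* set when a CPU is administratively disabled */
} SketchCpuStatus;

/* Hypothetical, trimmed-down stand-in for AcpiCpuOspmState. */
typedef struct {
    uint32_t selector;     /* CPU index currently selected by OSPM */
    uint32_t dev_count;    /* number of possible CPUs */
    SketchCpuStatus *devs; /* per-CPU pending-event flags */
} SketchCpuState;

/* Administrative enable: record a pending Device Check for CPU idx. */
void sketch_device_check(SketchCpuState *st, uint32_t idx)
{
    assert(idx < st->dev_count);
    st->devs[idx].devchk_pending = true;
}

/* Administrative disable: record a pending Eject Request for CPU idx. */
void sketch_eject_request(SketchCpuState *st, uint32_t idx)
{
    assert(idx < st->dev_count);
    st->devs[idx].ejrqst_pending = true;
}

/* Simplification: clear both pending flags of the selected CPU at once. */
void sketch_clear_events(SketchCpuState *st)
{
    assert(st->selector < st->dev_count);
    st->devs[st->selector].devchk_pending = false;
    st->devs[st->selector].ejrqst_pending = false;
}

/*
 * Starting at the current selector, look for the next CPU with a pending
 * event, wrapping around at most once. On success the selector is moved
 * to that CPU and true is returned; otherwise the selector is unchanged.
 */
bool sketch_next_cpu_with_event(SketchCpuState *st)
{
    uint32_t iter = st->selector;

    assert(st->dev_count > 0 && iter < st->dev_count);
    do {
        const SketchCpuStatus *cdev = &st->devs[iter];

        if (cdev->devchk_pending || cdev->ejrqst_pending) {
            st->selector = iter;
            return true;
        }
        iter = iter + 1 < st->dev_count ? iter + 1 : 0;
    } while (iter != st->selector);

    return false;
}
```

In this model the guest-side scan method would repeatedly call sketch_next_cpu_with_event(), notify the selected CPU device, and then clear the flags it has handled.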
Signed-off-by: Salil Mehta --- hw/acpi/generic_event_device.c | 91 ++++++++++++++++++++++++++ hw/arm/virt.c | 9 ++- include/hw/acpi/acpi_dev_interface.h | 1 + include/hw/acpi/generic_event_device.h | 6 ++ include/hw/arm/virt.h | 1 + 5 files changed, 107 insertions(+), 1 deletion(-) diff --git a/hw/acpi/generic_event_device.c b/hw/acpi/generic_event_device.c index 95682b79a2..4fbf5aaa20 100644 --- a/hw/acpi/generic_event_device.c +++ b/hw/acpi/generic_event_device.c @@ -23,11 +23,13 @@ #include "migration/vmstate.h" #include "qemu/error-report.h" #include "system/runstate.h" +#include "hw/powerstate.h" =20 static const uint32_t ged_supported_events[] =3D { ACPI_GED_MEM_HOTPLUG_EVT, ACPI_GED_PWR_DOWN_EVT, ACPI_GED_NVDIMM_HOTPLUG_EVT, + ACPI_GED_CPU_POWERSTATE_EVT, ACPI_GED_CPU_HOTPLUG_EVT, ACPI_GED_PCI_HOTPLUG_EVT, }; @@ -112,6 +114,9 @@ void build_ged_aml(Aml *table, const char *name, Hotplu= gHandler *hotplug_dev, aml_append(if_ctx, aml_call0(MEMORY_DEVICES_CONTAINER "." MEMORY_SLOT_SCAN_METHOD)); break; + case ACPI_GED_CPU_POWERSTATE_EVT: + aml_append(if_ctx, aml_call0(AML_GED_EVT_CPUPS_SCAN_METHOD= )); + break; case ACPI_GED_CPU_HOTPLUG_EVT: aml_append(if_ctx, aml_call0(AML_GED_EVT_CPU_SCAN_METHOD)); break; @@ -302,12 +307,57 @@ static void acpi_ged_unplug_cb(HotplugHandler *hotplu= g_dev, } } =20 +static void +acpi_ged_pre_poweron_cb(PowerStateHandler *handler, DeviceState *dev, + Error **errp) +{ + AcpiGedState *s =3D ACPI_GED(handler); + + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + acpi_cpu_device_check_cb(&s->cpuospm_state, dev, + ACPI_CPU_POWERSTATE_STATUS, errp); + } else { + error_setg(errp, "acpi: poweron transition on unsupported device" + " type %s", object_get_typename(OBJECT(dev))); + } +} + +static void +acpi_ged_request_poweroff_cb(PowerStateHandler *handler, DeviceState *dev, + Error **errp) +{ + AcpiGedState *s =3D ACPI_GED(handler); + + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + acpi_cpu_eject_request_cb(&s->cpuospm_state, dev, 
+ ACPI_CPU_POWERSTATE_STATUS, errp); + } else { + error_setg(errp, "acpi: poweroff transition request for unsupporte= d" + " device type: %s", object_get_typename(OBJECT(dev))); + } +} + +static void +acpi_ged_post_poweroff_cb(PowerStateHandler *handler, DeviceState *dev, + Error **errp) +{ + AcpiGedState *s =3D ACPI_GED(handler); + + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + acpi_cpu_eject_cb(&s->cpuospm_state, dev, errp); + } else { + error_setg(errp, "acpi: post poweroff handling on unsupported devi= ce" + " type %s", object_get_typename(OBJECT(dev))); + } +} + static void acpi_ged_ospm_status(AcpiDeviceIf *adev, ACPIOSTInfoList ***li= st) { AcpiGedState *s =3D ACPI_GED(adev); =20 acpi_memory_ospm_status(&s->memhp_state, list); acpi_cpu_ospm_status(&s->cpuhp_state, list); + acpi_cpus_ospm_status(&s->cpuospm_state, list); } =20 static void acpi_ged_send_event(AcpiDeviceIf *adev, AcpiEventStatusBits ev) @@ -322,6 +372,8 @@ static void acpi_ged_send_event(AcpiDeviceIf *adev, Acp= iEventStatusBits ev) sel =3D ACPI_GED_PWR_DOWN_EVT; } else if (ev & ACPI_NVDIMM_HOTPLUG_STATUS) { sel =3D ACPI_GED_NVDIMM_HOTPLUG_EVT; + } else if (ev & ACPI_CPU_POWERSTATE_STATUS) { + sel =3D ACPI_GED_CPU_POWERSTATE_EVT; } else if (ev & ACPI_CPU_HOTPLUG_STATUS) { sel =3D ACPI_GED_CPU_HOTPLUG_EVT; } else if (ev & ACPI_PCI_HOTPLUG_STATUS) { @@ -379,6 +431,24 @@ static const VMStateDescription vmstate_cpuhp_state = =3D { } }; =20 +static bool cpuospm_needed(void *opaque) +{ + MachineClass *mc =3D MACHINE_GET_CLASS(qdev_get_machine()); + + return mc->has_online_capable_cpus; +} + +static const VMStateDescription vmstate_cpuospm_state =3D { + .name =3D "acpi-ged/cpu-ospm", + .version_id =3D 1, + .minimum_version_id =3D 1, + .needed =3D cpuospm_needed, + .fields =3D (VMStateField[]) { + VMSTATE_CPU_OSPM_STATE(cpuospm_state, AcpiGedState), + VMSTATE_END_OF_LIST() + } +}; + static const VMStateDescription vmstate_ged_state =3D { .name =3D "acpi-ged-state", .version_id =3D 1, @@ -447,6 
+517,7 @@ static const VMStateDescription vmstate_acpi_ged =3D { .subsections =3D (const VMStateDescription * const []) { &vmstate_memhp_state, &vmstate_cpuhp_state, + &vmstate_cpuospm_state, &vmstate_ghes_state, &vmstate_pcihp_state, NULL @@ -461,6 +532,8 @@ static void acpi_ged_realize(DeviceState *dev, Error **= errp) uint32_t ged_events; int i; =20 + s->cpuospm_state.acpi_dev =3D dev; + if (pcihp_state->use_acpi_hotplug_bridge) { s->ged_event_bitmap |=3D ACPI_GED_PCI_HOTPLUG_EVT; } @@ -474,6 +547,18 @@ static void acpi_ged_realize(DeviceState *dev, Error *= *errp) } =20 switch (event) { + case ACPI_GED_CPU_POWERSTATE_EVT: + /* initialize regions related to CPU OSPM interface to be used + * during notification of the power-on,off events to the OSPM + */ + memory_region_init(&s->container_cpuospm, OBJECT(dev), + ACPI_CPUOSPM_REGION_NAME, + ACPI_CPU_OSPM_IF_REG_LEN); + sysbus_init_mmio(sbd, &s->container_cpuospm); + acpi_cpu_ospm_state_interface_init(&s->container_cpuospm, + OBJECT(dev), + &s->cpuospm_state, 0); + break; case ACPI_GED_CPU_HOTPLUG_EVT: /* initialize CPU Hotplug related regions */ memory_region_init(&s->container_cpuhp, OBJECT(dev), @@ -544,6 +629,7 @@ static void acpi_ged_class_init(ObjectClass *class, con= st void *data) { DeviceClass *dc =3D DEVICE_CLASS(class); HotplugHandlerClass *hc =3D HOTPLUG_HANDLER_CLASS(class); + PowerStateHandlerClass *pshc =3D POWERSTATE_HANDLER_CLASS(class); AcpiDeviceIfClass *adevc =3D ACPI_DEVICE_IF_CLASS(class); ResettableClass *rc =3D RESETTABLE_CLASS(class); AcpiGedClass *gedc =3D ACPI_GED_CLASS(class); @@ -560,6 +646,10 @@ static void acpi_ged_class_init(ObjectClass *class, co= nst void *data) resettable_class_set_parent_phases(rc, NULL, ged_reset_hold, NULL, &gedc->parent_phases); =20 + pshc->pre_poweron =3D acpi_ged_pre_poweron_cb; + pshc->request_poweroff =3D acpi_ged_request_poweroff_cb; + pshc->post_poweroff =3D acpi_ged_post_poweroff_cb; + adevc->ospm_status =3D acpi_ged_ospm_status; adevc->send_event =3D 
acpi_ged_send_event; } @@ -573,6 +663,7 @@ static const TypeInfo acpi_ged_info =3D { .class_size =3D sizeof(AcpiGedClass), .interfaces =3D (const InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, + { TYPE_POWERSTATE_HANDLER }, { TYPE_ACPI_DEVICE_IF }, { } } diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 3980f553db..8d498708ab 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -188,6 +188,7 @@ static const MemMapEntry base_memmap[] =3D { [VIRT_PVTIME] =3D { 0x090a0000, 0x00010000 }, [VIRT_SECURE_GPIO] =3D { 0x090b0000, 0x00001000 }, [VIRT_ACPI_PCIHP] =3D { 0x090c0000, ACPI_PCIHP_SIZE }, + [VIRT_ACPI_CPUPS] =3D { 0x090d0000, ACPI_CPU_OSPM_IF_REG_LEN }, [VIRT_MMIO] =3D { 0x0a000000, 0x00000200 }, /* ...repeating for a total of NUM_VIRTIO_TRANSPORTS, each of that siz= e */ [VIRT_PLATFORM_BUS] =3D { 0x0c000000, 0x02000000 }, @@ -688,9 +689,10 @@ static inline DeviceState *create_acpi_ged(VirtMachine= State *vms) { DeviceState *dev; MachineState *ms =3D MACHINE(vms); + MachineClass *mc =3D MACHINE_GET_CLASS(ms); SysBusDevice *sbdev; int irq =3D vms->irqmap[VIRT_ACPI_GED]; - uint32_t event =3D ACPI_GED_PWR_DOWN_EVT; + uint32_t event =3D ACPI_GED_PWR_DOWN_EVT | ACPI_GED_CPU_POWERSTATE_EVT; bool acpi_pcihp; =20 if (ms->ram_slots) { @@ -711,6 +713,11 @@ static inline DeviceState *create_acpi_ged(VirtMachine= State *vms) sysbus_mmio_map_name(sbdev, ACPI_MEMHP_REGION_NAME, vms->memmap[VIRT_PCDIMM_ACPI].base); =20 + if (mc->has_online_capable_cpus) { + sysbus_mmio_map_name(sbdev, ACPI_CPUOSPM_REGION_NAME, + vms->memmap[VIRT_ACPI_CPUPS].base); + } + acpi_pcihp =3D object_property_get_bool(OBJECT(dev), ACPI_PM_PROP_ACPI_PCIHP_BRIDGE, = NULL); =20 diff --git a/include/hw/acpi/acpi_dev_interface.h b/include/hw/acpi/acpi_de= v_interface.h index 68d9d15f50..eea03ca47d 100644 --- a/include/hw/acpi/acpi_dev_interface.h +++ b/include/hw/acpi/acpi_dev_interface.h @@ -13,6 +13,7 @@ typedef enum { ACPI_NVDIMM_HOTPLUG_STATUS =3D 16, ACPI_VMGENID_CHANGE_STATUS =3D 32, ACPI_POWER_DOWN_STATUS =3D 
64, + ACPI_CPU_POWERSTATE_STATUS =3D 128, } AcpiEventStatusBits; =20 #define TYPE_ACPI_DEVICE_IF "acpi-device-interface" diff --git a/include/hw/acpi/generic_event_device.h b/include/hw/acpi/gener= ic_event_device.h index 2c5b055327..87e4e5e6ce 100644 --- a/include/hw/acpi/generic_event_device.h +++ b/include/hw/acpi/generic_event_device.h @@ -64,6 +64,7 @@ #include "hw/acpi/ghes.h" #include "hw/acpi/cpu.h" #include "hw/acpi/pcihp.h" +#include "hw/acpi/cpu_ospm_interface.h" #include "qom/object.h" =20 #define ACPI_POWER_BUTTON_DEVICE "PWRB" @@ -92,6 +93,7 @@ OBJECT_DECLARE_TYPE(AcpiGedState, AcpiGedClass, ACPI_GED) #define AML_GED_EVT_REG "EREG" #define AML_GED_EVT_SEL "ESEL" #define AML_GED_EVT_CPU_SCAN_METHOD "\\_SB.GED.CSCN" +#define AML_GED_EVT_CPUPS_SCAN_METHOD "\\_SB.GED.PSCN" /* Power State Sca= n */ =20 /* * Platforms need to specify the GED event bitmap @@ -103,6 +105,7 @@ OBJECT_DECLARE_TYPE(AcpiGedState, AcpiGedClass, ACPI_GE= D) #define ACPI_GED_NVDIMM_HOTPLUG_EVT 0x4 #define ACPI_GED_CPU_HOTPLUG_EVT 0x8 #define ACPI_GED_PCI_HOTPLUG_EVT 0x10 +#define ACPI_GED_CPU_POWERSTATE_EVT 0x20 =20 typedef struct GEDState { MemoryRegion evt; @@ -112,6 +115,7 @@ typedef struct GEDState { =20 #define ACPI_PCIHP_REGION_NAME "pcihp container" #define ACPI_MEMHP_REGION_NAME "memhp container" +#define ACPI_CPUOSPM_REGION_NAME "cpuospm container" =20 struct AcpiGedState { SysBusDevice parent_obj; @@ -121,6 +125,8 @@ struct AcpiGedState { MemoryRegion container_cpuhp; AcpiPciHpState pcihp_state; MemoryRegion container_pcihp; + AcpiCpuOspmState cpuospm_state; + MemoryRegion container_cpuospm; GEDState ged_state; uint32_t ged_event_bitmap; qemu_irq irq; diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h index 02cc311452..68081b79bb 100644 --- a/include/hw/arm/virt.h +++ b/include/hw/arm/virt.h @@ -81,6 +81,7 @@ enum { VIRT_NVDIMM_ACPI, VIRT_PVTIME, VIRT_ACPI_PCIHP, + VIRT_ACPI_CPUPS, VIRT_LOWMEMMAP_LAST, }; =20 --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 
Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759280965; cv=none; d=zohomail.com; s=zohoarc; b=bSL9A6DATRHCNS8fg1vLH0e2VQ4Q1HnMpzVHxBQ1qRPLdbnSP7Oz60JQHwPcegjVbF8A63gdqct9aoYOjHLdlEn1VHvo4A0DL8Rn4ckUB0il87TvYVKXESWq2bCM2vazkIcPMHDR8/NXLRgJfFZfS8THShZbAwRu7xEosoRqDMo= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759280965; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=3d/1PkHX+6ySvY2cHkD8QFKwUFKHxum1HSH+EnGsK+0=; b=ZOHXWRabChFfSzL1b5GsjZuQ7tBuPex8Ay4risBRhqXDRtdzDipJZ7JUdThiUoyvc7mxuhQcEm6hunHc6bB+9hOFcq9TVd9SPpuFu18NoythPUSggdeJCX9XtHn9f8jhbcB8H5kZQqrQ26KJ2aW5SUwozy2caD+rTC/8Ee3tezs= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 175928096576698.70337240863716; Tue, 30 Sep 2025 18:09:25 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGq-0005vk-5F; Tue, 30 Sep 2025 21:04:43 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lG0-0005UR-S8 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:49 -0400 Received: from mail-wm1-x329.google.com ([2a00:1450:4864:20::329]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from 
) id 1v3lFD-00089l-79 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:46 -0400 Received: by mail-wm1-x329.google.com with SMTP id 5b1f17b1804b1-46e42deffa8so58167015e9.0 for ; Tue, 30 Sep 2025 18:02:54 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280571; x=1759885371; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=3d/1PkHX+6ySvY2cHkD8QFKwUFKHxum1HSH+EnGsK+0=; b=ie+Wpi+NqHkLSNBCJGSqLtlniguWh0bA7VYS5WUGwL9sZ69Ox6v3JyHR/a2FEpTdI+ Pya+ZTDFbVE9BjH95cVdyhut95D3gcULHx0ARTvR2shfVNF3NNv2AJ9r5W3bTnrx20/V Xk7r1OOjYue/mMV7NaU1QSP5YokojK95pXwKNtAi8Is0wofx1hJ4Ue6pug94Jc5yOE5c I9gPdgTsZiXGiC6fCytE5gKeRURBsudPzWxM/qjvuCDQLPlHXl3EntcUqyzKxXXT7Msp 5x6jVMpkzftL0Xg06PcCMNBbQeW1CRgQsxjQBlgvPtjxqBN3u+akF8i7QFlNmrHwPaiE F6Bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280571; x=1759885371; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=3d/1PkHX+6ySvY2cHkD8QFKwUFKHxum1HSH+EnGsK+0=; b=dWY88QLFaR7EmySQ2SwxCgy48clKX8aQJ8ZyDYbX3Na7pDJvdudy3rRh6SpghkisqF icc6p5IEgSoivW3cNSVdUWb+pvn9pZ3l6Yh27QnNGlzII/A/8aFL3fMRfPdN7MoJegwa uEBnbIGEL2EfSPvS9mrGynX+/3jgvM/scTXOLMcoh96JRWdb5f7LQa0mHZevbmGH1WWp addgkaPfud/GFShJJ5vLsDorjgIulyAeibb2YVEyPUyVe8UfkpktqFiXQ8r0vKOJHtEb qNFaodDlcX6Z5hLCjfquMXP7dz5PFhkmTZVY48HP60Pz6+YVsMCe1T906A2pPLOeE+QP lIIA== X-Gm-Message-State: AOJu0YyfMajOEa2W9euvjtsG8+RCkjx8bvU2ZNM+4VxpfjZbjasPZv7l 6s6Q/YjBtMYF74duHxuU/i2D5pfZCYgKM4asHklpS9NMBoGw2ZpDgIpzICwYlv3KRctROLCEAz1 rWE+65x0ssQ== X-Gm-Gg: 
ASbGncvjHXDI53YJxNtwFAoIQKHGbIqmvWHf5Iolk99qpPrBfdyD6JIUkJgzdY8S7zJ NgiF8Kkp7iaudOzUn325oTCImBuvo2JOlNmfY1QG7ZklNOwFmmRMbLLXFohLVlnPWp4zMpbcr4S uDrqx+tveCMfOBMydnC3CNxEtIMEA8JvVckDmkwJ2jRXx4IblpXMyzJ5LWbnEIkmeUhCk8D95OX ij3q6uDodgBvWLgqCLnrfRLc6ERmhSB28+kJM1s3EKarErIpZ0ThWD9Vd36eNwIs3V0SLf7/7U+ X2qJh2vM7jzOcXRkwRwyPVo286yhVNYLXsE+EUUJ+ldYxgg1ZBe4ftbLNoqiAQZ4+rAs+EeIrqm 8zyxzsIM4HPW4qFklkHMjcaAiu3F9eMxg5ldYZEA14s+gX6cT5LMXFgeCB5LGbo7kouCmv55w0I h84VhRePMQZ0expnQptgvboEzOTe+X3wImCBb2p/asjsQ= X-Google-Smtp-Source: AGHT+IHRdDsT1RAOSF3kAiKK2Tu3EectRNRiOrgC1ZbY7bH5hpBDTXxVCHt05ceFQOLhIYQooXwHcA== X-Received: by 2002:a05:600c:468f:b0:46e:447d:858e with SMTP id 5b1f17b1804b1-46e612dce3dmr13937645e9.28.1759280571088; Tue, 30 Sep 2025 18:02:51 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 16/24] arm/virt/acpi: Update ACPI DSDT Tbl to include 'Online-Capable' CPUs AML Date: Wed, 1 Oct 2025 01:01:19 +0000 Message-Id: 
<20251001010127.3092631-17-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::329; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x329.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, T_FILL_THIS_FORM_SHORT=0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759280968113116600 Content-Type: text/plain; charset="utf-8" From: Salil Mehta This change emits AML in DSDT to support vCPU deferred online-capability on arm/virt. It wires the CPU OSPM coordination paths so that CPUs which are administratively disabled at boot can be brought online later under policy, providing hotplug-like functionality without claiming full hotplug support. The AML connects the CPUS scan method to a GED handler so QEMU and the guest OSPM can coordinate CPU add/remove while the VM is running (e.g. device-check, eject-request, _EJ0, CPU scan, _OST status reporting). 
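For reviewers following the GED side of this wiring, here is a minimal sketch of how a raised status bit could be mapped to a single GED event selector, in the same priority order as the else-if chain that this series extends in acpi_ged_send_event(). It is an illustration under stated assumptions, not the QEMU implementation: all SKETCH_* names and sketch_ged_select() are hypothetical, the values 16, 64, 128, 0x4, 0x8, 0x10 and 0x20 are taken from the hunks in this series, 0x2 is inferred from the _EVT excerpt in this commit message, and the remaining values are assumed.

```c
#include <assert.h>
#include <stdint.h>

/* Status bits raised towards the GED. */
enum {
    SKETCH_PCI_HOTPLUG_STATUS    = 2,    /* assumed value */
    SKETCH_CPU_HOTPLUG_STATUS    = 4,    /* assumed value */
    SKETCH_MEMORY_HOTPLUG_STATUS = 8,    /* assumed value */
    SKETCH_NVDIMM_HOTPLUG_STATUS = 16,
    SKETCH_POWER_DOWN_STATUS     = 64,
    SKETCH_CPU_POWERSTATE_STATUS = 128,  /* added by this series */
};

/* Event selector bits that the guest reads back in _EVT. */
enum {
    SKETCH_GED_MEM_HOTPLUG_EVT    = 0x1,   /* assumed value */
    SKETCH_GED_PWR_DOWN_EVT       = 0x2,   /* inferred from the _EVT excerpt */
    SKETCH_GED_NVDIMM_HOTPLUG_EVT = 0x4,
    SKETCH_GED_CPU_HOTPLUG_EVT    = 0x8,
    SKETCH_GED_PCI_HOTPLUG_EVT    = 0x10,
    SKETCH_GED_CPU_POWERSTATE_EVT = 0x20,  /* added by this series */
};

/*
 * Pick exactly one selector bit for a status word. The power-state check
 * sits before the CPU hotplug check, as in the hunk from this series.
 * Returns 0 when no known status bit is set.
 */
static uint32_t sketch_ged_select(uint32_t ev)
{
    if (ev & SKETCH_MEMORY_HOTPLUG_STATUS) {
        return SKETCH_GED_MEM_HOTPLUG_EVT;
    } else if (ev & SKETCH_POWER_DOWN_STATUS) {
        return SKETCH_GED_PWR_DOWN_EVT;
    } else if (ev & SKETCH_NVDIMM_HOTPLUG_STATUS) {
        return SKETCH_GED_NVDIMM_HOTPLUG_EVT;
    } else if (ev & SKETCH_CPU_POWERSTATE_STATUS) {
        return SKETCH_GED_CPU_POWERSTATE_EVT;
    } else if (ev & SKETCH_CPU_HOTPLUG_STATUS) {
        return SKETCH_GED_CPU_HOTPLUG_EVT;
    } else if (ev & SKETCH_PCI_HOTPLUG_STATUS) {
        return SKETCH_GED_PCI_HOTPLUG_EVT;
    }
    return 0;
}
```

Because only one selector is chosen per call, a status word carrying both the power-state and the CPU hotplug bits resolves to the power-state event in this sketch.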
This change also fixes an ACPI namespace load error:

  AE_NOT_FOUND resolving \_SB.GED.PSCN

Error excerpt:

  [ 0.070518] ACPI BIOS Error (bug): Object does not exist: GED_
  [ 0.071457] ACPI BIOS Error (bug): Could not resolve symbol [\_SB.GED.PSCN],
  [ 0.073084] ACPI Error: AE_NOT_FOUND, During name lookup/catalog

The root cause was build order and naming: the PSCN handler must be created
under \_SB.GED using a short ACPI 'NameSeg', and referenced elsewhere by its
fully qualified path. The GED device (and PSCN) are now defined before the CPUS
AML, preventing the early lookup failure.

Notes:
* CPU enumeration remains from MADT (GICC). CPU0 is Enabled; other CPUs may be
  Disabled but Online-Capable.
* Policy (which CPUs start disabled, later enabled) is administrative and not
  decided by OSPM.

Tested: boot with EDK2/ACPI; no AE_NOT_FOUND for \_SB.GED.PSCN; generic CPU
devices register; sysfs topology group warnings do not occur.

DSDT.dsl (Not Working)
----------------------
DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPC ", 0x00000001)
{
Scope (\_SB)
{
Scope (_SB)
{
Device (\_SB.CPUR)
{
[...]
Device (\_SB.CPUS)
{
Name (_HID, "ACPI0010")
Name (_CID, EisaId ("PNP0A05"))
Method (CTFY, 2, NotSerialized)
{
[...]
Method (CSTA, 1, Serialized)
{
[...]
Method (CEJ0, 1, Serialized)
{
[...]
Method (CSCN, 0, Serialized)
{
[...]
Method (COST, 4, Serialized)
{
[...]
Device (C000)
{
[...]
Device (C001)
{
[...]
Device (C002)
{
[...]
Device (C003)
{
[...]
Device (C004)
{
[...]
Device (C005)
{
}
}

Method (\_SB.GED.PSCN, 0, NotSerialized)
{
\_SB.CPUS.CSCN ()
}

Device (COM0)
{
[...]

Device (\_SB.GED)
{
Name (_HID, "ACPI0013")
Name (_UID, "GED")
Name (_CRS, ResourceTemplate ()
{
[...]
OperationRegion (EREG, SystemMemory, 0x09080000, 0x04)
Field (EREG, DWordAcc, NoLock, WriteAsZeros)
{
[...]
Method (_EVT, 1, Serialized)
{
Local0 = ESEL
If (((Local0 & 0x02) == 0x02))
{
Notify (PWRB, 0x80)
}

If (((Local0 & 0x08) == 0x08))
{
\_SB.GED.PSCN ()
}
}
}
Device (PWRB)
{
[...]
}
}

DSDT.dsl (Working)
------------------
DefinitionBlock ("", "DSDT", 2, "BOCHS ", "BXPC ", 0x00000001)
{
Scope (\_SB)
{
Device (\_SB.GED)
{
Name (_HID, "ACPI0013")
Name (_UID, "GED")
Name (_CRS, ResourceTemplate ()
[...]
Method (_EVT, 1, Serialized)
{
Local0 = ESEL /* \_SB_.GED_.ESEL */
If (((Local0 & 0x02) == 0x02))
{
Notify (PWRB, 0x80)
}

If (((Local0 & 0x08) == 0x08))
{
\_SB.GED.PSCN ()
}
}
}

Scope (_SB)
{
Device (\_SB.CPUR)
{
[...]
Device (\_SB.CPUS)
{
Name (_HID, "ACPI0010")
Name (_CID, EisaId ("PNP0A05"))
Method (CTFY, 2, NotSerialized)
{
[...]
Method (CSTA, 1, Serialized)
{
[...]
Method (CEJ0, 1, Serialized)
{
[...]
Method (CSCN, 0, Serialized)
{
[...]
Method (COST, 4, Serialized)
{
[...]
Device (C000)
{
[...]
Device (C001)
{
[...]
Device (C002)
{
[...]
Device (C003)
{
[...]
Device (C004)
{
[...]
Device (C005)
{
}
}
Method (\_SB.GED.PSCN, 0, NotSerialized)
{
\_SB.CPUS.CSCN ()
}

Device (COM0)
{
[...]
}
}

Signed-off-by: Salil Mehta
--- hw/arm/virt-acpi-build.c | 35 +++++++++++++++++++++++++---------- 1 file changed, 25 insertions(+), 10 deletions(-) diff --git a/hw/arm/virt-acpi-build.c b/hw/arm/virt-acpi-build.c index 7c24dd6369..5e5acb3026 100644 --- a/hw/arm/virt-acpi-build.c +++ b/hw/arm/virt-acpi-build.c @@ -931,6 +931,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms) VirtMachineClass *vmc = VIRT_MACHINE_GET_CLASS(vms); Aml *scope, *dsdt; MachineState *ms = MACHINE(vms); + MachineClass *mc = MACHINE_GET_CLASS(ms); const MemMapEntry *memmap = vms->memmap; const int *irqmap = vms->irqmap; AcpiTable table = { .sig = "DSDT", .rev = 2, .oem_id = vms->oem_id, @@ -946,7 +947,30 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, VirtMachineState *vms) * the RTC ACPI device at all when using UEFI.
*/ scope =3D aml_scope("\\_SB"); - acpi_dsdt_add_cpus(scope, vms); + if (vms->acpi_dev) { + build_ged_aml(scope, "\\_SB."GED_DEVICE, + HOTPLUG_HANDLER(vms->acpi_dev), + irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE, AML_SYSTEM_MEM= ORY, + memmap[VIRT_ACPI_GED].base); + } else { + acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO], + (irqmap[VIRT_GPIO] + ARM_SPI_BASE)); + } + + /* + * If the machine supports bringing administratively disabled vCPUs + * deferred-online under policy, build AML to coordinate the addition = and + * removal of CPUs gracefully with the OSPM while the VM is running. T= his + * includes events such as device-check, eject-request, ejection (_EJ0= ), + * CPU scan, _OST status reporting, etc. + */ + if (vms->acpi_dev && mc->has_online_capable_cpus) { + acpi_build_cpus_aml(scope, memmap[VIRT_ACPI_CPUPS].base, "\\_SB", + AML_GED_EVT_CPUPS_SCAN_METHOD); + } else { + acpi_dsdt_add_cpus(scope, vms); + } + acpi_dsdt_add_uart(scope, &memmap[VIRT_UART0], (irqmap[VIRT_UART0] + ARM_SPI_BASE), 0); if (vms->second_ns_uart_present) { @@ -961,15 +985,6 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, Vir= tMachineState *vms) (irqmap[VIRT_MMIO] + ARM_SPI_BASE), 0, NUM_VIRTIO_TRANSPORTS); acpi_dsdt_add_pci(scope, memmap, irqmap[VIRT_PCIE] + ARM_SPI_BASE, vms= ); - if (vms->acpi_dev) { - build_ged_aml(scope, "\\_SB."GED_DEVICE, - HOTPLUG_HANDLER(vms->acpi_dev), - irqmap[VIRT_ACPI_GED] + ARM_SPI_BASE, AML_SYSTEM_MEM= ORY, - memmap[VIRT_ACPI_GED].base); - } else { - acpi_dsdt_add_gpio(scope, &memmap[VIRT_GPIO], - (irqmap[VIRT_GPIO] + ARM_SPI_BASE)); - } =20 if (vms->acpi_dev) { uint32_t event =3D object_property_get_uint(OBJECT(vms->acpi_dev), --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org 
(lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281154228532.7471873820015; Tue, 30 Sep 2025 18:12:34 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGQ-0005iv-MG; Tue, 30 Sep 2025 21:04:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lG0-0005VB-PY for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:49 -0400 Received: from mail-wr1-x42e.google.com ([2a00:1450:4864:20::42e]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFD-0008AE-5H for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:48 -0400 Received: by mail-wr1-x42e.google.com with SMTP id ffacd0b85a97d-414f48bd5a7so4359823f8f.2 for ; Tue, 30 Sep 2025 18:02:56 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280573; x=1759885373; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=riMmSmFMW9DU05Ls6u53RzcUxCDMOS/nOG/ia4hBxSM=; b=fSdVlBkHwuIfLyB9mzyq8Npdm5k7CFChluDY5nwfo46RF2fJxMnzQdDjuwVqcTb0J5 UexUe0gB7H9odjvd+CnpFOFZAMXJUgRK4h1rvNfqi6CblRNnXX5CTVjaZZKZvxntewkB 0VldUVPYndC+xeEUw0z/Edl8F5r9GwmHNfiZLFNQfeBS02vPfCIu++Q9tY3WYbZmXvXj 63Y0DISlZA67EuyPCirY4NBmn/Xt+eBM13Vk3ARMzkGay6+HJyiAg89eVX8awFtt8VmN sOvkUFv9eDgT1vG8b1EEbdmC4Da++HovGMA7drQX5oho7+uyajAWPNAM9I+vPwdxoGi4 7ooQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280573; x=1759885373; 
h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=riMmSmFMW9DU05Ls6u53RzcUxCDMOS/nOG/ia4hBxSM=; b=gAz05tAlawOwSis4lL8Xc26oavwIZQBqZtkpCL7PF/B8RrJTaCeshriwF4FBMBs7NR M2qydFewSVeXzF5vNdiCcgwTfdaoG4gECFdeTbEvGFgzL++nXK9wajDwAMEe7+dEJhND IWY4pTzPm6bZohFaJw82zKR2Ml1HlcznFzwpqr3tSRl8he1V/cUCA8AsXoLQKOE30nXC WAvKD6PZliPsW6XcGQGmoHbkXc2k30JecPIBWQ4+kiJ/wvVxWE2SXUyupoN7f8bWZQhw URM74e2xzkZ8Kozt0bzFekWwnaNFSFibZBUCy6cj9YjTgdZeKli0ralMtFxSyjQkjF+D 3/1A== X-Gm-Message-State: AOJu0YzA1IJ+zdoyR68mGzXaXyd92UGJjO36efBVv4DdfV1INHWh9mi8 JDlW+xV9ZdpZRvyehXSB8ni5lJ39GRXwyOLqFR4H2kpfIpqAksFy22HmukkLzAxHp9imluI7g69 d4hJO0iya2A== X-Gm-Gg: ASbGncuuTm+CLVHVM905WvpPo3S0g6MXHEGkixtlinoiGrPqz+lswH20grYgBaVgb81 QUdx5BOPyLWHdWIKRZ2MSVeOQNzl4lKWPoqsYZmRy1I5FZ8Tpp2eOZC82zfW7RDUuFZ6nc/AQkX DclyMkjU1tFU9Uiwu3h99NeV2/SORp/HKzYue8U0O3DbEwQITJYqQEFBXYQROjGAqNaUlWFLWVX uXhz1+4vFlrP2/UJYZ6IG+3VwJoMVxKjqbx3GGNjLFdJsvvbi/MslI2DPSCaxFYxmIOEK2DHXOb AMc3IPPU3zYFCPe8Agewj48bS8kx18qif03mCeywbX6RakysnuPcww2DXXKeZpvrRVXspm1aNGW 38BbVGoWgBWRjwS/KDd9FT9/9PfZDWXEMVbbSZKDhtUkUbC2dSCWerxWqySpacbWj1I/xxgrXR6 2alCBn8pi9HjuWmbRnF6e0Z99d43vguOtPjwVA2JVCZ+k= X-Google-Smtp-Source: AGHT+IEeV6YRYEjIM1tPPHAaCG+k1453yOkfUWb6kpDdBDoPwpEYmXNOWuISxfXZE6pxJiI/dnNzaw== X-Received: by 2002:a05:6000:18a3:b0:3ec:dfe5:17d0 with SMTP id ffacd0b85a97d-425577edd2dmr869712f8f.9.1759280572888; Tue, 30 Sep 2025 18:02:52 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, 
rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 17/24] hw/arm/virt, acpi/ged: Add PowerStateHandler hooks for runtime CPU state changes Date: Wed, 1 Oct 2025 01:01:20 +0000 Message-Id: <20251001010127.3092631-18-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42e; envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42e.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: fail (Header signature does not verify) 
X-ZM-MESSAGEID: 1759281154833116600 Content-Type: text/plain; charset="utf-8" From: Salil Mehta

An administrative power-state property was introduced earlier in this
patch-set, but QEMU currently lacks a way for platforms to react to such
control (e.g. 'device_set ... admin-state=disable'). These host-driven changes
must drive the corresponding operational transitions and involve OSPM where
appropriate.

Summary of Handling:
====================
Since vCPUs are always enumerated as present, an administrative enable must
ensure they also become operationally usable. This requires realizing the vCPU
(if it is being enabled for the first time) or unparking it otherwise,
re-registering it with the VMState handler, adding it back to the active vCPU
list, and kicking its sleeping thread into KVM so that it can transition to the
guest-runnable state once the guest kernel issues PSCI CPU_ON. The GICC
interface must also be marked accessible, and OSPM must be notified through a
Device Check event so that _EVT/_STA evaluation can identify the CPU, register
it with the Linux device model, enable it in the guest kernel, and make it
available to the scheduler.

When a CPU is administratively disabled, the virt machine invokes its
PowerStateHandler callbacks to request that the vCPU be powered off. As a
consequence, GED raises an Eject Request event so that OSPM can invoke _EJ0 to
offload tasks and shut down state before removal. The vCPU is then quiesced,
unregistered from VMState, and removed from the active vCPU list; its sleeping
thread is kicked out of KVM and re-blocked inside QEMU, and the vCPU is parked
in userspace. Parking helps reduce lock contention inside the kernel.

The callbacks introduced as part of this patch-set handle the above flows: they
avoid forceful removal without kernel coordination, keep firmware and GIC
access in sync, and integrate with the existing ACPI GED-based signaling.
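The enable and disable flows summarised above can be sketched as a small standalone state model. This is only an illustration under stated assumptions, not the code in this patch: the Sketch* types and sketch_* functions are hypothetical, index 0 stands in for the boot CPU, a single machine_ready flag stands in for the "machine ready and not in incoming migration" condition, and VMState registration, the active vCPU list and thread kicking are all folded into one 'parked' flag.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    int index;             /* index 0 stands in for the boot CPU (assumption) */
    bool realized;
    bool parked;
    bool gicc_accessible;
} SketchCpu;

typedef struct {
    unsigned boot_cpus;    /* CPU count advertised to firmware */
    bool machine_ready;    /* guest is able to receive GED notifications */
} SketchMachine;

typedef enum {
    SKETCH_REFUSED,         /* boot CPU cannot be powered off */
    SKETCH_NOTHING_TO_DO,   /* never realized, no live state to tear down */
    SKETCH_PARKED_EARLY,    /* guest not ready: park without involving OSPM */
    SKETCH_EJECT_REQUESTED, /* OSPM asked to offline the CPU and eject it */
} SketchPoweroffResult;

/* Administrative enable. Returns true when OSPM would get a Device Check. */
static bool sketch_pre_poweron(SketchMachine *m, SketchCpu *cpu)
{
    assert(m && cpu);
    if (!cpu->realized) {
        cpu->realized = true;      /* first enable: lazy realization */
    } else {
        cpu->parked = false;       /* later enables: unpark */
    }
    cpu->gicc_accessible = true;
    m->boot_cpus++;
    return m->machine_ready;
}

/* Administrative disable, step 1: decide how the request is handled. */
static SketchPoweroffResult
sketch_request_poweroff(SketchMachine *m, SketchCpu *cpu)
{
    assert(m && cpu);
    if (cpu->index == 0) {
        return SKETCH_REFUSED;
    }
    if (!cpu->realized) {
        return SKETCH_NOTHING_TO_DO;
    }
    if (!m->machine_ready) {
        cpu->parked = true;        /* early path only parks the vCPU */
        return SKETCH_PARKED_EARLY;
    }
    return SKETCH_EJECT_REQUESTED;
}

/* Administrative disable, step 2: runs after the guest has ejected the CPU. */
static void sketch_post_poweroff(SketchMachine *m, SketchCpu *cpu)
{
    assert(m && cpu);
    if (!cpu->realized) {
        return;
    }
    m->boot_cpus--;
    cpu->gicc_accessible = false;
    cpu->parked = true;
}
```

Splitting the disable path into a request step and a completion step mirrors the request_poweroff/post_poweroff pair: in this model nothing is torn down until the guest has had the chance to offline the CPU.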
Signed-off-by: Salil Mehta --- cpu-common.c | 4 +- hw/arm/virt.c | 233 ++++++++++++++++++++++++++++- include/hw/arm/virt.h | 1 + include/hw/core/cpu.h | 2 + include/hw/intc/arm_gicv3_common.h | 30 ++++ system/cpus.c | 4 +- target/arm/cpu.c | 1 + 7 files changed, 271 insertions(+), 4 deletions(-) diff --git a/cpu-common.c b/cpu-common.c index ef5757d23b..7eced58434 100644 --- a/cpu-common.c +++ b/cpu-common.c @@ -103,7 +103,9 @@ void cpu_list_remove(CPUState *cpu) } =20 QTAILQ_REMOVE_RCU(&cpus_queue, cpu, node); - cpu->cpu_index =3D UNASSIGNED_CPU_INDEX; + if (!cpu->preserve_assigned_cpu_index) { + cpu->cpu_index =3D UNASSIGNED_CPU_INDEX; + } cpu_list_generation_id++; } =20 diff --git a/hw/arm/virt.c b/hw/arm/virt.c index 8d498708ab..9a41a0682b 100644 --- a/hw/arm/virt.c +++ b/hw/arm/virt.c @@ -45,6 +45,7 @@ #include "system/device_tree.h" #include "system/numa.h" #include "system/runstate.h" +#include "system/reset.h" #include "system/tpm.h" #include "system/tcg.h" #include "system/kvm.h" @@ -91,6 +92,8 @@ #include "hw/cxl/cxl.h" #include "hw/cxl/cxl_host.h" #include "qemu/guest-random.h" +#include "hw/powerstate.h" +#include "arm-powerctl.h" =20 static GlobalProperty arm_virt_compat[] =3D { { TYPE_VIRTIO_IOMMU_PCI, "aw-bits", "48" }, @@ -1400,7 +1403,7 @@ static FWCfgState *create_fw_cfg(const VirtMachineSta= te *vms, AddressSpace *as) char *nodename; =20 fw_cfg =3D fw_cfg_init_mem_wide(base + 8, base, 8, base + 16, as); - fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, (uint16_t)ms->smp.cpus); + fw_cfg_add_i16(fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus); =20 nodename =3D g_strdup_printf("/fw-cfg@%" PRIx64, base); qemu_fdt_add_subnode(ms->fdt, nodename); @@ -1821,6 +1824,179 @@ void virt_machine_done(Notifier *notifier, void *da= ta) virt_build_smbios(vms); } =20 +static void virt_park_cpu_in_userspace(CPUState *cs) +{ + /* we don't want to migrate 'disabled' vCPU state(even if realized) */ + cpu_vmstate_unregister(cs); + /* remove from 'present' and 'enabled' list of active 
vCPUs */ + cpu_list_remove(cs); + /* ensure that other contexts do not kick us out of the parked state */ + cs->parked = true; + /* this will kick the sleeping KVM vCPUs to QEMU; releasing vCPU mutex */ + cpu_pause(cs); +} + +static void virt_unpark_cpu_in_userspace(CPUState *cs) +{ + /* disabled vCPUs lack a VMStateDescription; re-register */ + cpu_vmstate_register(cs); + /* add back to 'present' and 'enabled' list of active vCPUs */ + cpu_list_add(cs); + /* + * kick back the vCPU into action; operational power-on will happen in + * context to PSCI CPU_ON executed by the Guest. We are just enabling the + * infrastructure here and making it available to the Guest. + */ + cs->parked = false; + cpu_resume(cs); +} + +static void +virt_cpu_pre_poweron(PowerStateHandler *handler, DeviceState *dev, Error **errp) +{ + VirtMachineState *vms = VIRT_MACHINE(handler); + PowerStateHandlerClass *pshc; + CPUState *cs = CPU(dev); + + /* + * Lazy realization path: bring the CPU to a realized state the first time + * it is powered on. Saves boot time; later power-ons skip this. + */ + if (!dev->realized) { + qdev_realize(dev, NULL, errp); + } else { + /* Realized but parked 'disabled' vCPUs */ + virt_unpark_cpu_in_userspace(cs); + } + + gicv3_mark_gicc_accessible(OBJECT(vms->gic), cs->cpu_index, errp); + if (*errp) { + error_setg(errp, "couldn't mark GICC accessible for CPU %d", + cs->cpu_index); + return; + } + + /* update the firmware information for the next boot. */ + vms->boot_cpus++; + if (vms->fw_cfg) { + fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus); + } + + /* + * Notify the guest that a CPU is powered-on (_STA.Ena = 1), triggering a + * Device Check (Notify(..., 0x80)) via GED. This prompts OSPM to + * re-evaluate ACPI _STA method. + * + * Only notify after the VM is ready i.e., the guest kernel is initialized. + * For example, during boot-time '-deviceset' usage, the kernel isn't ready, + * so sending a notification is pointless.
+ */ + if (phase_check(PHASE_MACHINE_READY) && + !runstate_check(RUN_STATE_INMIGRATE)) { + pshc =3D POWERSTATE_HANDLER_GET_CLASS(vms->acpi_dev); + pshc->pre_poweron(POWERSTATE_HANDLER(vms->acpi_dev), dev, errp); + if (*errp) { + error_setg(errp, "failed to notify OSPM about CPU %d power-on", + cs->cpu_index); + return; + } + } + + /* + * Guest Kernel/OSPM will issue PSCI CPU_ON, which performs the cold s= tart + * (reset + entry state) for this CPU + */ +} + +static void +virt_cpu_request_poweroff(PowerStateHandler *handler, DeviceState *dev, + Error **errp) +{ + VirtMachineState *vms =3D VIRT_MACHINE(handler); + PowerStateHandlerClass *pshc; + ARMCPU *cpu =3D ARM_CPU(dev); + CPUState *cs =3D CPU(dev); + + if (cs->cpu_index =3D=3D first_cpu->cpu_index) { + error_setg(errp, "can't power-off boot CPU (id=3D%d [%d:%d:%d:%d]= )", + first_cpu->cpu_index, cpu->socket_id, cpu->cluster_id, + cpu->core_id, cpu->thread_id); + return; + } + + /* + * Check that we are not tearing down too early when no live state exi= sts. + * This can happen in: + * 1. Lazy device realization + * 2. Use of '-device-set' at qemu prompt + * 3. Post-migration on the destination VM + */ + if (!dev->realized) { + return; + } + + if (!phase_check(PHASE_MACHINE_READY) || + runstate_check(RUN_STATE_INMIGRATE)) { + virt_park_cpu_in_userspace(cs); + return; + } + + /* + * powering-off a CPU triggers an Eject Request (Notify(..., 0x03)) + * via GED, prompting the OSPM to invoke _EJ0 for device removal handl= ing. 
+ */ + pshc = POWERSTATE_HANDLER_GET_CLASS(vms->acpi_dev); + pshc->request_poweroff(POWERSTATE_HANDLER(vms->acpi_dev), dev, errp); + if (*errp) { + error_setg(errp, "request failed to power-off CPU %d", cs->cpu_index); + return; + } +} + +static void +virt_cpu_post_poweroff(PowerStateHandler *handler, DeviceState *dev, + Error **errp) +{ + VirtMachineState *vms = VIRT_MACHINE(handler); + PowerStateHandlerClass *pshc; + CPUState *cs = CPU(dev); + + /* + * Just in case we are here too early. Ignore admin power-off before + * realize; no live state to tear down. + */ + if (!dev->realized) { + return; + } + + /* we are here because OSPM has already offlined the CPU and issued _EJ0 */ + pshc = POWERSTATE_HANDLER_GET_CLASS(vms->acpi_dev); + pshc->post_poweroff(POWERSTATE_HANDLER(vms->acpi_dev), dev, errp); + if (*errp) { + error_setg(errp, "failed to complete CPU %d power-off", cs->cpu_index); + return; + } + + vms->boot_cpus--; + if (vms->fw_cfg) { + fw_cfg_modify_i16(vms->fw_cfg, FW_CFG_NB_CPUS, vms->boot_cpus); + } + + gicv3_mark_gicc_inaccessible(OBJECT(vms->gic), cs->cpu_index, errp); + if (*errp) { + error_setg(errp, "couldn't mark GICC inaccessible for CPU %d", + cs->cpu_index); + return; + } + + /* + * A 'disabled' vCPU is quiesced; now park it in userspace. For KVM, + * this unblocks the sleeping vCPU thread and re-blocks it inside QEMU, + * reducing KVM vCPU lock contention.
+ */ + virt_park_cpu_in_userspace(cs); +} + static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx) { uint8_t clustersz; @@ -3218,6 +3394,53 @@ static HotplugHandler *virt_machine_get_hotplug_hand= ler(MachineState *machine, return NULL; } =20 +static void +virt_machine_device_request_poweroff(PowerStateHandler *handler, + DeviceState *dev, + Error **errp) +{ + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + virt_cpu_request_poweroff(handler, dev, errp); + } else { + error_setg(errp, "power-off request for unsupported device-type: %= s", + object_get_typename(OBJECT(dev))); + } +} + +static void +virt_machine_device_post_poweroff(PowerStateHandler *handler, DeviceState = *dev, + Error **errp) +{ + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + virt_cpu_post_poweroff(handler, dev, errp); + } else { + error_setg(errp, "can't complete power-off, unsupported device-typ= e %s", + object_get_typename(OBJECT(dev))); + } +} + +static void +virt_machine_device_pre_poweron(PowerStateHandler *handler, DeviceState *d= ev, + Error **errp) +{ + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + virt_cpu_pre_poweron(handler, dev, errp); + } else { + error_setg(errp, "can't prepare power-on, unsupported device-type = %s", + object_get_typename(OBJECT(dev))); + } +} + +static void * +virt_machine_powerstate_handler(MachineState *machine, DeviceState *dev) +{ + if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) { + return (void *)POWERSTATE_HANDLER(machine); + } + + return NULL; +} + /* * for arm64 kvm_type [7-0] encodes the requested number of bits * in the IPA address space @@ -3294,6 +3517,7 @@ static void virt_machine_class_init(ObjectClass *oc, = const void *data) { MachineClass *mc =3D MACHINE_CLASS(oc); HotplugHandlerClass *hc =3D HOTPLUG_HANDLER_CLASS(oc); + PowerStateHandlerClass *pshc =3D POWERSTATE_HANDLER_CLASS(oc); static const char * const valid_cpu_types[] =3D { #ifdef CONFIG_TCG ARM_CPU_TYPE_NAME("cortex-a7"), @@ -3358,7 +3582,13 @@ static void 
virt_machine_class_init(ObjectClass *oc,= const void *data) hc->unplug_request =3D virt_machine_device_unplug_request_cb; hc->unplug =3D virt_machine_device_unplug_cb; =20 + /* virt machine device powerstate handlers & callbacks */ + assert(!mc->get_powerstate_handler); mc->has_online_capable_cpus =3D true; + mc->get_powerstate_handler =3D virt_machine_powerstate_handler; + pshc->request_poweroff =3D virt_machine_device_request_poweroff; + pshc->post_poweroff =3D virt_machine_device_post_poweroff; + pshc->pre_poweron =3D virt_machine_device_pre_poweron; =20 mc->nvdimm_supported =3D true; mc->smp_props.clusters_supported =3D true; @@ -3560,6 +3790,7 @@ static const TypeInfo virt_machine_info =3D { .instance_init =3D virt_instance_init, .interfaces =3D (const InterfaceInfo[]) { { TYPE_HOTPLUG_HANDLER }, + { TYPE_POWERSTATE_HANDLER }, { } }, }; diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h index 68081b79bb..0898e8eed3 100644 --- a/include/hw/arm/virt.h +++ b/include/hw/arm/virt.h @@ -166,6 +166,7 @@ struct VirtMachineState { MemMapEntry *memmap; char *pciehb_nodename; const int *irqmap; + uint16_t boot_cpus; int fdt_size; uint32_t clock_phandle; uint32_t gic_phandle; diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h index 2ee202a8a5..ccf5588011 100644 --- a/include/hw/core/cpu.h +++ b/include/hw/core/cpu.h @@ -485,6 +485,7 @@ struct CPUState { bool created; bool stop; bool stopped; + bool parked; =20 /* Should CPU start in powered-off state? */ bool start_powered_off; @@ -549,6 +550,7 @@ struct CPUState { =20 /* TODO Move common fields from CPUArchState here. 
*/ int cpu_index; + bool preserve_assigned_cpu_index; int cluster_index; uint32_t tcg_cflags; uint32_t halted; diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3= _common.h index bbf899184e..a8a84c4687 100644 --- a/include/hw/intc/arm_gicv3_common.h +++ b/include/hw/intc/arm_gicv3_common.h @@ -353,4 +353,34 @@ static inline bool gicv3_gicc_accessible(Object *obj, = int cpu) =20 return value; } + +/** + * gicv3_mark_gicc_accessible: + * @obj: QOM object implementing the GICv3 device + * @cpu: Index of the vCPU to mark as GICC-accessible + * @errp: Pointer to an Error* for reporting failures + * + * Marks GICv3CPUState::gicc_accessible as accessible and available for us= e. + */ +static inline void +gicv3_mark_gicc_accessible(Object *obj, int cpu, Error **errp) +{ + g_autofree gchar *propname =3D g_strdup_printf("gicc-accessible[%d]", = cpu); + object_property_set_bool(obj, propname, true, errp); +} + +/** + * gicv3_mark_gicc_inaccessible: + * @obj: QOM object implementing the GICv3 device + * @cpu: Index of the vCPU to mark as GICC-inaccessible + * @errp: Pointer to an Error* for reporting failures + * + * Marks GICv3CPUState::gicc_accessible as inaccessible and unavailable fo= r use. 
+ */ +static inline void +gicv3_mark_gicc_inaccessible(Object *obj, int cpu, Error **errp) +{ + g_autofree gchar *propname =3D g_strdup_printf("gicc-accessible[%d]", = cpu); + object_property_set_bool(obj, propname, false, errp); +} #endif diff --git a/system/cpus.c b/system/cpus.c index 256723558d..0545aaaa0f 100644 --- a/system/cpus.c +++ b/system/cpus.c @@ -89,7 +89,7 @@ bool cpu_thread_is_idle(CPUState *cpu) if (cpu->stop || !cpu_work_list_empty(cpu)) { return false; } - if (cpu_is_stopped(cpu)) { + if (cpu_is_stopped(cpu) || cpu->parked) { return true; } if (!cpu->halted || cpu_has_work(cpu)) { @@ -327,7 +327,7 @@ bool cpu_can_run(CPUState *cpu) if (cpu->stop) { return false; } - if (cpu_is_stopped(cpu)) { + if (cpu_is_stopped(cpu) || cpu->parked) { return false; } return true; diff --git a/target/arm/cpu.c b/target/arm/cpu.c index a5906d1672..0ceaf69092 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -1502,6 +1502,7 @@ static void arm_cpu_initfn(Object *obj) } =20 CPU(obj)->thread_id =3D 0; + CPU(obj)->preserve_assigned_cpu_index =3D true; } =20 /* --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 17592812175471007.580214819844; Tue, 30 Sep 2025 18:13:37 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGQ-0005ij-KH; Tue, 30 Sep 2025 21:04:17 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFw-0005Tt-40 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:44 -0400 Received: from mail-wm1-x335.google.com 
([2a00:1450:4864:20::335]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFF-0008Bc-2T for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:43 -0400 Received: by mail-wm1-x335.google.com with SMTP id 5b1f17b1804b1-46e34052bb7so65872745e9.2 for ; Tue, 30 Sep 2025 18:02:57 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280575; x=1759885375; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=7h307QOx44Ki9SH92ohHJPER/ZS3m1zJos18uaOTvw0=; b=C6QyghUrCxfJ+KRNmYgHfY+dcYHY7SAjydkStHZg5q7TQJWhfaKqBJUZkCk2+jqemb xWONMyeZMz8np6A2l6cHc9y2kc3vhARRLm02uk/z9/g1eJm0xe7gPuairKURQUWEqqj8 Ui4ujIa2StkQweL/OVzKtkxSQOUlY9vGC6RWEA4thzgiBmgvvDX+S5OrZbQ1FlsAGpt5 EaT7ToL+KGnV8bCUmxoZD17DzE7pxlpqntl2ouSvb8lIMByyh4xQANZStIg7wOIB8TBd CgaJ8kK8UM7DKhBt31oiwxg4ArtbS8UW3YsTtGxtUpg/NLg7Fx7vj8BXwKrMqmKv37yL f4Mg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280575; x=1759885375; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=7h307QOx44Ki9SH92ohHJPER/ZS3m1zJos18uaOTvw0=; b=kqoL4aMSgHSMC+YYMBFWv4KxIB5qCCwv25FFKLC1P61tnNMUVoLT8I8QegHNusI/t7 wVc87Q+/o2xQYDXQfzpGvpD5R30KfNKDXgZQ55/CM7XAP5jBH7pz3XsQUhDvk9lLalhY IDJ1EIW076qQRkxrkaWTDp+w/D91gKbuwWJaWlwqRrI5vP6Qp1Ij6bLLzTVxZQuxXiVX dD1j26fq5hS9V5LFHcVAEAle8B0osfouaBK8tjZQYYVdoP1F3jNu22hSK9UOgr7iSqWJ kWly3WIcZiitryzUziQp6bnaMoljhkknhAvo4mJ93oA/YKLKhYKkGe74yurrpvFH7kYQ gwMA== X-Gm-Message-State: 
AOJu0YzYCsrQn6wXodafiYSgmtsN0x6I0KsZ+ie5aPD67KIH+1i6KjWd S0gCb6e0LMtxIDgHMPrChJMRYwJtvdmpnIY7RD5FqENu8SCYu2Dd6lukmOW0MOnYRIOSmBNjhNr obvlJF/TMcg== X-Gm-Gg: ASbGnct1kFxyWLfbbA2X34frtKftbJ+W7kx9H5orjB4VZm9mm3fwBa0ZLBJ03LDEpOA piiyoKh2cosWIcP9v6PyMPZLijxFZWnpMEk69X+DyPcKsAOpOU9/mZMHr40jvxKkdKiLujgpXln LNmVjuJDktu8caETZTsGjrXQHxSabI6iwXshVPjvQPH2dy4e7Gy2yeVmNzDs2BfbFBZUUsl03uT FIMQ5+u3mLoBxuZNfE9RzWC2lgam3sgNqsP1OV0koSPRO19Ty3LbTvLt6uZZwW6TvCVBD8+nvrk oFE3l5kkYD4fsoccp25mV1548trODb06rBVlbTWJR7PIUyB6xOYlta5Vo2Ab7/DjthCtpSKsdrF pnSPD4BLJDZkmNSSf6oCFXcBVAWBBlYR1rWcLV1Bhx+5FNIwaqbzyFoHH8rCMrh+2VW0pryVsyl dQaJX8aj2gNGYZY5uVQ21Go2pA8MX/mh++BOcvXeeEsfM= X-Google-Smtp-Source: AGHT+IEnY0uugUJqIG/xkP/LpPRxljbK0+8Z0BcVIT+7ekPwlS5/mNaj9Fu6cCj6dxs1BSLKaSxR2g== X-Received: by 2002:a05:6000:26c3:b0:3eb:5e99:cbd3 with SMTP id ffacd0b85a97d-425577ed539mr1014530f8f.2.1759280574636; Tue, 30 Sep 2025 18:02:54 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com 
Subject: [PATCH RFC V6 18/24] target/arm/kvm, tcg: Handle SMCCC hypercall exits in VMM during PSCI_CPU_{ON, OFF} Date: Wed, 1 Oct 2025 01:01:21 +0000 Message-Id: <20251001010127.3092631-19-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::335; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x335.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001, T_SPF_HELO_TEMPERROR=0.01 autolearn=unavailable autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1759281219834116600 Content-Type: text/plain; charset="utf-8" From: Author Salil Mehta To support a vCPU hotplug-like feature, we must forward any `HVC`/`SMC` `PSCI_CPU_{ON,OFF}` hypercalls from the host KVM to QEMU for policy checks. This ensures the following when a vCPU is brought online: 1. The vCPU is actually plugged in (i.e., present). 2. The vCPU is not administratively disabled.
(Policy Checks) Implement the registration and handling of `HVC`/`SMC` hypercall exits within the VMM, ensuring that proper policy checks and control flow are enforced during the vCPU onlining and offlining processes. Co-developed-by: Jean-Philippe Brucker Signed-off-by: Jean-Philippe Brucker Signed-off-by: Salil Mehta --- target/arm/arm-powerctl.c | 27 ++++++++--- target/arm/helper.c | 2 +- target/arm/internals.h | 2 +- target/arm/kvm.c | 93 +++++++++++++++++++++++++++++++++++++ target/arm/kvm_arm.h | 14 ++++++ target/arm/meson.build | 1 + target/arm/{tcg =3D> }/psci.c | 9 ++++ target/arm/tcg/meson.build | 4 -- 8 files changed, 139 insertions(+), 13 deletions(-) rename target/arm/{tcg =3D> }/psci.c (96%) diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c index 20c70c7d6b..ab4422b261 100644 --- a/target/arm/arm-powerctl.c +++ b/target/arm/arm-powerctl.c @@ -17,6 +17,7 @@ #include "qemu/main-loop.h" #include "system/tcg.h" #include "target/arm/multiprocessing.h" +#include "hw/boards.h" =20 #ifndef DEBUG_ARM_POWERCTL #define DEBUG_ARM_POWERCTL 0 @@ -31,14 +32,17 @@ =20 CPUState *arm_get_cpu_by_id(uint64_t id) { + MachineState *ms =3D MACHINE(qdev_get_machine()); CPUState *cpu; =20 DPRINTF("cpu %" PRId64 "\n", id); =20 - CPU_FOREACH(cpu) { - ARMCPU *armcpu =3D ARM_CPU(cpu); - - if (arm_cpu_mp_affinity(armcpu) =3D=3D id) { + /* + * With vCPU standby/hotplug support, we must now check for all + * possible vCPUs + */ + CPU_FOREACH_POSSIBLE(cpu, ms->possible_cpus) { + if (cpu && (arm_cpu_mp_affinity(ARM_CPU(cpu)) =3D=3D id)) { return cpu; } } @@ -119,9 +123,18 @@ int arm_set_cpu_on(uint64_t cpuid, uint64_t entry, uin= t64_t context_id, =20 /* Retrieve the cpu we are powering up */ target_cpu_state =3D arm_get_cpu_by_id(cpuid); - if (!target_cpu_state) { - /* The cpu was not found */ - return QEMU_ARM_POWERCTL_INVALID_PARAM; + + /* Policy check: verify 'administrative' power state of target CPU */ + if (!target_cpu_state ||
!qdev_check_enabled(DEVICE(target_cpu_state))= ) { + /* + * The CPU is not plugged in or is disabled. Return the appropriate + * value, as introduced in DEN0022E PSCI 1.2 issue E. + */ + qemu_log_mask(LOG_GUEST_ERROR, + "[ARM]%s: Denying attempt to online ACPI-disabled " + "(_STA.Ena=3D0) CPU%" PRId64", needs admin action fir= st!\n", + __func__, cpuid); + return QEMU_ARM_POWERCTL_IS_OFF; } =20 target_cpu =3D ARM_CPU(target_cpu_state); diff --git a/target/arm/helper.c b/target/arm/helper.c index 0c1299ff84..814fe719da 100644 --- a/target/arm/helper.c +++ b/target/arm/helper.c @@ -9110,7 +9110,7 @@ void arm_cpu_do_interrupt(CPUState *cs) env->exception.syndrome); } =20 - if (tcg_enabled() && arm_is_psci_call(cpu, cs->exception_index)) { + if (arm_is_psci_call(cpu, cs->exception_index)) { arm_handle_psci_call(cpu); qemu_log_mask(CPU_LOG_INT, "...handled as PSCI call\n"); return; diff --git a/target/arm/internals.h b/target/arm/internals.h index 1b3d0244fd..ffd82a7ace 100644 --- a/target/arm/internals.h +++ b/target/arm/internals.h @@ -645,7 +645,7 @@ vaddr arm_adjust_watchpoint_address(CPUState *cs, vaddr= addr, int len); /* Callback function for when a watchpoint or breakpoint triggers. */ void arm_debug_excp_handler(CPUState *cs); =20 -#if defined(CONFIG_USER_ONLY) || !defined(CONFIG_TCG) +#if defined(CONFIG_USER_ONLY) static inline bool arm_is_psci_call(ARMCPU *cpu, int excp_type) { return false; diff --git a/target/arm/kvm.c b/target/arm/kvm.c index 1962eb29b2..98eb6db9ed 100644 --- a/target/arm/kvm.c +++ b/target/arm/kvm.c @@ -529,9 +529,51 @@ int kvm_arch_get_default_type(MachineState *ms) return fixed_ipa ?
0 : size; } =20 +static bool kvm_arm_set_vm_attr(struct kvm_device_attr *attr, const char *= name) +{ + int err; + + err =3D kvm_vm_ioctl(kvm_state, KVM_HAS_DEVICE_ATTR, attr); + if (err !=3D 0) { + error_report("%s: KVM_HAS_DEVICE_ATTR: %s", name, strerror(-err)); + return false; + } + + err =3D kvm_vm_ioctl(kvm_state, KVM_SET_DEVICE_ATTR, attr); + if (err !=3D 0) { + error_report("%s: KVM_SET_DEVICE_ATTR: %s", name, strerror(-err)); + return false; + } + + return true; +} + +int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction) +{ + struct kvm_smccc_filter filter =3D { + .base =3D func, + .nr_functions =3D 1, + .action =3D faction, + }; + struct kvm_device_attr attr =3D { + .group =3D KVM_ARM_VM_SMCCC_CTRL, + .attr =3D KVM_ARM_VM_SMCCC_FILTER, + .flags =3D 0, + .addr =3D (uintptr_t)&filter, + }; + + if (!kvm_arm_set_vm_attr(&attr, "SMCCC Filter")) { + error_report("failed to set SMCCC filter in KVM Host"); + return -1; + } + + return 0; +} + int kvm_arch_init(MachineState *ms, KVMState *s) { int ret =3D 0; + /* For ARM interrupt delivery is always asynchronous, * whether we are using an in-kernel VGIC or not. */ @@ -594,6 +636,22 @@ int kvm_arch_init(MachineState *ms, KVMState *s) hw_breakpoints =3D g_array_sized_new(true, true, sizeof(HWBreakpoint), max_hw_bps); =20 + /* + * To be able to handle PSCI CPU ON calls in QEMU, we need to install = SMCCC + * filter in the Host KVM. This is required to support features like + * virtual CPU Hotplug on ARM platforms. 
+ */ + if (kvm_arm_set_smccc_filter(PSCI_0_2_FN64_CPU_ON, + KVM_SMCCC_FILTER_FWD_TO_USER)) { + error_report("CPU On PSCI-to-user-space fwd filter install failed"= ); + abort(); + } + if (kvm_arm_set_smccc_filter(PSCI_0_2_FN_CPU_OFF, + KVM_SMCCC_FILTER_FWD_TO_USER)) { + error_report("CPU Off PSCI-to-user-space fwd filter install failed= "); + abort(); + } + return ret; } =20 @@ -1440,6 +1498,38 @@ static bool kvm_arm_handle_debug(ARMCPU *cpu, return false; } =20 +static int kvm_arm_handle_hypercall(CPUState *cs, struct kvm_run *run) +{ + ARMCPU *cpu =3D ARM_CPU(cs); + CPUARMState *env =3D &cpu->env; + + kvm_cpu_synchronize_state(cs); + + /* + * Hard-code the immediate to 0, as a non-zero value is not expected as of + * now. This might change in future versions, in which case KVM_GET_ONE_REG + * could be used, but it must first be enhanced so that synchronize also + * fetches the ESR_EL2 value. + */ + if (run->hypercall.flags =3D=3D KVM_HYPERCALL_EXIT_SMC) { + cs->exception_index =3D EXCP_SMC; + env->exception.syndrome =3D syn_aa64_smc(0); + } else { + cs->exception_index =3D EXCP_HVC; + env->exception.syndrome =3D syn_aa64_hvc(0); + } + env->exception.target_el =3D 1; + bql_lock(); + arm_cpu_do_interrupt(cs); + bql_unlock(); + + /* + * For PSCI, exit the kvm_run loop and process the work. Especially + * important if this was a CPU_OFF command and we can't return to the = guest.
+ */ + return EXCP_INTERRUPT; +} + int kvm_arch_handle_exit(CPUState *cs, struct kvm_run *run) { ARMCPU *cpu =3D ARM_CPU(cs); @@ -1456,6 +1546,9 @@ int kvm_arch_handle_exit(CPUState *cs, struct kvm_run= *run) ret =3D kvm_arm_handle_dabt_nisv(cpu, run->arm_nisv.esr_iss, run->arm_nisv.fault_ipa); break; + case KVM_EXIT_HYPERCALL: + ret =3D kvm_arm_handle_hypercall(cs, run); + break; default: qemu_log_mask(LOG_UNIMP, "%s: un-handled exit reason %d\n", __func__, run->exit_reason); diff --git a/target/arm/kvm_arm.h b/target/arm/kvm_arm.h index ec9dc95ee8..bb2dfde3af 100644 --- a/target/arm/kvm_arm.h +++ b/target/arm/kvm_arm.h @@ -216,6 +216,15 @@ bool kvm_arm_mte_supported(void); * Returns true if KVM can enable EL2 and false otherwise. */ bool kvm_arm_el2_supported(void); + +/** + * kvm_arm_set_smccc_filter + * @func: SMCCC function ID the filter applies to + * @faction: SMCCC filter action (handle, deny, fwd-to-user) to be deployed + * + * Sets the Arm SMCCC filter in the host KVM for selective hypercall exits. + */ +int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction); #else =20 static inline bool kvm_arm_aarch32_supported(void) @@ -242,6 +251,11 @@ static inline bool kvm_arm_el2_supported(void) { return false; } + +static inline int kvm_arm_set_smccc_filter(uint64_t func, uint8_t faction) +{ + g_assert_not_reached(); +} #endif =20 /** diff --git a/target/arm/meson.build b/target/arm/meson.build index 07d9271aa4..ae4e75c4a9 100644 --- a/target/arm/meson.build +++ b/target/arm/meson.build @@ -15,6 +15,7 @@ arm_system_ss.add(files( )) arm_system_ss.add(when: 'CONFIG_KVM', if_true: files('hyp_gdbstub.c', 'kvm= .c')) arm_system_ss.add(when: 'CONFIG_HVF', if_true: files('hyp_gdbstub.c')) +arm_system_ss.add(files('psci.c')) =20 arm_user_ss =3D ss.source_set() arm_user_ss.add(files('cpu.c')) diff --git a/target/arm/tcg/psci.c b/target/arm/psci.c similarity index 96% rename from target/arm/tcg/psci.c rename to target/arm/psci.c index cabed43e8a..fbd2bd2d6f 100644 --- a/target/arm/tcg/psci.c +++
b/target/arm/psci.c @@ -21,10 +21,13 @@ #include "exec/helper-proto.h" #include "kvm-consts.h" #include "qemu/main-loop.h" +#include "qemu/error-report.h" #include "system/runstate.h" +#include "system/tcg.h" #include "internals.h" #include "arm-powerctl.h" #include "target/arm/multiprocessing.h" +#include "exec/target_long.h" =20 bool arm_is_psci_call(ARMCPU *cpu, int excp_type) { @@ -158,6 +161,11 @@ void arm_handle_psci_call(ARMCPU *cpu) case QEMU_PSCI_0_1_FN_CPU_SUSPEND: case QEMU_PSCI_0_2_FN_CPU_SUSPEND: case QEMU_PSCI_0_2_FN64_CPU_SUSPEND: + if (!tcg_enabled()) { + warn_report("CPU suspend not supported in non-tcg mode"); + break; + } +#ifdef CONFIG_TCG /* Affinity levels are not supported in QEMU */ if (param[1] & 0xfffe0000) { ret =3D QEMU_PSCI_RET_INVALID_PARAMS; @@ -170,6 +178,7 @@ void arm_handle_psci_call(ARMCPU *cpu) env->regs[0] =3D 0; } helper_wfi(env, 4); +#endif break; case QEMU_PSCI_1_0_FN_PSCI_FEATURES: switch (param[1]) { diff --git a/target/arm/tcg/meson.build b/target/arm/tcg/meson.build index 895facdc30..f4d8db0f79 100644 --- a/target/arm/tcg/meson.build +++ b/target/arm/tcg/meson.build @@ -49,10 +49,6 @@ arm_ss.add(when: 'TARGET_AARCH64', if_true: files( 'sve_helper.c', )) =20 -arm_system_ss.add(files( - 'psci.c', -)) - arm_system_ss.add(when: 'CONFIG_ARM_V7M', if_true: files('cpu-v7m.c')) arm_user_ss.add(when: 'TARGET_AARCH64', if_false: files('cpu-v7m.c')) =20 --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281217; cv=none; d=zohomail.com; s=zohoarc; b=KephImRL2lnbt17tLMoRhJgRn+GsizD4P04U2+UcjG0HoVXko9w3kXdfaFp+I5nFgDu2+lFSC/G+n/CPFb5/ha8L9IbigLXluEcdSjbEfkbTVoFlwfUhdUZXbcqSV6TpIWp4Sz9fBt1Dw2UH6JUT1VWGN3KxEzuLFREm8OjHzHE= ARC-Message-Signature: i=1; a=rsa-sha256; 
c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281217; h=Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=QA9K9skWXs/r/1oOXj8qZ7/XkHpNtPZnS1Ie9dQypSg=; b=h+9ksjwvVCcQv1wBgmJltuiof2SPwPPos3kqYfS73KBdB8LKpZQNDwfBYtPAXuLMlTQgxy3In6yT9R98VlMg781yByEY6iJ4eMm4NpAIxx8fR70t3DT8h4L58m0KKqtC6aa0D9iaY0AmLRkmzHHwWV6/f+4k9XyM7w+yqXFMAwU= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281217454714.0614008207582; Tue, 30 Sep 2025 18:13:37 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGR-0005iD-BV; Tue, 30 Sep 2025 21:04:15 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lFy-0005UQ-QD for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:48 -0400 Received: from mail-wr1-x42b.google.com ([2a00:1450:4864:20::42b]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFG-0008Bn-A5 for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:46 -0400 Received: by mail-wr1-x42b.google.com with SMTP id ffacd0b85a97d-3ecdf2b1751so4399793f8f.0 for ; Tue, 30 Sep 2025 18:03:01 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:55 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; 
s=google; t=1759280577; x=1759885377; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=QA9K9skWXs/r/1oOXj8qZ7/XkHpNtPZnS1Ie9dQypSg=; b=AAlGIlWniK4N1vwxtk3cvsXVPNMXZIYId/FV9PPvC50x2hC2KpL4HkgmA9Jo3wW+Ec sskeVjDbhMF1jfo3bbDGtgBoufnxHarEHAcvOwNHVp10usuQfFFXLDy317wndCNqV54L wA/novoAuHDVaEbIqgazVuUJJyRnsW/Ez3ceXq76MrLKF+WejrGV49i8FQPAXlw1Xe0c oAu4gL+JPkpjLJbXdb/dNybJ2OR491PPVVLFm/OMjmKdpl9RUkOsRcRPttXdT6l7yd6Y k72XF2wRCWHXE0F1mXCpvxWRWaIGmMeajpR5dNbpGdxui6TAbzCalyMsJkTtlXrPgNlE pcuQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280577; x=1759885377; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=QA9K9skWXs/r/1oOXj8qZ7/XkHpNtPZnS1Ie9dQypSg=; b=TE9YY01Gc+ffLzE9v1QlltOPWjVpLw7Smb6fHRO+YSp1WUWghKqdJYakOXCw9JSWF3 XNtQZsJ9Hk/yFqvGW/3L6X2PXAAkvHE7IKjHdpLT2h4s9ThVjKvCr/LdmZbUMFZNN3QE yehVqqMx7JLdi5/lfOmXF/HlmDSacYMLprGtpVbMEOYoswEuy1LDlSBi81eNN2ROdWNP C6pb2wl1qRSKJIb3sbVZCtyrpPazKOhqmJG1Izb1xcb4Blnpq75oK8tOwmkNV5PZM5zU dO3T3DpnkGOC0t7zcxJhM0wKvookddFyvaNZ/Xe1lYCHtZviNzB62d+JmxJisgQIrEKW X3DA== X-Gm-Message-State: AOJu0Yxqj/Z8xIEzzgNQdeNSGEgYl4wA0zN8YJnjWZALUR/MwwFyD1HL dsgWFAHL2nju1igRlx8/ctlki+9ULUdKRmqUxpZX9fX0OEDxTpZK2xjwSEAOezM6K0bgeDNRvZ5 TG3SWYICArQ== X-Gm-Gg: ASbGncukz2AYdk66B2aXVH6gt5AYj7e58GBpbADV6wWmQmWnLl6fvBJN6odmLigev1N sYu/1/5vtWOKe5kGG2wNmgm/T7U04xxpzGYmaxXB2CxdWr5KVnAAjFTtx5SSV7ERYwDwvNXB1j+ fU11bDXWNJsEgwuXLGlvYormshD0soR4BAPAzySR7t5QsHkJgzdmR1BO/CxGw3CGINCDN6g2Syh dBIp86xTPvRI8dCkfXqAnSCttYJwL330xnBkmjkvEWP/nQPAQke02rlShUOQEDBYMaigZ5FLq+e jGCFr/AxLg10juBzRP5LDpO9QL6zyQRAafYe4AiEBpXGzm5rAglF2loAmf6xlbkp0N/BPF127/0 PYYx2wlz58/Mtf2UMRtY+dZQZ7/0QeSqMqMtix6+EwCVI0LTkqahbhCp6r64mtG9f5iGdmiNujK 5ZxZFG2OJNWPiBkqP1YpvYNVXXfuh1G1n/+7wk+uFS+3M= X-Google-Smtp-Source: 
AGHT+IFThuu0ki8xLIEofWJj296f6f32+rIbtRYw4hrXRdchJwH+Ge/B0avOu62ues1Sp8GocB88Qg== X-Received: by 2002:a05:6000:2c01:b0:3ec:b899:bc39 with SMTP id ffacd0b85a97d-42557a1b40dmr1065274f8f.58.1759280577175; Tue, 30 Sep 2025 18:02:57 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 19/24] target/arm/cpu: Add the Accessor hook to fetch ARM CPU arch-id Date: Wed, 1 Oct 2025 01:01:22 +0000 Message-Id: <20251001010127.3092631-20-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42b; 
envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42b.google.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281219657116600 Content-Type: text/plain; charset="utf-8" From: Salil Mehta The ACPI callbacks 'acpi_cpu_{device_check,eject_request}_cb()' use the 'get_cpu_status()' API to get the existing 'AcpiCpuOspmStateStatus' of the CPU being onlined or offlined after the VM has initialized. The latter uses `CPUClass::get_arch_id` to match the CPU. Hence, we must add an ARM CPU architecture-specific accessor hook to fetch the `mp-affinity` programmed in the KVM host.
Signed-off-by: Salil Mehta --- target/arm/cpu.c | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/target/arm/cpu.c b/target/arm/cpu.c index 0ceaf69092..d147e786c1 100644 --- a/target/arm/cpu.c +++ b/target/arm/cpu.c @@ -2744,6 +2744,11 @@ static const TCGCPUOps arm_tcg_ops =3D { }; #endif /* CONFIG_TCG */ =20 +static int64_t arm_cpu_get_arch_id(CPUState *cs) +{ + return arm_cpu_mp_affinity(ARM_CPU(cs)); +} + static void arm_cpu_class_init(ObjectClass *oc, const void *data) { ARMCPUClass *acc =3D ARM_CPU_CLASS(oc); @@ -2763,6 +2768,7 @@ static void arm_cpu_class_init(ObjectClass *oc, const= void *data) cc->dump_state =3D arm_cpu_dump_state; cc->set_pc =3D arm_cpu_set_pc; cc->get_pc =3D arm_cpu_get_pc; + cc->get_arch_id =3D arm_cpu_get_arch_id; cc->gdb_read_register =3D arm_cpu_gdb_read_register; cc->gdb_write_register =3D arm_cpu_gdb_write_register; #ifndef CONFIG_USER_ONLY --=20 2.34.1 From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281345; cv=none; d=zohomail.com; s=zohoarc; b=cQpqTqw7LYXn4xv2fnjJhffS5qZjbcQOlWtlp/snl5MLia+5V1gdj509nMZ5Sdo1gCW0rOw7iiaOzqxskwaPYOZAd163SoIqecmTmVAkqxxsOtk21HE2P9oCwtZn70JUfSziNMtr33QtjEOR6gTdG+yS9f+m5gUDir22YpYOgEI= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281345; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=dsPhyqsNOSQe1Rkbd7aGs1Z/OQwNtlVBiwoEPzFphoY=; b=b892lAE7uZjKPf9XizcvTtssySjz9enGJVHE+XuWPzPCTLeE7dCx7GZjehYu1mbmwpPowDVZVqU0BX+Jj2zj7Sf7rTDs+HhEiTe1vEuFe60gDOjAVWPiKmTNWf4jnFiv0MuQ/ckmPOyHEPjgBkztqSL/wRk8qgHjSfh0FbicEAY= 
ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281345342907.7498590225756; Tue, 30 Sep 2025 18:15:45 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGu-0005yI-Pk; Tue, 30 Sep 2025 21:04:46 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lGB-0005Xp-8N for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:04:01 -0400 Received: from mail-wr1-x42f.google.com ([2a00:1450:4864:20::42f]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFL-0008DN-Ks for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:03:57 -0400 Received: by mail-wr1-x42f.google.com with SMTP id ffacd0b85a97d-3ee64bc6b90so4136493f8f.0 for ; Tue, 30 Sep 2025 18:03:04 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.57 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:02:58 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280579; x=1759885379; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=dsPhyqsNOSQe1Rkbd7aGs1Z/OQwNtlVBiwoEPzFphoY=; b=MibP2eq1HI3/8IVCnbJ1vo0rbv6H7H+IvqSs5WWP25E3Cby6kd9olBx6A89tV3Vq/5 Dp1hUVNRq1/jwpgLTHu69IM6Kh1efi7BHMzQfLhUVLQBO9e9aNIIF54GXy+6hOmFP66d FFmLlNM5vkgZUrd5TaqWA7a3IfRz2H8dlVa7o4SDMZmq1gvarYYRS1yFBTuD9pr49dQq 
fqzftFk4jdsDDy3ETNdVll09FgFqzl0+10w9RLBU8WSlonApD1X/Z3WnwkexxK+T4PIA p/O8tZbg8VjIA91z8o45zfBlG0mcW7zDeiLSpEBs5EWSBw/s6ETcGti+6Cri8RH47of3 +B0g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280579; x=1759885379; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=dsPhyqsNOSQe1Rkbd7aGs1Z/OQwNtlVBiwoEPzFphoY=; b=MfWvlw4H4Cvk5TV0ltopCjmuKehhv2aLncfNIaV8EuKqwjilwjSRkqyyItgKPtrFeA xs6soG9QwYX/Q1acd6wvDakHmzmk2Qv3G6Cise3piGtCu2xDuwoYT0jZCik30NVIsnZd 7AXHR3QZpo7wt5k42sDSBaydBg/YstDANSQ+gjp/IsjqOlm7U350BQ5YJWjM0Gu6coRr cnEz3tkAnYP/tMuQj0nlZqaRxgxAX5ex0AUrWmC49imBzqbCkIUmmg+awJoYrdEgo2Te X8NGQeZ4Wk/sHxGBz6zsSEz7MNxgE91n7H1+4QjsFofOVjBmkRYejHGq8WnF1g435md3 HH1g== X-Gm-Message-State: AOJu0Yzz6my2CkllZO6Co91tdwcvnTfgTV+luTYidMcEpHmEAeLRJGZD qBhMzebErel/V81QxxOHJMRHHN3IlvWj6EzFZpx1y/da2tUQvLUD0nJPITgoxgqE1pbI6VJL+ya /CrgC7jCfhQ== X-Gm-Gg: ASbGncsSOR9LVxZWL5PS13cYOb1aUVrnuez30Na5q0027hDyoPso0TuAK8cPn+ydl4g EfDDWId8/w+/3U3H1o4TJOHqKKU+yqE4AmEwPWjU2wTxiYJENE7Q+mW2ITV/Ys7dmqHNskZR9kb wXoxkXgt8ujFvDr9GwOJOG2kkvOOgsCMdbJPUKOUaWRqQrnUaIKSOHGcvOazw6aVP+LEycnMaGM OPYtQ/EErXr6sZ4XuJQ/W9sFy1HDz9Oy2nPHLAXBOVg8OL7G2GbD95qxLJfW9auyAdmSCsezmRh tvRfFL0AHelpp0ASZZOEpXZE9OzFLlZbjjSp9vSB+bLbFsD0jWC+7X9H/zb2LzWW/sLaD/ZH1d8 1rasH62CBRnLNSEZU2xRnwNQ1XxKGFREnBs5NFLq6zaHiOBRF0wv6RyEkUddj/71p/BT6RnQFdW 66/Ft6Y+judERhbBPIM0YJdOvNAKGMUs/MV2ObeLD07Lk= X-Google-Smtp-Source: AGHT+IFtB5MGJPxZc7LFdAIJwiPBFhGpfES9vziFBKWvvIiG64aaUix3NmQGnEMTBXCcgGWjjjk8RA== X-Received: by 2002:a05:6000:2284:b0:3ee:154e:504 with SMTP id ffacd0b85a97d-425577f1c60mr1102015f8f.19.1759280578924; Tue, 30 Sep 2025 18:02:58 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, 
richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 20/24] target/arm/kvm: Write vCPU's state back to KVM on cold-reset Date: Wed, 1 Oct 2025 01:01:23 +0000 Message-Id: <20251001010127.3092631-21-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::42f; envelope-from=salil.mehta@opnsrc.net; helo=mail-wr1-x42f.google.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , 
List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281347585116600

From: Jean-Philippe Brucker

Previously, all `PSCI_CPU_{ON, OFF}` calls were handled directly by KVM.
With the introduction of this vCPU hotplug-like feature, these hypervisor
calls are now trapped to QEMU for policy checks. This shift can lead to
inconsistent vCPU states between KVM and QEMU, particularly when the vCPU
has recently been administratively enabled and is transitioning either from
the unparked state in QOM (due to 'lazy realization') or from the
'powered-off' state.

Therefore, it is crucial to synchronize the vCPU state with KVM, especially
in the context of a cold reset of the QOM vCPU. The same applies when PSCI
CPU_OFF is handled by QEMU: it must ensure that the KVM vCPUs are powered
off as well.

To ensure this synchronization, mark the QOM vCPU as "dirty" to trigger a
call to `kvm_arch_put_registers()`. This guarantees that KVM's `MP_STATE`
is updated accordingly, forcing synchronization of the `mp_state` between
QEMU and KVM.
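For reviewers, the effect of the dirty flag can be pictured with a small
standalone model. This is only a sketch under stated assumptions: `toy_vcpu`,
`toy_set_cpu_off()`, `toy_reset_vcpu()` and `toy_sync_before_run()` are
hypothetical names, not QEMU or KVM APIs, and the write-back is a plain
assignment standing in for what `kvm_arch_put_registers()` would do.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical power states, standing in for PSCI on/off. */
enum toy_power { TOY_POWER_ON, TOY_POWER_OFF };

/* Toy vCPU: a userspace view plus the kernel-side copy it must match. */
struct toy_vcpu {
    enum toy_power user_state;   /* what userspace believes            */
    enum toy_power kernel_state; /* what the hypervisor currently holds */
    bool dirty;                  /* userspace view needs writing back   */
    int writebacks;              /* number of write-backs performed     */
};

/* Userspace handles CPU_OFF and flags the state for write-back. */
static void toy_set_cpu_off(struct toy_vcpu *v)
{
    v->user_state = TOY_POWER_OFF;
    v->dirty = true;
}

/* Userspace cold-resets the vCPU; the flag forces a write-back. */
static void toy_reset_vcpu(struct toy_vcpu *v)
{
    v->user_state = TOY_POWER_ON;
    v->dirty = true;
}

/* Called before re-entering the guest: push state only when flagged. */
static void toy_sync_before_run(struct toy_vcpu *v)
{
    if (!v->dirty) {
        return;
    }
    v->kernel_state = v->user_state;
    v->writebacks++;
    v->dirty = false;
}
```

Without the flag, `toy_sync_before_run()` would skip the write-back and the two
views would stay out of step, which is the inconsistency described above.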
Signed-off-by: Jean-Philippe Brucker
Signed-off-by: Salil Mehta
---
 target/arm/arm-powerctl.c | 1 +
 target/arm/kvm.c          | 7 +++++++
 2 files changed, 8 insertions(+)

diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index ab4422b261..89074918a9 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -263,6 +263,7 @@ static void arm_set_cpu_off_async_work(CPUState *target_cpu_state,
 
     assert(bql_locked());
     target_cpu->power_state = PSCI_OFF;
+    target_cpu_state->vcpu_dirty = true;
     target_cpu_state->halted = 1;
     target_cpu_state->exception_index = EXCP_HLT;
 }
diff --git a/target/arm/kvm.c b/target/arm/kvm.c
index 98eb6db9ed..c4b68a0b17 100644
--- a/target/arm/kvm.c
+++ b/target/arm/kvm.c
@@ -1026,6 +1026,7 @@ bool kvm_arm_cpu_post_load(ARMCPU *cpu)
 void kvm_arm_reset_vcpu(ARMCPU *cpu)
 {
     int ret;
+    CPUState *cs = CPU(cpu);
 
     /* Re-init VCPU so that all registers are set to
      * their respective reset values.
@@ -1047,6 +1048,12 @@ void kvm_arm_reset_vcpu(ARMCPU *cpu)
      * for the same reason we do so in kvm_arch_get_registers().
      */
     write_list_to_cpustate(cpu);
+
+    /*
+     * Ensure we call kvm_arch_put_registers(). The vCPU isn't marked dirty if
+     * it was parked in KVM and is now booting from a PSCI CPU_ON call.
+     */
+    cs->vcpu_dirty = true;
 }
 
 void kvm_arm_create_host_vcpu(ARMCPU *cpu)
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org ARC-Seal: i=1; a=rsa-sha256; t=1759281036; cv=none; d=zohomail.com; s=zohoarc; b=RcBW6yrFzS1Ga8ZOJ+an8eZR+mXefmiWwI+PIPQR8Cy7xFGq7Muo/6190aAEfcr5iR0vhq4/nqUsposS77f5Mw62iSKz3LT0oKaqrmhLTLrETfN5fQZhWiUzAWypSh9BVgO/k/QlgMTbzpNTmu5b4WoeAvV5+0uGixn7f4blnRs= ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=zohomail.com; s=zohoarc; t=1759281036; h=Content-Type:Content-Transfer-Encoding:Cc:Cc:Date:Date:From:From:In-Reply-To:List-Subscribe:List-Post:List-Id:List-Archive:List-Help:List-Unsubscribe:MIME-Version:Message-ID:References:Sender:Subject:Subject:To:To:Message-Id:Reply-To; bh=LkorhTYVdtNEc2JK9JCYJPvsAtCb3dSBOdMaKNrFs7Y=; b=Xe36xZcAEvRSnNpIlgJls9se87jCWyiAbddaH0Uq3QnNb4JyrsePcxDK5ocwDQzRF8hNEFNGwRBwfpbwRvfI9hEe5CTUH4lGYxh4KLFPCZYJEgLYbTjS/+WiSJ04lMcUDEUnsu1YLEjiurkhvCxNxfTusxremZiycS0SAcIivPc= ARC-Authentication-Results: i=1; mx.zohomail.com; dkim=pass; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281036492169.17723667307996; Tue, 30 Sep 2025 18:10:36 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lGz-00064U-65; Tue, 30 Sep 2025 21:04:49 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lGL-0005e5-Fa for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:04:09 -0400 Received: from 
mail-wm1-x330.google.com ([2a00:1450:4864:20::330]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFN-0008DL-9m for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:04:08 -0400 Received: by mail-wm1-x330.google.com with SMTP id 5b1f17b1804b1-46e4ad36541so41966955e9.0 for ; Tue, 30 Sep 2025 18:03:03 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.02.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:03:00 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280581; x=1759885381; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=LkorhTYVdtNEc2JK9JCYJPvsAtCb3dSBOdMaKNrFs7Y=; b=c/IBDRZdX2QyTxvPUieuUL1mGP6FJHpv0VtlQwU4gLSkD/DqR+0t6Du5lRGXf9MOQ3 yQqG6W6msiTNi7HOseCvKC5qmNIhC6HC0Qfr/ywrIH9ttr7wB+BHLSjtKJqx0z6cFdWJ JwNRkYyDLi9vRVsJStmgErvssviuF38UaYgPU+hkOh/xZMRy4Lo2eSdO0JWMfuUtSNPP keP5pvwrk9WGP4i25niOj4//HhYinjoOUeqcvtQk9PMio9OEC8UdbLtIhzXtmQRi9HvI +/7mgRjkVj4gdwn8DUHIeefzHfhOQUYn9rH1JDTd/R7AQNHVKbrGs2bwVVFLqizsa1el SJ5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280581; x=1759885381; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LkorhTYVdtNEc2JK9JCYJPvsAtCb3dSBOdMaKNrFs7Y=; b=riRRtKeyD0PwIdYNzRRCthDA3Sk+/ZWVEEFRQ5D1ga2cDK9K68tEm3Sh43ysx9bbt6 Mn2vD7ycBleY2IZGhagn3uI4E/5Vlp1GNVoea0jp2MjHKLcDzq/LNAylcRraJipzqtqY a/n2v22EMaaXMSd2GL3Jw3F0Kc7jOthx5deK2GyBSL7UAhSgGCTqmFnCi+McLvhE1T2p sKVW/vA9cdEYsSoOQMA2tVN99yrJFJvCTAk+sVUexblyoCCef5+nd9Zi8Qc9obBKCsbD RMsO+rb2Q736WIDWZaXGuQ9JLgpPoOEYoXBSYmVMlYNkmnGalgbSBcUQUtQd+cl2KqT3 fm8A== X-Gm-Message-State: 
AOJu0YwKK49XG+846LJX1+18OTpNe/g9J5DpdgDz3GEsyKRcAo/oLpkY +HDtrZOCF7yjgCVESWSkIXVNoQnj9JU7jwE0fLdde0FRoGkLItY3pWLOp+rbZstSb/39MpCJUeF l2SoaHFFOoQ== X-Gm-Gg: ASbGncts3hxr50RuCEpthovg24x0jct3qFC2fmpSoN15cjlkHgwI7/vAtAMcQtF7Sti LRfwFlwzrJfsRZ7C1S58pmotBUm7fAm8dYeygxCmaU77vPzRCcUNkLOzzh8tTKPLbi+qsCiGOAR QAbnE607DdJDECsxtDluFyOPNMeLVh/+2U2D/4rUJm1u40ilCgH0JABaTHkJ4QoqcpVNw+WR3Dw 4Bu1vBE0szpzaYMB9lhlZdZWkmzzf61Hbv+AQA/9gsIJq+oEjHPeEXNGME/x0CoSfmgXWuIBJgD tUD+9ZNJAW6yK+G1G+OBHzk1nA+FMU67a9ZRrUdLPmRxZbKev21ZdWTLUcrxeRb9vDFgGnmCfL/ lcpE29IyYLyahzE8ry0Ny2rJptoSMUIRU0doWeNMYmnOn7SiCVT9WitMSUiT8A/OEGEs97Lk8/W /k5FLkfk9Xdu1DIUocWpXmSRlaJcWp2dYxMsTLaMIIYfo= X-Google-Smtp-Source: AGHT+IGtiMfNDTwBfFlWDk5u04zD0uoanejP/L9zghp74Y9gIVBsMBao7ceLjywLt+u/Os5JnWjkgg== X-Received: by 2002:a05:600c:198b:b0:45b:80ff:58f7 with SMTP id 5b1f17b1804b1-46e612e57c3mr13281575e9.36.1759280580648; Tue, 30 Sep 2025 18:03:00 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com 
Subject: [PATCH RFC V6 21/24] hw/intc/arm-gicv3-kvm: Pause all vCPUs & cache ICC_CTLR_EL1 for userspace PSCI CPU_ON Date: Wed, 1 Oct 2025 01:01:24 +0000 Message-Id: <20251001010127.3092631-22-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::330; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x330.google.com X-Spam_score_int: -16 X-Spam_score: -1.7 X-Spam_bar: - X-Spam_report: (-1.7 / 5.0 requ) BAYES_00=-1.9, DKIM_INVALID=0.1, DKIM_SIGNED=0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=no autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: pass (identity @opnsrc.net) X-ZM-MESSAGEID: 1759281037041116600

From: Salil Mehta

Problem:
========
When PSCI CPU_ON was handled entirely in KVM, the operation executed under
VGIC/KVM locks at EL2 and appeared atomic to other vCPU threads (intermediate
states were not observable). With the SMCCC forward-to-userspace filter
enabled, PSCI ON/OFF calls now exit to QEMU, where policy checks are performed.

In the userspace CPU_ON handling (during cpu_reset?), QEMU must perform IOCTLs
to fetch ICC_CTLR_EL1 fields that reflect supported features and IRQ-related
configuration (e.g. EOImode, PMHE, CBPR).
While these IOCTLs are in flight, other vCPUs can run and cause transient
inconsistency. KVM enforces atomicity by trying to take all vCPU locks
(kvm_trylock_all_vcpus() -> -EBUSY). QEMU therefore pauses all vCPUs before
issuing these IOCTLs to avoid contending for locks and to prevent -EBUSY
failures during cpu_reset.

KVM Details: (As I understand and stand ready to be corrected! :))

A userspace fetch of the ICC_CTLR_EL1 sysreg results in an access to the
ICH_VMCR_EL2 register. VMCR is per-vCPU and controls the CPU interface.
Pending state is recorded in the distributor for SPIs, and in each
redistributor for SGIs and PPIs. Delivery to the PE depends on the CPU
interface configuration (VMCR fields such as PMR, IGRPEN, EOImode, BPR).
Updates to VMCR must therefore be applied atomically with respect to
interrupt injection and deactivation. The KVM ioctl layer first attempts to
lock all vCPU mutexes, and only then takes the VM lock before calling
vgic_v3_attr_regs_access(). This ordering serializes userspace accesses with
IRQ handling (IAR/EOI and SGI delivery?).

ICC_CTLR_EL1 initially reflects architectural defaults (e.g. EOImode, PMR).
Most fields are read-only feature indicators that never change. Writable
fields such as EOImode, PMHE and CBPR are configured once by the guest GICv3
driver and then remain pseudo-static. Both the initial defaults and the
guest-configured values can be cached and reused across resets, avoiding
repeated VM-wide pauses to fetch ICC_CTLR_EL1 from KVM on every cpu_reset().
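To make the split between read-only feature fields and guest-writable fields
concrete, here is a minimal sketch of how a raw ICC_CTLR_EL1 value could be
decoded. The bit positions follow the register layout in the Arm GIC
architecture specification; the macro and function names (`TOY_CTLR_*`,
`toy_ctlr_*()`) are illustrative assumptions and are not the names QEMU uses.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Single-bit fields (illustrative names). */
#define TOY_CTLR_CBPR      (1ULL << 0)   /* guest-writable              */
#define TOY_CTLR_EOIMODE   (1ULL << 1)   /* guest-writable              */
#define TOY_CTLR_PMHE      (1ULL << 6)   /* writable, impl-defined      */
#define TOY_CTLR_SEIS      (1ULL << 14)  /* read-only feature bit       */
#define TOY_CTLR_A3V       (1ULL << 15)  /* read-only feature bit       */
#define TOY_CTLR_RSS       (1ULL << 18)  /* read-only feature bit       */
#define TOY_CTLR_EXTRANGE  (1ULL << 19)  /* read-only feature bit       */

/* Three-bit fields. */
#define TOY_CTLR_PRIBITS_SHIFT 8
#define TOY_CTLR_IDBITS_SHIFT  11
#define TOY_CTLR_FIELD3_MASK   0x7ULL

/* Bits the guest driver may change after boot. */
#define TOY_CTLR_GUEST_WRITABLE \
    (TOY_CTLR_CBPR | TOY_CTLR_EOIMODE | TOY_CTLR_PMHE)

/* Extract the raw PRIbits field, bits [10:8]. */
static unsigned toy_ctlr_pribits(uint64_t v)
{
    return (unsigned)((v >> TOY_CTLR_PRIBITS_SHIFT) & TOY_CTLR_FIELD3_MASK);
}

/* Extract the raw IDbits field, bits [13:11]. */
static unsigned toy_ctlr_idbits(uint64_t v)
{
    return (unsigned)((v >> TOY_CTLR_IDBITS_SHIFT) & TOY_CTLR_FIELD3_MASK);
}

/* True when two snapshots differ only in guest-writable bits. */
static bool toy_ctlr_same_features(uint64_t a, uint64_t b)
{
    return ((a ^ b) & ~TOY_CTLR_GUEST_WRITABLE) == 0;
}
```

A check such as `toy_ctlr_same_features()` is one way a cached value could be
sanity-tested against a fresh read; the patch itself does not do this.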
Appendix: ICC_CTLR_EL1 layout (for reviewers)
=============================================

ICC_CTLR_EL1 [63:0]

  Bits     Field     Abbrev
  -------  --------  ------
  [63:20]  RES0
  [19]     ExtRange  Ex
  [18]     RSS       RS
  [17:16]  RES0
  [15]     A3V       A3
  [14]     SEIS      SE
  [13:11]  IDbits
  [10:8]   PRIbits
  [7]      RES0      R0
  [6]      PMHE      PM
  [5:2]    RES0
  [1]      EOImode   EO
  [0]      CBPR      CB

Access: {Ex, RS, A3, SE, IDbits, PRIbits} = RO; {PMHE} = RW*;
        {EO, CB} = RW**; others = RES0.
Notes : *  impl-def (may be RO when DS=0)
        ** CB may be RO when DS=0 (EO stays RW)

Source: Arm GIC Architecture Specification (IHI 0069H.b),
        §12.2.6 "ICC_CTLR_EL1", pp. 12-233 to 12-237

Resets that may trigger ICC_CTLR_EL1 fetch include:
 1. PSCI CPU_ON
 2. qemu_system_reset() during full VM reset
 3. Post-load path on migration
 4. Lazy realization via device_set/-deviceset

It can be expensive to pause the entire VM just to reset one vCPU, especially
for long-lived workloads where hundreds of resets may occur. For such systems,
frequent VM-wide pauses are unacceptable.

Solution:
=========
This patch caches ICC_CTLR_EL1 early, seeding it either from architectural
defaults or on the first PSCI CPU_ON when the guest GICv3 driver has
initialized the interface. The cached value is then reused on every
cpu_reset(), avoiding repeated VM-wide pauses and heavy IOCTLs. The IOCTL path
is retained only as a fallback if the cached shadow is not valid.
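The cache-then-fallback idea can be sketched independently of QEMU as follows.
This is a simplified model, not the patch's code: `toy_shadow`,
`toy_shadow_get()` and the `toy_flaky` reader are hypothetical, the slow reader
stands in for the device-attribute IOCTL, and the point where the real code
would pause and resume all vCPUs is only marked by a comment.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical slow reader: returns 0 and fills *val, or a negative errno. */
typedef int (*toy_slow_read_fn)(void *opaque, uint64_t *val);

/* Shadow copy of a register plus a validity flag. */
struct toy_shadow {
    uint64_t val;
    bool valid;
};

/*
 * Return the cached value when valid. Otherwise take the slow path, retrying
 * a bounded number of times while the reader reports -EBUSY, and seed the
 * cache on success. Returns 0 or a negative errno.
 */
static int toy_shadow_get(struct toy_shadow *s, toy_slow_read_fn rd,
                          void *opaque, int max_tries, uint64_t *out)
{
    int ret = -EBUSY;

    if (s->valid) {
        *out = s->val;          /* fast path: no slow read, no pausing */
        return 0;
    }

    /* A real implementation would pause the other vCPUs around this loop. */
    for (int i = 0; i < max_tries; i++) {
        uint64_t v = 0;

        ret = rd(opaque, &v);
        if (ret == 0) {
            s->val = v;
            s->valid = true;
            *out = v;
            return 0;
        }
        if (ret != -EBUSY) {
            break;              /* not transient: give up immediately */
        }
    }
    return ret;
}

/* Test double: reports -EBUSY a fixed number of times, then succeeds. */
struct toy_flaky {
    int busy_left;
    int calls;
    uint64_t value;
};

static int toy_flaky_read(void *opaque, uint64_t *val)
{
    struct toy_flaky *f = opaque;

    f->calls++;
    if (f->busy_left > 0) {
        f->busy_left--;
        return -EBUSY;
    }
    *val = f->value;
    return 0;
}
```

Unlike the patch, this sketch returns an error to the caller when retries are
exhausted instead of aborting; that is a deliberate simplification.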
Signed-off-by: Salil Mehta
---
 hw/intc/arm_gicv3_kvm.c            | 93 ++++++++++++++++++++++++++++--
 include/hw/intc/arm_gicv3_common.h | 10 ++++
 target/arm/arm-powerctl.c          |  1 +
 target/arm/cpu.h                   |  1 +
 4 files changed, 100 insertions(+), 5 deletions(-)

diff --git a/hw/intc/arm_gicv3_kvm.c b/hw/intc/arm_gicv3_kvm.c
index e97578f59a..62d6016e8a 100644
--- a/hw/intc/arm_gicv3_kvm.c
+++ b/hw/intc/arm_gicv3_kvm.c
@@ -27,6 +27,7 @@
 #include "qemu/module.h"
 #include "system/kvm.h"
 #include "system/runstate.h"
+#include "system/cpus.h"
 #include "kvm_arm.h"
 #include "gicv3_internal.h"
 #include "vgic_common.h"
@@ -681,13 +682,73 @@ static void kvm_arm_gicv3_get(GICv3State *s)
     }
 }
 
+/* Caller must hold the iothread (BQL). */
+static inline void
+kvm_gicc_get_cached_icc_ctlr_el1(GICv3CPUState *c, uint64_t regval[2],
+                                 bool *valid)
+{
+    const uint64_t attr = (uint64_t)KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer);
+    const int group = KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS;
+    GICv3State *s = c->gic;
+    uint64_t val = 0;
+    int ret;
+
+    assert(regval && valid);
+
+    if (*valid) {
+        /* Fast path: return cached (no vCPU pausing required). */
+        c->icc_ctlr_el1[GICV3_NS] = regval[GICV3_NS];
+        c->icc_ctlr_el1[GICV3_S] = regval[GICV3_S];
+        return;
+    }
+
+    ret = kvm_device_access(s->dev_fd, group, attr, &val, false, NULL);
+    if (ret == -EBUSY || ret == -EAGAIN) {
+        int tries;
+
+        /* One-time heavy path: avoid contention by pausing all vCPUs. */
+        pause_all_vcpus();
+        /*
+         * Even with vCPUs paused, we cannot fully rule out a non-vCPU context
+         * temporarily holding KVM vCPU mutexes; treat -EBUSY/-EAGAIN as
+         * transient and retry a few times. Final attempt aborts in-loop.
+         */
+        for (tries = 0; tries < 5; tries++) {
+            Error **errp = (tries == 4) ?
&error_abort : NULL;
+            ret = kvm_device_access(s->dev_fd, group, attr, &val, false, errp);
+            if (!ret) {
+                break;
+            }
+            if (ret != -EBUSY && ret != -EAGAIN) {
+                error_setg_errno(&error_abort, -ret,
+                                 "KVM_GET_DEVICE_ATTR failed: Group %d "
+                                 "attr 0x%016" PRIx64, group, attr);
+                /* not reached */
+            }
+            g_usleep(50);
+        }
+        resume_all_vcpus();
+    }
+
+    /* Success: publish and seed cache. */
+    c->icc_ctlr_el1[GICV3_NS] = val;
+    c->icc_ctlr_el1[GICV3_S] = val;
+
+    regval[GICV3_NS] = c->icc_ctlr_el1[GICV3_NS];
+    regval[GICV3_S] = c->icc_ctlr_el1[GICV3_S];
+    *valid = true;
+}
+
 static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
 {
     GICv3State *s;
     GICv3CPUState *c;
+    ARMCPU *cpu;
 
     c = (GICv3CPUState *)env->gicv3state;
     s = c->gic;
+    cpu = ARM_CPU(c->cpu);
 
     c->icc_pmr_el1 = 0;
     /*
@@ -713,11 +774,33 @@ static void arm_gicv3_icc_reset(CPUARMState *env, const ARMCPRegInfo *ri)
     }
 
     /* Initialize to actual HW supported configuration */
-    kvm_device_access(s->dev_fd, KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS,
-                      KVM_VGIC_ATTR(ICC_CTLR_EL1, c->gicr_typer),
-                      &c->icc_ctlr_el1[GICV3_NS], false, &error_abort);
-
-    c->icc_ctlr_el1[GICV3_S] = c->icc_ctlr_el1[GICV3_NS];
+    /*
+     * Avoid racy VGIC CPU sysreg reads while vCPUs are running. KVM requires
+     * pausing all vCPUs for ICC_* sysregs accesses to prevent races with
+     * in-flight IRQ delivery (e.g. EOImode etc.).
+     *
+     * To keep the reset path fast, cache the architectural default and the
+     * guest GICv3 driver configured ICC_CTLR_EL1 on the first access and then
+     * reuse that for subsequent resets. Most fields in this register are
+     * invariants throughout the life of VM. Fields EOImode, PMHE and CBPR are
+     * pseudo static and don't change once configured by guest driver.
+     */
+    if (cpu->first_psci_on_request_seen || s->guest_gicc_initialized) {
+        if (!s->guest_gicc_initialized) {
+            s->guest_gicc_initialized = true;
+        }
+        kvm_gicc_get_cached_icc_ctlr_el1(c, c->icc_ctlr_configured,
+                                         &c->icc_ctlr_configured_valid);
+    } else {
+        /*
+         * The kernel has not loaded yet. It is safe to assume no other vCPU
+         * is in KVM_RUN except vCPU 0 at this moment. Just in case, if some
+         * other privileged KVM context is accessing the register, the KVM
+         * device access can potentially return -EBUSY.
+         */
+        kvm_gicc_get_cached_icc_ctlr_el1(c, c->icc_ctlr_arch_def,
+                                         &c->icc_ctlr_arch_def_valid);
+    }
 }
 
 static void kvm_arm_gicv3_reset_hold(Object *obj, ResetType type)
diff --git a/include/hw/intc/arm_gicv3_common.h b/include/hw/intc/arm_gicv3_common.h
index a8a84c4687..0282a94edc 100644
--- a/include/hw/intc/arm_gicv3_common.h
+++ b/include/hw/intc/arm_gicv3_common.h
@@ -165,6 +165,15 @@ struct GICv3CPUState {
     uint64_t icc_apr[3][4];
     uint64_t icc_igrpen[3];
     uint64_t icc_ctlr_el3;
+    /*
+     * Shadow copy of ICC_CTLR_EL1 architectural default. Fetched once per-vCPU
+     * when no vCPUs are running, and reused on reset to avoid calling
+     * kvm_device_access() in the hot path.
+     */
+    uint64_t icc_ctlr_arch_def[2]; /* per-secstate (NS=0,S=1) */
+    bool icc_ctlr_arch_def_valid;
+    uint64_t icc_ctlr_configured[2];
+    bool icc_ctlr_configured_valid;
     bool gicc_accessible;
 
     /* Virtualization control interface */
@@ -240,6 +249,7 @@ struct GICv3State {
     bool force_8bit_prio;
     bool irq_reset_nonsecure;
     bool gicd_no_migration_shift_bug;
+    bool guest_gicc_initialized;
 
     int dev_fd; /* kvm device fd if backed by kvm vgic support */
     Error *migration_blocker;
diff --git a/target/arm/arm-powerctl.c b/target/arm/arm-powerctl.c
index 89074918a9..0b65898cec 100644
--- a/target/arm/arm-powerctl.c
+++ b/target/arm/arm-powerctl.c
@@ -68,6 +68,7 @@ static void arm_set_cpu_on_async_work(CPUState *target_cpu_state,
     ARMCPU *target_cpu = ARM_CPU(target_cpu_state);
     struct CpuOnInfo *info = (struct CpuOnInfo *) data.host_ptr;
 
+    target_cpu->first_psci_on_request_seen = true;
     /* Initialize the cpu we are turning on */
     cpu_reset(target_cpu_state);
     arm_emulate_firmware_reset(target_cpu_state, info->target_el);
diff --git a/target/arm/cpu.h b/target/arm/cpu.h
index cd5982d362..603e482b3a 100644
--- a/target/arm/cpu.h
+++ b/target/arm/cpu.h
@@ -974,6 +974,7 @@ struct ArchCPU {
 
     /* Current power state, access guarded by BQL */
     ARMPSCIState power_state;
+    bool first_psci_on_request_seen;
 
     /* CPU has virtualization extension */
     bool has_el2;
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025 Delivered-To: importer@patchew.org Authentication-Results: mx.zohomail.com; dkim=fail; spf=pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1759281113276704.0246425426863; Tue, 30 Sep 2025 18:11:53 -0700 (PDT) Received: from localhost ([::1] helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1v3lH0-00064v-Fw; Tue, 30 Sep 2025 
21:04:50 -0400 Received: from eggs.gnu.org ([2001:470:142:3::10]) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1v3lGP-0005iX-Ik for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:04:15 -0400 Received: from mail-wm1-x32d.google.com ([2a00:1450:4864:20::32d]) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_128_GCM_SHA256:128) (Exim 4.90_1) (envelope-from ) id 1v3lFO-0008ED-DI for qemu-devel@nongnu.org; Tue, 30 Sep 2025 21:04:13 -0400 Received: by mail-wm1-x32d.google.com with SMTP id 5b1f17b1804b1-46b303f755aso58090205e9.1 for ; Tue, 30 Sep 2025 18:03:06 -0700 (PDT) Received: from localhost.localdomain ([90.209.204.182]) by smtp.gmail.com with ESMTPSA id ffacd0b85a97d-40fb985e080sm24587426f8f.24.2025.09.30.18.03.00 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 30 Sep 2025 18:03:02 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=opnsrc.net; s=google; t=1759280583; x=1759885383; darn=nongnu.org; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=YvbFfcp7wnei5ZfqWSO2jElDFuBlDIo3JfgLZwsSQVw=; b=dOxxtRihKw/NLqS/obje+pwTWj2iqhA0d7YZ/kEjLal2m+O7CWzKv82VANuDwtUNfF tqnrBREsCpQBw8K8ZGPpyrwGWy/dfrTlITQCwnukooQVSOB1ViX6vbBly3paqG9OtZmD YSSmTlqH5duPgPSPcylfBJOjBMoqMZu20bIJfP4JxdUVGhViu+dfovgSq6YNbT3GW2w4 d6XI/pGojEgC0WfW6XWQHPDzpw94bUw8cg/ukYOf3F7uAwKCVD5M2ZQf77zzSK0h9Jcy GFN8hCsl9/9ka1/VXPp958yPS+COU1CEqg/mae+q5KKruwpmmyiLE4L9nwddtvAqqrpK rz+w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759280583; x=1759885383; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=YvbFfcp7wnei5ZfqWSO2jElDFuBlDIo3JfgLZwsSQVw=; b=O6vevJwL4Q7dvReEiwgw/aSwbWuMzJ/py3a2QiXOl8nhhtxKg5Zf1qUg+MINoJrfep 
qmqsMLTLkWrh+4dpmfT4VhlHhxRisg1k2RhRLrTRcjPgSfiRBocRiQsqjylpU18rKpko xFfBajpWz8r0cQWKRft8UQ9S4l/QBS8eKE8JiaNkl1tO+N1KbUgW4ztmBw5+oi9f3hWt 4P/LUxr093Kvix1JGkXwv65gLutpwU8moikzTmRboKJFiQQmosLXhPzQmG1c7jrJy/2G qiVCPfid0E/G9gb35U3+c3FFq+eR9JMytYc9go/vAwJE86BosZDVgjFOj4GzA0A9p2sQ v+8A== X-Gm-Message-State: AOJu0Yxzhj9aGou37v4/fKH141wMA/85ZPg3Fj9cj71Cbjy1FYn56j57 XpZ7PPW/Jb6H4L0JW+NIKiK9xyIpgJ+8y8qiHXluSdQoCXfKT/dbQQ3saEqydIZycul+O9aqHfh ho3mTAitLqA== X-Gm-Gg: ASbGncv8vb33kKMUj4VWFM9IZSayrQ0LHfaLUL29nV4JvR4gwbAz709LumblVDi8QfK Gi8fIZZFr16ad6ffU9SBLegZMUCliTT8R6RN4PfNoXv3T2NieHxx03SkT8izp5EwOGqScjRKw6H lYAKEaxRvoOQssX2/b3QMCJyGwSU3XdrtY4rO0aSUxtNwNBCmxwvHQ5gvWaFk7i1xw3GPNHuHTJ RP/Od5ad//bPIPSfFFM2bvzK8J+D5esy3o5nIsDPiejQt1mHtcJMW0l3QXnR7RTXNlf2mTdCtmh w1O9dnvvkZZDiM4q0mnPNQgryKebCjabLG+zKSIMU35aqqP2QO/Mcp5j2UXpG3VU5qZxvlcHS2t LtP12U2/OKH0TdxUNzeF8bSStf23KvCeU6ZxRX4ZN4aCB0TvMSyJZAOTZ7hdpI05EoBVTXkn7nS buBpQPs44q+hAqkqT1m/9BlATnuWV8PHC9s0uIbqHuT18= X-Google-Smtp-Source: AGHT+IHz5zJvRAfTPBo9M5sp5y9ZSQpGXLUE6GRT5+Sd8Ti2k3C3x23ab/qaxyHFH3BNL5MCzuLhNQ== X-Received: by 2002:a05:600c:c4aa:b0:45f:2cb5:ecff with SMTP id 5b1f17b1804b1-46e612cb719mr15530685e9.31.1759280582488; Tue, 30 Sep 2025 18:03:02 -0700 (PDT) From: salil.mehta@opnsrc.net To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com, linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, 
karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com Subject: [PATCH RFC V6 22/24] monitor, qdev: Introduce 'device_set' to change admin state of existing devices Date: Wed, 1 Oct 2025 01:01:25 +0000 Message-Id: <20251001010127.3092631-23-salil.mehta@opnsrc.net> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net> References: <20251001010127.3092631-1-salil.mehta@opnsrc.net> MIME-Version: 1.0 Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: quoted-printable Received-SPF: pass (zohomail.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Received-SPF: pass client-ip=2a00:1450:4864:20::32d; envelope-from=salil.mehta@opnsrc.net; helo=mail-wm1-x32d.google.com X-Spam_score_int: -20 X-Spam_score: -2.1 X-Spam_bar: -- X-Spam_report: (-2.1 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, T_SPF_TEMPERROR=0.01 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.29 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Errors-To: qemu-devel-bounces+importer=patchew.org@nongnu.org Sender: qemu-devel-bounces+importer=patchew.org@nongnu.org X-ZohoMail-DKIM: fail (Header signature does not verify) X-ZM-MESSAGEID: 1759281114255116600 From: Salil Mehta This patch adds a "device_set" interface for modifying properties of devices that already exist in the guest topology. 
Unlike 'device_add'/'device_del' (hot-plug), 'device_set' does not create or
destroy devices. It is intended for guest-visible hot-add semantics where
hardware is provisioned at boot but logically enabled/disabled later via
administrative policy.

Compared to the existing 'qom-set' command, which is less intuitive and works
only with object IDs, device_set provides a more device-oriented interface.
It can be invoked at the QEMU prompt using natural device arguments, and the
new '-deviceset' CLI option allows properties to be set at boot time, similar
to how '-device' specifies device creation.

While the initial implementation focuses on "admin-state" changes (e.g.,
enable/disable a CPU already described by ACPI/DT), the interface is designed
to be generic. In future, it could be used for other per-device set/unset
style controls, beyond administrative power states, provided the target
device explicitly allows such changes. This enables fine-grained runtime
control of device properties.

Key pieces:
  * QMP: qmp_device_set() to update an existing device. The device can be
    located by "id" or via driver+property match using a DeviceListener
    callback (qdev_find_device()).
  * HMP: "device_set" command with tab-completion. Errors are surfaced via
    hmp_handle_error().
  * CLI: "-deviceset" option for setting startup/admin properties at boot,
    including a JSON form. Options are parsed into qemu_deviceset_opts and
    applied after device creation.
  * Docs/help: HMP help text and qemu-options.hx additions explain usage and
    explicitly note that no hot-plug occurs.
  * Safety: disallowed during live migration (migration_is_running() check).

Semantics:
  * Operates on an existing DeviceState; no enumeration/new device appears.
  * Complements device_add/device_del by providing state mutation only.
  * Backward compatible: no behavior change unless "device_set"/"-deviceset"
    is used.
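
As a rough illustration of the driver+property match listed under the key
pieces: for CPUs, the topology IDs passed to device_set are folded into a
linear CPU index, t + T*(c + C*(k + K*s)), after range checks. The sketch
below is standalone C written for this note, not the patch's code;
SmpSketch and sketch_cpu_index() are hypothetical names, and the -1 return
merely stands in for the error path.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the -smp topology counts used by the lookup. */
typedef struct {
    int64_t sockets, clusters, cores, threads;
} SmpSketch;

/*
 * Fold (socket, cluster, core, thread) IDs into a linear CPU index:
 *   index = t + T * (c + C * (k + K * s))
 * Returns -1 if any ID falls outside its configured range.
 */
static int64_t sketch_cpu_index(const SmpSketch *smp, int64_t socket_id,
                                int64_t cluster_id, int64_t core_id,
                                int64_t thread_id)
{
    if (thread_id < 0 || thread_id >= smp->threads ||
        core_id < 0 || core_id >= smp->cores ||
        cluster_id < 0 || cluster_id >= smp->clusters ||
        socket_id < 0 || socket_id >= smp->sockets) {
        return -1;
    }
    return thread_id + smp->threads *
           (core_id + smp->cores * (cluster_id + smp->clusters * socket_id));
}
```

With one socket, one cluster, four cores and one thread per core, core-id=3
maps to index 3. Topology IDs omitted on the command line are treated as 0
by the lookup in this patch.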

Examples:
  HMP:
    (qemu) device_set host-arm-cpu,core-id=3,admin-state=enable

  CLI (at boot):
    -smp cpus=4,maxcpus=4 \
    -deviceset host-arm-cpu,core-id=2,admin-state=disable

  QMP (JSON form):
    { "execute": "device_set",
      "arguments": {
        "driver": "host-arm-cpu",
        "core-id": 1,
        "admin-state": "disable"
      }
    }

NOTE: The qdev_enable()/qdev_disable() hooks for acting on admin-state will be
added in subsequent patches. Device classes must explicitly support any
property they want to expose through device_set.

Signed-off-by: Salil Mehta
---
 hmp-commands.hx         |  30 +++++++++
 hw/arm/virt.c           |  86 +++++++++++++++++++++++++
 hw/core/cpu-common.c    |  12 ++++
 hw/core/qdev.c          |  21 ++++++
 include/hw/arm/virt.h   |   1 +
 include/hw/core/cpu.h   |  11 ++++
 include/hw/qdev-core.h  |  22 +++++++
 include/monitor/hmp.h   |   2 +
 include/monitor/qdev.h  |  30 +++++++++
 include/system/system.h |   1 +
 qemu-options.hx         |  51 +++++++++++++--
 system/qdev-monitor.c   | 139 +++++++++++++++++++++++++++++++++++++++-
 system/vl.c             |  39 +++++++++++
 13 files changed, 440 insertions(+), 5 deletions(-)

diff --git a/hmp-commands.hx b/hmp-commands.hx
index d0e4f35a30..18056cf21d 100644
--- a/hmp-commands.hx
+++ b/hmp-commands.hx
@@ -707,6 +707,36 @@ SRST
   or a QOM object path.
 ERST
 
+{
+    .name       = "device_set",
+    .args_type  = "device:O",
+    .params     = "driver[,prop=value][,...]",
+    .help       = "set/unset existing device property",
+    .cmd        = hmp_device_set,
+    .command_completion = device_set_completion,
+},
+
+SRST
+``device_set`` *driver[,prop=value][,...]*
+  Change the administrative power state of an existing device.
+
+  This command enables or disables a known device (e.g., CPU) using the
+  "device_set" interface. It does not hotplug or add a new device.
+
+  Depending on platform support (e.g., PSCI or ACPI), this may trigger
+  corresponding operational changes, such as powering down a CPU or
+  transitioning it to active use.
+
+  Administrative state:
+    * *enable* - Allows the guest to use the device (e.g., CPU_ON)
+    * *disable* - Prevents guest use; device is powered off (e.g., CPU_OFF)
+
+  Note: The device must already exist (be declared during machine creation).
+
+  Example:
+      (qemu) device_set host-arm-cpu,core-id=3,admin-state=disable
+ERST
+
 {
     .name       = "cpu",
     .args_type  = "index:i",
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 9a41a0682b..7bd37ffb75 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -74,6 +74,7 @@
 #include "qapi/visitor.h"
 #include "qapi/qapi-visit-common.h"
 #include "qobject/qlist.h"
+#include "qobject/qdict.h"
 #include "standard-headers/linux/input.h"
 #include "hw/arm/smmuv3.h"
 #include "hw/acpi/acpi.h"
@@ -1824,6 +1825,88 @@ void virt_machine_done(Notifier *notifier, void *data)
     virt_build_smbios(vms);
 }
 
+static DeviceState *virt_find_cpu(const QDict *opts, Error **errp)
+{
+    int64_t socket_id, cluster_id, core_id, thread_id;
+    MachineState *ms = MACHINE(qdev_get_machine());
+    int64_t T, C, K, cpu_id;
+    CPUState *cpu;
+    const char *s;
+
+    /* parse topology */
+    socket_id = (s = qdict_get_try_str(opts, "socket-id")) ?
+                strtoll(s, NULL, 10) : 0;
+    cluster_id = (s = qdict_get_try_str(opts, "cluster-id")) ?
+                 strtoll(s, NULL, 10) : 0;
+    core_id = (s = qdict_get_try_str(opts, "core-id")) ?
+              strtoll(s, NULL, 10) : 0;
+    thread_id = (s = qdict_get_try_str(opts, "thread-id")) ?
+                strtoll(s, NULL, 10) : 0;
+
+    /* Range checks */
+    if (thread_id < 0 || thread_id >= ms->smp.threads) {
+        error_setg(errp,
+                   "Couldn't find cpu(%ld:%ld:%ld:%ld), Invalid thread-id %ld",
+                   socket_id, cluster_id, core_id, thread_id, thread_id);
+        return NULL;
+    }
+    if (core_id < 0 || core_id >= ms->smp.cores) {
+        error_setg(errp,
+                   "Couldn't find cpu(%ld:%ld:%ld:%ld), Invalid core-id %ld",
+                   socket_id, cluster_id, core_id, thread_id, core_id);
+        return NULL;
+    }
+    if (cluster_id < 0 || cluster_id >= ms->smp.clusters) {
+        error_setg(errp,
+                   "Couldn't find cpu(%ld:%ld:%ld:%ld), Invalid cluster-id %ld",
+                   socket_id, cluster_id, core_id, thread_id, cluster_id);
+        return NULL;
+    }
+    if (socket_id < 0 || socket_id >= ms->smp.sockets) {
+        error_setg(errp,
+                   "Couldn't find cpu(%ld:%ld:%ld:%ld), Invalid socket-id %ld",
+                   socket_id, cluster_id, core_id, thread_id, socket_id);
+        return NULL;
+    }
+
+    /* Compute logical CPU index: t + T*(c + C*(k + K*s)). */
+    T = ms->smp.threads;
+    C = ms->smp.cores;
+    K = ms->smp.clusters;
+    cpu_id = thread_id + T * (core_id + C * (cluster_id + K * socket_id));
+
+    cpu = machine_get_possible_cpu((int)cpu_id);
+    if (!cpu) {
+        error_setg(errp,
+                   "Couldn't find cpu(%ld:%ld:%ld:%ld), Invalid cpu-index %ld",
+                   socket_id, cluster_id, core_id, thread_id, cpu_id);
+        return NULL;
+    }
+
+    return DEVICE(cpu);
+}
+
+static DeviceState *
+virt_find_device(DeviceListener *listener, const QDict *opts, Error **errp)
+{
+    const char *typename;
+
+    g_assert(opts);
+
+    typename = qdict_get_try_str(opts, "driver");
+    if (!typename)
+    {
+        error_setg(errp, "no driver specified");
+        return NULL;
+    }
+
+    if (cpu_typename_is_a(typename, TYPE_ARM_CPU)) {
+        return virt_find_cpu(opts, errp);
+    }
+
+    return NULL;
+}
+
 static void virt_park_cpu_in_userspace(CPUState *cs)
 {
     /* we don't want to migrate 'disabled' vCPU state(even if realized) */
@@ -2545,6 +2628,9 @@ static void machvirt_init(MachineState *machine)
 
     create_fdt(vms);
 
+
vms->device_listener.find_device = virt_find_device;
+    device_listener_register(&vms->device_listener);
+
     assert(possible_cpus->len == max_cpus);
     for (n = 0; n < possible_cpus->len; n++) {
         Object *cpuobj;
diff --git a/hw/core/cpu-common.c b/hw/core/cpu-common.c
index 39e674aca2..6883dba75e 100644
--- a/hw/core/cpu-common.c
+++ b/hw/core/cpu-common.c
@@ -170,6 +170,18 @@ char *cpu_model_from_type(const char *typename)
     return g_strdup(typename);
 }
 
+bool cpu_typename_is_a(const char *typename, const char *base_typename)
+{
+    ObjectClass *oc;
+
+    if (!typename || !base_typename) {
+        return false;
+    }
+
+    oc = object_class_by_name(typename);
+    return oc && object_class_dynamic_cast(oc, base_typename);
+}
+
 static void cpu_common_parse_features(const char *typename, char *features,
                                       Error **errp)
 {
diff --git a/hw/core/qdev.c b/hw/core/qdev.c
index 3aba99b912..4fa2988ca0 100644
--- a/hw/core/qdev.c
+++ b/hw/core/qdev.c
@@ -226,6 +226,27 @@ bool qdev_should_hide_device(const QDict *opts, bool from_json, Error **errp)
     return false;
 }
 
+DeviceState *
+qdev_find_device(const QDict *opts, Error **errp)
+{
+    ERRP_GUARD();
+    DeviceListener *listener;
+    DeviceState *dev;
+
+    QTAILQ_FOREACH(listener, &device_listeners, link) {
+        if (listener->find_device) {
+            dev = listener->find_device(listener, opts, errp);
+            if (*errp) {
+                return NULL;
+            } else if (dev) {
+                return dev;
+            }
+        }
+    }
+
+    return NULL;
+}
+
 void qdev_set_legacy_instance_id(DeviceState *dev, int alias_id,
                                  int required_for_version)
 {
diff --git a/include/hw/arm/virt.h b/include/hw/arm/virt.h
index 0898e8eed3..de4a08175e 100644
--- a/include/hw/arm/virt.h
+++ b/include/hw/arm/virt.h
@@ -182,6 +182,7 @@ struct VirtMachineState {
     char *oem_table_id;
     bool ns_el2_virt_timer_irq;
     CXLState cxl_devices_state;
+    DeviceListener device_listener;
 };
 
 #define VIRT_ECAM_ID(high) (high ?
VIRT_HIGH_PCIE_ECAM : VIRT_PCIE_ECAM)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index ccf5588011..c9ce9bbdaf 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -853,6 +853,17 @@ ObjectClass *cpu_class_by_name(const char *typename, const char *cpu_model);
  */
 char *cpu_model_from_type(const char *typename);
 
+/**
+ * cpu_typename_is_a:
+ * @typename: QOM type name to check (e.g. "host-arm-cpu").
+ * @base_typename: Base QOM typename to test against (e.g. TYPE_ARM_CPU).
+ *
+ * Return: true if @typename names a class that is-a @base_typename, else false.
+ *
+ * Notes: Safe for common code; depends only on QOM (no target headers).
+ */
+bool cpu_typename_is_a(const char *typename, const char *base_typename);
+
 /**
  * cpu_create:
  * @typename: The CPU type.
diff --git a/include/hw/qdev-core.h b/include/hw/qdev-core.h
index 3e08cfb59f..19d1d1a144 100644
--- a/include/hw/qdev-core.h
+++ b/include/hw/qdev-core.h
@@ -371,6 +371,15 @@ struct DeviceListener {
      */
     bool (*hide_device)(DeviceListener *listener, const QDict *device_opts,
                         bool from_json, Error **errp);
+    /*
+     * Used by qdev to find any device corresponding to the device opts
+     *
+     * Returns the `DeviceState` on success and NULL if device was not found.
+     * On errors, it returns NULL and errp is set
+     */
+    DeviceState * (*find_device)(DeviceListener *listener,
+                                 const QDict *device_opts,
+                                 Error **errp);
     QTAILQ_ENTRY(DeviceListener) link;
 };
 
@@ -1252,6 +1261,19 @@ void device_listener_unregister(DeviceListener *listener);
  */
 bool qdev_should_hide_device(const QDict *opts, bool from_json, Error **errp);
 
+/**
+ * qdev_find_device() - find the device
+ *
+ * @opts: options QDict
+ * @errp: pointer to error object
+ *
+ * Called by qmp_device_set() to locate the target device from its options
+ *
+ * Return: a DeviceState on success and NULL on failure
+ */
+DeviceState *
+qdev_find_device(const QDict *opts, Error **errp);
+
 typedef enum MachineInitPhase {
     /* current_machine is NULL.  */
     PHASE_NO_MACHINE,
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index ae116d9804..3e8c492c28 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -84,6 +84,7 @@ void hmp_change_medium(Monitor *mon, const char *device, const char *target,
 void hmp_migrate(Monitor *mon, const QDict *qdict);
 void hmp_device_add(Monitor *mon, const QDict *qdict);
 void hmp_device_del(Monitor *mon, const QDict *qdict);
+void hmp_device_set(Monitor *mon, const QDict *qdict);
 void hmp_dump_guest_memory(Monitor *mon, const QDict *qdict);
 void hmp_netdev_add(Monitor *mon, const QDict *qdict);
 void hmp_netdev_del(Monitor *mon, const QDict *qdict);
@@ -117,6 +118,7 @@ void object_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void object_del_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str);
 void device_del_completion(ReadLineState *rs, int nb_args, const char *str);
+void device_set_completion(ReadLineState *rs, int nb_args, const char *str);
 void sendkey_completion(ReadLineState *rs, int nb_args, const char *str);
 void chardev_remove_completion(ReadLineState *rs, int nb_args, const char *str);
 void
chardev_add_completion(ReadLineState *rs, int nb_args, const char *str);
diff --git a/include/monitor/qdev.h b/include/monitor/qdev.h
index 1d57bf6577..b10040e27f 100644
--- a/include/monitor/qdev.h
+++ b/include/monitor/qdev.h
@@ -6,6 +6,36 @@
 void hmp_info_qtree(Monitor *mon, const QDict *qdict);
 void hmp_info_qdm(Monitor *mon, const QDict *qdict);
 void qmp_device_add(QDict *qdict, QObject **ret_data, Error **errp);
+/**
+ * qmp_device_set:
+ * @qdict: Boxed arguments identifying the target device and property changes.
+ *
+ *  The device can be identified in one of two ways:
+ *   1. By "id": Device instance ID (string), or
+ *   2. By "driver": Device type (string) plus one or more
+ *      property=value pairs to match.
+ *
+ *  Must also include at least one property assignment to change.
+ *  Currently used for:
+ *    - "admin-state": "enable" | "disable"
+ *
+ *  Additional properties may be supported by specific devices
+ *  in future.
+ *
+ * @errp: Pointer to error object (set on failure).
+ *
+ * Change one or more mutable properties of an existing device at runtime.
+ * Initially intended for administrative CPU power-state control via
+ * "admin-state" on CPU devices, but may be extended to support other
+ * per-device set/unset controls when allowed by the target device class.
+ *
+ * Returns: Nothing. On success, replies with `{ "return": {} }` via QMP.
+ *
+ * Errors:
+ *  - DeviceNotFound: No matching device found
+ *  - GenericError: Parameter validation failed or operation unsupported
+ */
+void qmp_device_set(const QDict *qdict, Error **errp);
 
 int qdev_device_help(QemuOpts *opts);
 DeviceState *qdev_device_add(QemuOpts *opts, Error **errp);
diff --git a/include/system/system.h b/include/system/system.h
index a7effe7dfd..3702325cfb 100644
--- a/include/system/system.h
+++ b/include/system/system.h
@@ -116,6 +116,7 @@ extern QemuOptsList qemu_drive_opts;
 extern QemuOptsList bdrv_runtime_opts;
 extern QemuOptsList qemu_chardev_opts;
 extern QemuOptsList qemu_device_opts;
+extern QemuOptsList qemu_deviceset_opts;
 extern QemuOptsList qemu_netdev_opts;
 extern QemuOptsList qemu_nic_opts;
 extern QemuOptsList qemu_net_opts;
diff --git a/qemu-options.hx b/qemu-options.hx
index 83ccde341b..f517b91042 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -375,7 +375,10 @@ SRST
     This is different from CPU hotplug where additional CPUs are not even
     present in the system description. Administratively disabled CPUs appear
     in ACPI tables i.e. are provisioned, but cannot be used until explicitly
-    enabled via QMP/HMP or the deviceset API.
+    enabled via QMP/HMP or the deviceset API. On ACPI guests, each vCPU counted
+    by 'disabledcpus=' is provisioned with '\ ``_STA``\ ' reporting Present=1
+    and Enabled=0 (present-offline) at boot; it becomes Enabled=1 when brought
+    online via 'device_set ... admin-state=enable'.
 
     On boards supporting CPU hotplug, the optional '\ ``maxcpus``\ ' parameter
     can be set to enable further CPUs to be added at runtime. When both
@@ -455,6 +458,15 @@ SRST
 
         -smp 2
 
+    Note: The cluster topology will only be generated in ACPI and exposed
+    to guest if it's explicitly specified in -smp.
+
+    Note: Administratively disabled CPUs (specified via 'disabledcpus=' and
+    '-deviceset' at CLI during boot) are especially useful for platforms like
+    ARM that lack native CPU hotplug support. These CPUs will appear to the
+    guest as unavailable, and any attempt to bring them online must go through
+    QMP/HMP commands like 'device_set'.
+
     Examples using 'disabledcpus':
 
     For a board without CPU hotplug, enable 4 CPUs at boot and provision
@@ -472,9 +484,6 @@ SRST
     ::
 
         -smp cpus=4,disabledcpus=2,maxcpus=8
-
-    Note: The cluster topology will only be generated in ACPI and exposed
-    to guest if it's explicitly specified in -smp.
 ERST
 
 DEF("numa", HAS_ARG, QEMU_OPTION_numa,
@@ -1281,6 +1290,40 @@ SRST
 
 ERST
 
+DEF("deviceset", HAS_ARG, QEMU_OPTION_deviceset,
+    "-deviceset driver[,prop[=value]][,...]\n"
+    "                Set administrative power state of an existing device.\n"
+    "                Does not hotplug a new device. Can disable or enable\n"
+    "                devices (such as CPUs) at boot based on policy.\n"
+    "                Example:\n"
+    "                    -deviceset host-arm-cpu,core-id=2,admin-state=disable\n"
+    "                Use '-deviceset help' for supported drivers\n"
+    "                Use '-deviceset driver,help' for driver-specific properties\n",
+    QEMU_ARCH_ALL)
+SRST
+``-deviceset driver[,prop[=value]][,...]``
+    Configure an existing device's administrative power state or properties.
+
+    Unlike ``-device``, this option does not create a new device. Instead,
+    it sets startup properties (such as administrative power state) for
+    a device already declared via -smp or other machine configuration.
+
+    Example:
+        -smp cpus=4
+        -deviceset host-arm-cpu,core-id=2,admin-state=disable
+
+    The above disables CPU core 2 at boot using administrative offlining.
+    The core may later be re-enabled via 'device_set' (if policy permits).
+
+    ``admin-state=enable|disable``
+        Sets the administrative state of the device:
+        - ``enable``: device is made available at boot
+        - ``disable``: device is administratively disabled and powered off
+
+    Use ``-deviceset help`` to view all supported drivers.
+    Use ``-deviceset driver,help`` for property-specific help.
+ERST
+
 DEF("name", HAS_ARG, QEMU_OPTION_name,
     "-name string1[,process=string2][,debug-threads=on|off]\n"
     "                set the name of the guest\n"
diff --git a/system/qdev-monitor.c b/system/qdev-monitor.c
index 2ac92d0a07..1099b1237d 100644
--- a/system/qdev-monitor.c
+++ b/system/qdev-monitor.c
@@ -263,12 +263,20 @@ static DeviceClass *qdev_get_device_class(const char **driver, Error **errp)
     }
 
     dc = DEVICE_CLASS(oc);
-    if (!dc->user_creatable) {
+    if (!dc->user_creatable && !dc->admin_power_state_supported) {
         error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "driver",
                    "a pluggable device type");
         return NULL;
     }
 
+    if (phase_check(PHASE_MACHINE_READY) &&
+        (!dc->hotpluggable && !dc->admin_power_state_supported)) {
+        error_setg(errp, QERR_INVALID_PARAMETER_VALUE, "driver",
+                   "a pluggable device type or one that supports changing "
+                   "power-state administratively");
+        return NULL;
+    }
+
     if (object_class_dynamic_cast(oc, TYPE_SYS_BUS_DEVICE)) {
         /* sysbus devices need to be allowed by the machine */
         MachineClass *mc = MACHINE_CLASS(object_get_class(qdev_get_machine()));
@@ -939,6 +947,76 @@ void qmp_device_del(const char *id, Error **errp)
     }
 }
 
+void qmp_device_set(const QDict *qdict, Error **errp)
+{
+    const char *state;
+    const char *driver;
+    DeviceState *dev;
+    DeviceClass *dc;
+    const char *id;
+
+    driver = qdict_get_try_str(qdict, "driver");
+    if (!driver) {
+        error_setg(errp, "Parameter 'driver' is missing");
+        return;
+    }
+
+    /* check driver exists and we are at the right phase of machine init */
+    dc = qdev_get_device_class(&driver, errp);
+    if (!dc) {
+        /* errp was already set by qdev_get_device_class(); do not set twice */
+        return;
+    }
+
+    if
(migration_is_running()) {
+        error_setg(errp, "device_set not allowed while migrating");
+        return;
+    }
+
+    id = qdict_get_try_str(qdict, "id");
+
+    if (id) {
+        /* Lookup by ID */
+        dev = find_device_state(id, false, errp);
+        if (errp && *errp) {
+            error_prepend(errp, "Device lookup failed for ID '%s': ", id);
+            return;
+        }
+    } else {
+        /* Lookup using driver and properties */
+        dev = qdev_find_device(qdict, errp);
+        if (errp && *errp) {
+            error_prepend(errp, "Device lookup for %s failed: ", driver);
+            return;
+        }
+    }
+    if (!dev) {
+        error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
+                  "No device found for driver '%s'", driver);
+        return;
+    }
+
+    state = qdict_get_try_str(qdict, "admin-state");
+    if (!state) {
+        error_setg(errp, "no device state change specified for device %s",
+                   dev->id);
+        return;
+    } else if (!strcmp(state, "enable")) {
+
+        if (!qdev_enable(dev, qdev_get_parent_bus(DEVICE(dev)), errp)) {
+            return;
+        }
+    } else if (!strcmp(state, "disable")) {
+        if (!qdev_disable(dev, qdev_get_parent_bus(DEVICE(dev)), errp)) {
+            return;
+        }
+    } else {
+        error_setg(errp, "unrecognized specified state *%s* for device %s",
+                   state, dev->id);
+        return;
+    }
+}
+
 int qdev_sync_config(DeviceState *dev, Error **errp)
 {
     DeviceClass *dc = DEVICE_GET_CLASS(dev);
@@ -1019,6 +1097,14 @@ void hmp_device_del(Monitor *mon, const QDict *qdict)
     hmp_handle_error(mon, err);
 }
 
+void hmp_device_set(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+
+    qmp_device_set(qdict, &err);
+    hmp_handle_error(mon, err);
+}
+
 void device_add_completion(ReadLineState *rs, int nb_args, const char *str)
 {
     GSList *list, *elt;
@@ -1101,6 +1187,41 @@ void device_del_completion(ReadLineState *rs, int nb_args, const char *str)
     peripheral_device_del_completion(rs, str);
 }
 
+void device_set_completion(ReadLineState *rs, int nb_args, const char *str)
+{
+    GSList *list, *elt;
+    size_t len;
+
+    if (nb_args == 2) {
+        len = strlen(str);
+        readline_set_completion_index(rs, len);
+
+
list = elt = object_class_get_list(TYPE_DEVICE, false);
+        while (elt) {
+            DeviceClass *dc = OBJECT_CLASS_CHECK(DeviceClass, elt->data,
+                                                 TYPE_DEVICE);
+            readline_add_completion_of(
+                rs, str, object_class_get_name(OBJECT_CLASS(dc)));
+            elt = elt->next;
+        }
+        g_slist_free(list);
+        return;
+    }
+
+    if (nb_args == 3) {
+        readline_set_completion_index(rs, strlen(str));
+        readline_add_completion_of(rs, str, "admin-state");
+        return;
+    }
+
+    if (nb_args == 4) {
+        readline_set_completion_index(rs, strlen(str));
+        readline_add_completion_of(rs, str, "enable");
+        readline_add_completion_of(rs, str, "disable");
+        return;
+    }
+}
+
 BlockBackend *blk_by_qdev_id(const char *id, Error **errp)
 {
     DeviceState *dev;
@@ -1134,6 +1255,22 @@ QemuOptsList qemu_device_opts = {
     },
 };
 
+QemuOptsList qemu_deviceset_opts = {
+    .name = "deviceset",
+    .implied_opt_name = "driver",
+    .head = QTAILQ_HEAD_INITIALIZER(qemu_deviceset_opts.head),
+    .desc = {
+        /*
+         * no fixed schema; parameters include:
+         *   - driver=
+         *   - id= (optional)
+         *   - admin-state=enable|disable
+         *   - other optional props for locating the device
+         */
+        { /* end of list */ }
+    },
+};
+
 QemuOptsList qemu_global_opts = {
     .name = "global",
     .head = QTAILQ_HEAD_INITIALIZER(qemu_global_opts.head),
diff --git a/system/vl.c b/system/vl.c
index 2f0fd21a1f..c1731de202 100644
--- a/system/vl.c
+++ b/system/vl.c
@@ -1218,6 +1218,16 @@ static int device_init_func(void *opaque, QemuOpts *opts, Error **errp)
     return 0;
 }
 
+static int deviceset_init_func(void *opaque, QemuOpts *opts, Error **errp)
+{
+    QDict *qdict = qemu_opts_to_qdict(opts, NULL);
+
+    qmp_device_set(qdict, errp);
+    qobject_unref(qdict);
+
+    return *errp ?
-1 : 0;
+}
+
 static int chardev_init_func(void *opaque, QemuOpts *opts, Error **errp)
 {
     Error *local_err = NULL;
@@ -2755,6 +2765,10 @@ static void qemu_create_cli_devices(void)
         assert(ret_data == NULL); /* error_fatal aborts */
         loc_pop(&opt->loc);
     }
+
+    /* add deferred 'deviceset' list handling - common to JSON/non-JSON path */
+    qemu_opts_foreach(qemu_find_opts("deviceset"), deviceset_init_func, NULL,
+                      &error_fatal);
 }
 
 static bool qemu_machine_creation_done(Error **errp)
@@ -2855,6 +2869,7 @@ void qemu_init(int argc, char **argv)
     qemu_add_drive_opts(&bdrv_runtime_opts);
     qemu_add_opts(&qemu_chardev_opts);
     qemu_add_opts(&qemu_device_opts);
+    qemu_add_opts(&qemu_deviceset_opts);
     qemu_add_opts(&qemu_netdev_opts);
     qemu_add_opts(&qemu_nic_opts);
     qemu_add_opts(&qemu_net_opts);
@@ -3458,6 +3473,30 @@ void qemu_init(int argc, char **argv)
                     }
                 }
                 break;
+            case QEMU_OPTION_deviceset:
+                if (optarg[0] == '{') {
+                    /* JSON input: convert to QDict and then to QemuOpts */
+                    QObject *obj = qobject_from_json(optarg, &error_fatal);
+                    QDict *qdict = qobject_to(QDict, obj);
+                    if (!qdict) {
+                        error_report("Invalid JSON object for -deviceset");
+                        exit(1);
+                    }
+
+                    opts = qemu_opts_from_qdict(qemu_find_opts("deviceset"),
+                                                qdict, &error_fatal);
+                    qobject_unref(qdict);
+                    if (!opts) {
+                        error_report_err(error_fatal);
+                        exit(1);
+                    }
+                } else {
+                    if (!qemu_opts_parse_noisily(qemu_find_opts("deviceset"),
+                                                 optarg, true)) {
+                        exit(1);
+                    }
+                }
+                break;
             case QEMU_OPTION_smp:
                 machine_parse_property_opt(qemu_find_opts("smp-opts"),
                                            "smp", optarg);
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org, jonathan.cameron@huawei.com, lpieralisi@kernel.org, peter.maydell@linaro.org, richard.henderson@linaro.org, imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev, david@redhat.com, philmd@linaro.org, eric.auger@redhat.com, will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev, pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org, borntraeger@linux.ibm.com, alex.bennee@linaro.org, gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com, ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com, gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com, miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com, wangxiongfeng2@huawei.com, wangyanan55@huawei.com, wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com, maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com, zhao1.liu@intel.com
Subject: [PATCH RFC V6 23/24] monitor, qapi: add 'info cpus-powerstate' and QMP query (Admin + Oper states)
Date: Wed, 1 Oct 2025 01:01:26 +0000
Message-Id: <20251001010127.3092631-24-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

The existing 'info hotpluggable-cpus' applies to
platforms with true CPU hotplug. On ARM, vCPUs are not hotpluggable:
resources are allocated at boot and policy is enforced administratively
(e.g. via ACPI _STA) to achieve a hotplug-like effect. As a result, the
hotpluggable interface cannot describe ARM CPU state, whether administrative
or runtime.

Operators need a clear view of both administrative policy (Enabled, Disabled,
Removed) and guest runtime status (On, Standby, Off, Unknown) for all possible
vCPUs. This separation is essential to debug CPU life cycle flows on ARM,
where PSCI CPU_ON/CPU_OFF and ACPI methods are used, and to distinguish CPUs
that are enumerated but administratively blocked from those actually
executing in the guest.

The new interface is independent of hotplug and coexists with
'info hotpluggable-cpus' on platforms that support it (e.g. x86). By default
devices are administratively Enabled; on hotpluggable systems, absent CPUs
appear as Removed here.

This patch introduces:
  * QMP 'query-cpus-powerstate' returning CPUPowerStateInfo per possible vCPU.
  * HMP 'info cpus-powerstate' for human-readable output.
  * Enums:
      - CPUPowerAdminState { enabled, disabled, removed }
      - CPUOperPowerState  { on, standby, off, unknown }
  * CPUPowerStateInfo with admin/oper state, optional topology ids, and
    qom-path.

Operational state semantics:
  * 'on'      : CPU is on and runnable.
  * 'standby' : Reserved for suspend-with-context (e.g. PSCI CPU_SUSPEND).
                Not emitted yet.
  * 'off'     : CPU is powered off.
      - At initial boot, admin-disabled vCPUs may be left unrealized
        (lazy realize) and are reported Off.
      - After an admin enable, the vCPU is realized; if later powered down,
        it remains realized and reported Off.
  * 'unknown' : State cannot be determined (very early init/teardown,
                transient hot-(un)plug window, or no power-state handler).

Migration semantics:
  * Admin-disabled (unrealized) vCPUs do not migrate.
  * Admin-enabled vCPUs migrate their operational state, including Off.
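
The state rules above can be sketched as a small standalone C model. This is
an illustration only, not code from this patch: the type and function names
(AdminSketch, OperSketch, VcpuSketch, sketch_*) are hypothetical, the order
of the checks is an assumption, and Removed CPUs are not special-cased
because their operational reporting is not spelled out here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirrors of the enums named above; not QEMU's generated types. */
typedef enum { ADMIN_ENABLED, ADMIN_DISABLED, ADMIN_REMOVED } AdminSketch;
typedef enum { OPER_ON, OPER_STANDBY, OPER_OFF, OPER_UNKNOWN } OperSketch;

typedef struct {
    AdminSketch admin;
    bool realized;     /* false for lazily realized, admin-disabled vCPUs */
    bool has_handler;  /* a power-state handler is available */
    bool powered_on;   /* only meaningful once realized */
} VcpuSketch;

static VcpuSketch sketch_vcpu(AdminSketch admin, bool realized,
                              bool has_handler, bool powered_on)
{
    VcpuSketch v = { admin, realized, has_handler, powered_on };
    return v;
}

/*
 * Operational state: unrealized admin-disabled vCPUs report Off, a missing
 * handler reports Unknown, otherwise On or Off. Standby is never returned,
 * matching "Not emitted yet".
 */
static OperSketch sketch_oper_state(const VcpuSketch *v)
{
    if (v->admin == ADMIN_DISABLED && !v->realized) {
        return OPER_OFF;
    }
    if (!v->has_handler) {
        return OPER_UNKNOWN;
    }
    return v->powered_on ? OPER_ON : OPER_OFF;
}

/* Migration rule: only admin-enabled vCPUs carry their state across. */
static bool sketch_migrates(const VcpuSketch *v)
{
    return v->admin == ADMIN_ENABLED;
}
```

An admin-enabled vCPU that the guest has powered down still migrates, with
its operational state reported as Off.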
Signed-off-by: Salil Mehta
---
 hmp-commands-info.hx       |  32 +++++++++++
 hw/arm/virt.c              |  32 +++++++++++
 hw/core/machine-hmp-cmds.c |  62 +++++++++++++++++++++
 hw/core/machine-qmp-cmds.c | 107 +++++++++++++++++++++++++++++++++++++
 include/monitor/hmp.h      |   1 +
 qapi/machine.json          |  87 ++++++++++++++++++++++++++++++
 6 files changed, 321 insertions(+)

diff --git a/hmp-commands-info.hx b/hmp-commands-info.hx
index 6142f60e7b..b4d24c8aed 100644
--- a/hmp-commands-info.hx
+++ b/hmp-commands-info.hx
@@ -766,6 +766,38 @@ ERST
 SRST
   ``info hotpluggable-cpus``
     Show information about hotpluggable CPUs
+
+ERST
+
+    {
+        .name       = "cpus-powerstate",
+        .args_type  = "",
+        .params     = "",
+        .help       = "Show administrative and operational CPU states",
+        .cmd        = hmp_info_cpus_powerstate,
+        .flags      = "p",
+    },
+
+SRST
+  ``info cpus-powerstate``
+    Display administrative (policy) and operational (runtime) power
+    states for each virtual CPU.
+
+    Administrative states:
+    - ``Enabled``  : CPU is available to the guest
+    - ``Disabled`` : CPU is present but administratively blocked
+    - ``Removed``  : CPU is not present (hidden from the guest)
+
+    Operational states (if available):
+    - ``On``       : CPU is powered on and executing
+    - ``Standby``  : CPU is idle/low-power and can resume on an event
+    - ``Off``      : CPU is powered off or guest-offlined
+    - ``Unknown``  : State cannot be determined (e.g. very early init,
+                     teardown, transient hotplug/hotremove window, or
+                     target/platform does not expose a queryable state)
+
+    The administrative state constrains which operational states are
+    possible.
 ERST
 
     {
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 7bd37ffb75..5e02d6749d 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2080,6 +2080,21 @@ virt_cpu_post_poweroff(PowerStateHandler *handler, DeviceState *dev,
     virt_park_cpu_in_userspace(cs);
 }
 
+static
+DeviceOperPowerState virt_cpu_get_oper_state(DeviceState *dev, Error **errp)
+{
+    ARMCPU *cpu = ARM_CPU(CPU(dev));
+
+    switch (cpu->power_state) {
+    case PSCI_ON:
+        return DEVICE_OPER_POWER_STATE_ON;
+    case PSCI_OFF:
+        return DEVICE_OPER_POWER_STATE_OFF;
+    default:
+        return DEVICE_OPER_POWER_STATE_UNKNOWN;
+    }
+}
+
 static uint64_t virt_cpu_mp_affinity(VirtMachineState *vms, int idx)
 {
     uint8_t clustersz;
@@ -2452,6 +2467,9 @@ virt_setup_lazy_vcpu_realization(Object *cpuobj, VirtMachineState *vms)
                                 NULL);
     }
 
+    /* set operational state of disabled CPUs as OFF */
+    ARM_CPU(cpuobj)->power_state = PSCI_OFF;
+
     /*
      * [!] Constraint: The ARM CPU architecture does not permit new CPUs
      * to be added after system initialization.
@@ -3517,6 +3535,19 @@ virt_machine_device_pre_poweron(PowerStateHandler *handler, DeviceState *dev,
     }
 }
 
+static DeviceOperPowerState
+virt_machine_get_device_oper_state(DeviceState *dev, Error **errp)
+{
+    if (object_dynamic_cast(OBJECT(dev), TYPE_CPU)) {
+        return virt_cpu_get_oper_state(dev, errp);
+    } else {
+        error_setg(errp, "can't get power state for unsupported device-type %s",
+                   object_get_typename(OBJECT(dev)));
+    }
+
+    return DEVICE_OPER_POWER_STATE_UNKNOWN;
+}
+
 static void *
 virt_machine_powerstate_handler(MachineState *machine, DeviceState *dev)
 {
@@ -3672,6 +3703,7 @@ static void virt_machine_class_init(ObjectClass *oc, const void *data)
     assert(!mc->get_powerstate_handler);
     mc->has_online_capable_cpus = true;
     mc->get_powerstate_handler = virt_machine_powerstate_handler;
+    pshc->get_oper_state = virt_machine_get_device_oper_state;
     pshc->request_poweroff = virt_machine_device_request_poweroff;
     pshc->post_poweroff = virt_machine_device_post_poweroff;
     pshc->pre_poweron = virt_machine_device_pre_poweron;
diff --git a/hw/core/machine-hmp-cmds.c b/hw/core/machine-hmp-cmds.c
index 3a612e2232..b01d8b800a 100644
--- a/hw/core/machine-hmp-cmds.c
+++ b/hw/core/machine-hmp-cmds.c
@@ -107,6 +107,68 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
     qapi_free_HotpluggableCPUList(saved);
 }
 
+void hmp_info_cpus_powerstate(Monitor *mon, const QDict *qdict)
+{
+    Error *err = NULL;
+    CPUPowerStateInfoList *list = qmp_query_cpus_power_state(&err);
+    CPUPowerStateInfoList *entry = list;
+
+    if (hmp_handle_error(mon, err)) {
+        return;
+    }
+
+    monitor_printf(mon, "CPUs Power State Info:\n");
+
+    while (entry) {
+        CPUPowerStateInfo *cpu = entry->value;
+
+        monitor_printf(mon, " CPU ID: %" PRIi64 "\n", cpu->id);
+
+        if (cpu->has_socket_id) {
+            monitor_printf(mon, " socket-id: %" PRIi64 "\n", cpu->socket_id);
+        }
+        if (cpu->has_cluster_id) {
+            monitor_printf(mon, " cluster-id: %" PRIi64 "\n", cpu->cluster_id);
+        }
+        if (cpu->has_core_id) {
+            monitor_printf(mon, " core-id: %" PRIi64 "\n", cpu->core_id);
+        }
+        if (cpu->has_thread_id) {
+            monitor_printf(mon, " thread-id: %" PRIi64 "\n", cpu->thread_id);
+        }
+        if (cpu->has_die_id) {
+            monitor_printf(mon, " die-id: %" PRIi64 "\n", cpu->die_id);
+        }
+        if (cpu->has_module_id) {
+            monitor_printf(mon, " module-id: %" PRIi64 "\n", cpu->module_id);
+        }
+        if (cpu->has_book_id) {
+            monitor_printf(mon, " book-id: %" PRIi64 "\n", cpu->book_id);
+        }
+        if (cpu->has_drawer_id) {
+            monitor_printf(mon, " drawer-id: %" PRIi64 "\n", cpu->drawer_id);
+        }
+        if (cpu->has_node_id) {
+            monitor_printf(mon, " node-id: %" PRIi64 "\n", cpu->node_id);
+        }
+        if (cpu->has_vcpus_count) {
+            monitor_printf(mon, " vcpus-count: %" PRIi64 "\n", cpu->vcpus_count);
+        }
+        if (cpu->qom_path) {
+            monitor_printf(mon, " qom-path: \"%s\"\n", cpu->qom_path);
+        }
+
+        monitor_printf(mon, " admin-state: \"%s\"\n",
+                       CPUAdminPowerState_str(cpu->admin_state));
+        monitor_printf(mon, " oper-state: \"%s\"\n",
+                       CPUOperPowerState_str(cpu->oper_state));
+
+        entry = entry->next;
+    }
+
+    qapi_free_CPUPowerStateInfoList(list);
+}
+
 void hmp_info_memdev(Monitor *mon, const QDict *qdict)
 {
     Error *err = NULL;
diff --git a/hw/core/machine-qmp-cmds.c b/hw/core/machine-qmp-cmds.c
index 6aca1a626e..b48356f36f 100644
--- a/hw/core/machine-qmp-cmds.c
+++ b/hw/core/machine-qmp-cmds.c
@@ -158,6 +158,113 @@ HotpluggableCPUList *qmp_query_hotpluggable_cpus(Error **errp)
     return machine_query_hotpluggable_cpus(ms);
 }
 
+CPUPowerStateInfoList *qmp_query_cpus_power_state(Error **errp)
+{
+    MachineState *ms = MACHINE(qdev_get_machine());
+    CPUPowerStateInfoList *head = NULL;
+    CPUPowerStateInfoList **tail = &head;
+    CPUPowerStateInfo *info;
+    CPUState *cpu;
+
+    CPU_FOREACH_POSSIBLE(cpu, ms->possible_cpus) {
+        CPUArchId *arch_id = machine_get_possible_cpu_arch_id(cpu->cpu_index);
+        if (!arch_id) {
+            continue;
+        }
+
+        info = g_new0(CPUPowerStateInfo, 1);
+        info->id = cpu->cpu_index;
+
+        /* Optional topology fields */
+        if (arch_id->props.has_socket_id) {
+            info->socket_id = arch_id->props.socket_id;
+            info->has_socket_id = true;
+        }
+        if (arch_id->props.has_cluster_id) {
+            info->cluster_id = arch_id->props.cluster_id;
+            info->has_cluster_id = true;
+        }
+        if (arch_id->props.has_core_id) {
+            info->core_id = arch_id->props.core_id;
+            info->has_core_id = true;
+        }
+        if (arch_id->props.has_thread_id) {
+            info->thread_id = arch_id->props.thread_id;
+            info->has_thread_id = true;
+        }
+        if (arch_id->props.has_die_id) {
+            info->die_id = arch_id->props.die_id;
+            info->has_die_id = true;
+        }
+        if (arch_id->props.has_module_id) {
+            info->module_id = arch_id->props.module_id;
+            info->has_module_id = true;
+        }
+        if (arch_id->props.has_book_id) {
+            info->book_id = arch_id->props.book_id;
+            info->has_book_id = true;
+        }
+        if (arch_id->props.has_drawer_id) {
+            info->drawer_id = arch_id->props.drawer_id;
+            info->has_drawer_id = true;
+        }
+        if (arch_id->props.has_node_id) {
+            info->node_id = arch_id->props.node_id;
+            info->has_node_id = true;
+        }
+
+        info->vcpus_count = arch_id->vcpus_count;
+        info->has_vcpus_count = true;
+
+        info->qom_path = object_get_canonical_path(OBJECT(cpu));
+
+        /* Determine current power state */
+        switch (qdev_get_admin_power_state(DEVICE(cpu))) {
+        case DEVICE_ADMIN_POWER_STATE_ENABLED:
+            info->admin_state = CPU_ADMIN_POWER_STATE_ENABLED;
+            break;
+        case DEVICE_ADMIN_POWER_STATE_DISABLED:
+            info->admin_state = CPU_ADMIN_POWER_STATE_DISABLED;
+            break;
+        case DEVICE_ADMIN_POWER_STATE_REMOVED:
+            info->admin_state = CPU_ADMIN_POWER_STATE_REMOVED;
+            break;
+        default:
+            /* This should never be hit */
+            g_assert_not_reached();
+            break;
+        }
+
+        /* Determine current operational power state */
+        switch (qdev_get_oper_power_state(DEVICE(cpu))) {
+        case DEVICE_OPER_POWER_STATE_ON:
+            info->oper_state = CPU_OPER_POWER_STATE_ON;
+            break;
+        case DEVICE_OPER_POWER_STATE_OFF:
+            info->oper_state = CPU_OPER_POWER_STATE_OFF;
+            break;
+        case DEVICE_OPER_POWER_STATE_STANDBY:
+            info->oper_state = CPU_OPER_POWER_STATE_STANDBY;
+            break;
+        case DEVICE_OPER_POWER_STATE_UNKNOWN:
+            info->oper_state = CPU_OPER_POWER_STATE_UNKNOWN;
+            break;
+        default:
+            /* This should never be hit */
+            g_assert_not_reached();
+            break;
+        }
+
+        /* Add to result list */
+        CPUPowerStateInfoList *entry = g_new0(CPUPowerStateInfoList, 1);
+        entry->value = info;
+        *tail = entry;
+        tail = &entry->next;
+    }
+
+    return head;
+}
+
 void qmp_set_numa_node(NumaOptions *cmd, Error **errp)
 {
     if (phase_check(PHASE_MACHINE_INITIALIZED)) {
diff --git a/include/monitor/hmp.h b/include/monitor/hmp.h
index 3e8c492c28..946ccb90c1 100644
--- a/include/monitor/hmp.h
+++ b/include/monitor/hmp.h
@@ -142,6 +142,7 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
 void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
 void hmp_info_dump(Monitor *mon, const QDict *qdict);
 void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
+void hmp_info_cpus_powerstate(Monitor *mon, const QDict *qdict);
 void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
 void hmp_info_memory_size_summary(Monitor *mon, const QDict *qdict);
 void hmp_info_replay(Monitor *mon, const QDict *qdict);
diff --git a/qapi/machine.json b/qapi/machine.json
index e45740da33..3856785b27 100644
--- a/qapi/machine.json
+++ b/qapi/machine.json
@@ -1069,6 +1069,93 @@
 { 'command': 'query-hotpluggable-cpus', 'returns': ['HotpluggableCPU'],
   'allow-preconfig': true }
 
+##
+# @CPUOperPowerState:
+#
+# Guest-visible operational state of the CPU.
+# This reflects runtime status such as guest online/offline status or
+# suspended state (e.g., CPU halted, suspended in a WFI loop).
+#
+# .. note::
+#    This field is read-only.  It is derived by QEMU from runtime
+#    information (e.g., CPU execution/architectural state, PSCI power
+#    status, vCPU runstate) and cannot be set by management tools or
+#    user commands.
+#
+# @on: CPU is online and executing.
+# @standby: CPU is idle or suspended (e.g., WFI).
+# @off: CPU is guest-offlined or halted.
+# @unknown: State cannot be determined at this time (e.g., very early
+#     init/teardown, transient hotplug/hotremove window, no
+#     power-state handler registered, or the target/platform does
+#     not expose a queryable CPU state).
+##
+{ 'enum': 'CPUOperPowerState',
+  'data': ['on', 'standby', 'off', 'unknown'] }
+
+##
+# @CPUAdminPowerState:
+#
+# Host-side administrative power state of the CPU device.
+# Controls guest visibility and lifecycle.
+#
+# @enabled: CPU is administratively enabled (can be used by guest)
+# @disabled: CPU is administratively disabled (guest-visible but unusable)
+# @removed: CPU is logically removed (not visible to guest)
+##
+{ 'enum': 'CPUAdminPowerState',
+  'data': ['enabled', 'disabled', 'removed'] }
+
+##
+# @CPUPowerStateInfo:
+#
+# CPU status combining both administrative and operational/runtime state.
+#
+# @id: CPU index
+# @core-id: Core ID (optional)
+# @socket-id: Socket ID (optional)
+# @cluster-id: Cluster ID (optional)
+# @thread-id: Thread ID (optional)
+# @node-id: NUMA node ID (optional)
+# @drawer-id: Drawer ID (optional)
+# @book-id: Book ID (optional)
+# @die-id: Die ID (optional)
+# @module-id: Module ID (optional)
+# @vcpus-count: Number of threads under this logical CPU (optional)
+# @qom-path: QOM object path (optional)
+# @admin-state: Administrative power state (enabled/disabled/removed)
+# @oper-state: Guest-visible runtime power state (on/standby/off/unknown)
+##
+{ 'struct': 'CPUPowerStateInfo',
+  'data': {
+      'id': 'int',
+      '*core-id': 'int',
+      '*socket-id': 'int',
+      '*cluster-id': 'int',
+      '*thread-id': 'int',
+      '*node-id': 'int',
+      '*drawer-id': 'int',
+      '*book-id': 'int',
+      '*die-id': 'int',
+      '*module-id': 'int',
+      '*vcpus-count': 'int',
+      '*qom-path': 'str',
+      'admin-state': 'CPUAdminPowerState',
+      'oper-state': 'CPUOperPowerState'
+  } }
+
+##
+# @query-cpus-power-state:
+#
+# Returns all CPUs and their power state info, combining host policy and
+# runtime guest status. This is useful for debugging vCPU hotplug,
+# suspend/resume, admin power states or offline state flows.
+#
+# Returns: a list of @CPUPowerStateInfo
+##
+{ 'command': 'query-cpus-power-state',
+  'returns': ['CPUPowerStateInfo'] }
+
 ##
 # @set-numa-node:
 #
-- 
2.34.1

From nobody Fri Nov 14 22:22:13 2025
From: salil.mehta@opnsrc.net
To: qemu-devel@nongnu.org, qemu-arm@nongnu.org, mst@redhat.com
Cc: salil.mehta@huawei.com, maz@kernel.org, jean-philippe@linaro.org,
 jonathan.cameron@huawei.com, lpieralisi@kernel.org,
 peter.maydell@linaro.org, richard.henderson@linaro.org,
 imammedo@redhat.com, armbru@redhat.com, andrew.jones@linux.dev,
 david@redhat.com, philmd@linaro.org, eric.auger@redhat.com,
 will@kernel.org, ardb@kernel.org, oliver.upton@linux.dev,
 pbonzini@redhat.com, gshan@redhat.com, rafael@kernel.org,
 borntraeger@linux.ibm.com, alex.bennee@linaro.org,
 gustavo.romero@linaro.org, npiggin@gmail.com, harshpb@linux.ibm.com,
 linux@armlinux.org.uk, darren@os.amperecomputing.com,
 ilkka@os.amperecomputing.com, vishnu@os.amperecomputing.com,
 gankulkarni@os.amperecomputing.com, karl.heubaum@oracle.com,
 miguel.luis@oracle.com, salil.mehta@opnsrc.net, zhukeqian1@huawei.com,
 wangxiongfeng2@huawei.com, wangyanan55@huawei.com,
 wangzhou1@hisilicon.com, linuxarm@huawei.com, jiakernel2@gmail.com,
 maobibo@loongson.cn, lixianglai@loongson.cn, shahuang@redhat.com,
 zhao1.liu@intel.com
Subject: [PATCH RFC V6 24/24] tcg: Defer TB flush for 'lazy realized' vCPUs on first region alloc
Date: Wed, 1 Oct 2025 01:01:27 +0000
Message-Id: <20251001010127.3092631-25-salil.mehta@opnsrc.net>
X-Mailer: git-send-email 2.34.1
In-Reply-To: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
References: <20251001010127.3092631-1-salil.mehta@opnsrc.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

From: Salil Mehta

The TCG code cache is split into regions shared by vCPUs under MTTCG. For
cold-boot (early realized) vCPUs, regions are sized/allocated during bring-up.
However, when a vCPU is *lazy_realized* (administratively "disabled" at boot
and realized later on demand), its TCGContext may fail the very first code
region allocation if the shared TB cache is saturated by already-running vCPUs.
Flushing the TB cache is the right remediation, but `tb_flush()` must be
performed from a safe execution context (cpu_exec_loop()/tb_gen_code()).

This patch wires a deferred flush:

  * In `tcg_region_initial_alloc__locked()`, treat an initial allocation
    failure for a lazily realized vCPU as non-fatal: set `s->tbflush_pend`
    and return.

  * In `tcg_tb_alloc()`, if `s->tbflush_pend` is observed, clear it and
    return NULL so the caller performs a synchronous `tb_flush()` and then
    retries allocation.

This avoids hangs observed when a newly realized vCPU cannot obtain its first
region under TB-cache pressure, while keeping the flush at a safe point.

No change for cold-boot vCPUs or when the accelerator is KVM.

In earlier versions of this series, this patch was named:
'tcg: Update tcg_register_thread() leg to handle region alloc for hotplugged vCPU'

Reported-by: Miguel Luis
Signed-off-by: Miguel Luis
Signed-off-by: Salil Mehta
---
 accel/tcg/tcg-accel-ops-mttcg.c |  2 +-
 accel/tcg/tcg-accel-ops-rr.c    |  2 +-
 hw/arm/virt.c                   |  5 +++++
 include/hw/core/cpu.h           |  1 +
 include/tcg/startup.h           |  6 ++++++
 include/tcg/tcg.h               |  1 +
 tcg/region.c                    | 16 ++++++++++++++++
 tcg/tcg.c                       | 19 ++++++++++++++++++-
 8 files changed, 49 insertions(+), 3 deletions(-)

diff --git a/accel/tcg/tcg-accel-ops-mttcg.c b/accel/tcg/tcg-accel-ops-mttcg.c
index 337b993d3d..cdb7345340 100644
--- a/accel/tcg/tcg-accel-ops-mttcg.c
+++ b/accel/tcg/tcg-accel-ops-mttcg.c
@@ -73,7 +73,7 @@ static void *mttcg_cpu_thread_fn(void *arg)
     force_rcu.notifier.notify = mttcg_force_rcu;
     force_rcu.cpu = cpu;
     rcu_add_force_rcu_notifier(&force_rcu.notifier);
-    tcg_register_thread();
+    tcg_register_thread(cpu);
 
     bql_lock();
     qemu_thread_get_self(cpu->thread);
diff --git a/accel/tcg/tcg-accel-ops-rr.c b/accel/tcg/tcg-accel-ops-rr.c
index 6eec5c9eee..18e713cada 100644
--- a/accel/tcg/tcg-accel-ops-rr.c
+++ b/accel/tcg/tcg-accel-ops-rr.c
@@ -186,7 +186,7 @@ static void *rr_cpu_thread_fn(void *arg)
     rcu_register_thread();
     force_rcu.notify = rr_force_rcu;
     rcu_add_force_rcu_notifier(&force_rcu);
-    tcg_register_thread();
+    tcg_register_thread(cpu);
 
     bql_lock();
     qemu_thread_get_self(cpu->thread);
diff --git a/hw/arm/virt.c b/hw/arm/virt.c
index 5e02d6749d..254303727b 100644
--- a/hw/arm/virt.c
+++ b/hw/arm/virt.c
@@ -2482,6 +2482,11 @@ virt_setup_lazy_vcpu_realization(Object *cpuobj, VirtMachineState *vms)
     if (kvm_enabled()) {
         kvm_arm_create_host_vcpu(ARM_CPU(cpuobj));
     }
+
+    /* we may have to nuke the TB cache */
+    if (tcg_enabled()) {
+        CPU(cpuobj)->lazy_realized = true;
+    }
 }
 
 static void machvirt_init(MachineState *machine)
diff --git a/include/hw/core/cpu.h b/include/hw/core/cpu.h
index c9ce9bbdaf..c2d45fb494 100644
--- a/include/hw/core/cpu.h
+++ b/include/hw/core/cpu.h
@@ -486,6 +486,7 @@ struct CPUState {
     bool stop;
     bool stopped;
     bool parked;
+    bool lazy_realized; /* realized after machine init (lazy realization) */
 
     /* Should CPU start in powered-off state? */
     bool start_powered_off;
diff --git a/include/tcg/startup.h b/include/tcg/startup.h
index 95f574af2b..f9126bb0bd 100644
--- a/include/tcg/startup.h
+++ b/include/tcg/startup.h
@@ -25,6 +25,8 @@
 #ifndef TCG_STARTUP_H
 #define TCG_STARTUP_H
 
+#include "hw/core/cpu.h"
+
 /**
  * tcg_init: Initialize the TCG runtime
  * @tb_size: translation buffer size
@@ -43,7 +45,11 @@ void tcg_init(size_t tb_size, int splitwx, unsigned max_threads);
  * accelerator's init_machine() method) must register with this
  * function before initiating translation.
  */
+#ifdef CONFIG_USER_ONLY
 void tcg_register_thread(void);
+#else
+void tcg_register_thread(CPUState *cpu);
+#endif
 
 /**
  * tcg_prologue_init(): Generate the code for the TCG prologue
diff --git a/include/tcg/tcg.h b/include/tcg/tcg.h
index a6d9aa50d4..e197ee03c0 100644
--- a/include/tcg/tcg.h
+++ b/include/tcg/tcg.h
@@ -396,6 +396,7 @@ struct TCGContext {
 
     /* Track which vCPU triggers events */
     CPUState *cpu;                      /* *_trans */
+    bool tbflush_pend; /* TB flush pending due to lazy vCPU realization */
 
     /* These structures are private to tcg-target.c.inc.  */
     QSIMPLEQ_HEAD(, TCGLabelQemuLdst) ldst_labels;
diff --git a/tcg/region.c b/tcg/region.c
index 7ea0b37a84..23635e0194 100644
--- a/tcg/region.c
+++ b/tcg/region.c
@@ -393,6 +393,22 @@ bool tcg_region_alloc(TCGContext *s)
 static void tcg_region_initial_alloc__locked(TCGContext *s)
 {
     bool err = tcg_region_alloc__locked(s);
+
+    /*
+     * Lazily realized vCPUs (administratively "disabled" at boot and realized
+     * later on demand) may initially fail to obtain even a single code region
+     * if the shared TB cache is under pressure from already running vCPUs.
+     *
+     * Treat this first-allocation failure as non-fatal: mark this TCGContext
+     * to request a TB cache flush and return. The flush is performed later,
+     * synchronously in the vCPU execution path (cpu_exec_loop()/tb_gen_code()),
+     * which is the safe place for tb_flush().
+     */
+    if (err && s->cpu && s->cpu->lazy_realized) {
+        s->tbflush_pend = true;
+        return;
+    }
+
     g_assert(!err);
 }
 
diff --git a/tcg/tcg.c b/tcg/tcg.c
index afac55a203..5867952ae7 100644
--- a/tcg/tcg.c
+++ b/tcg/tcg.c
@@ -1285,12 +1285,14 @@ void tcg_register_thread(void)
     tcg_ctx = &tcg_init_ctx;
 }
 #else
-void tcg_register_thread(void)
+void tcg_register_thread(CPUState *cpu)
 {
     TCGContext *s = g_malloc(sizeof(*s));
     unsigned int i, n;
 
     *s = tcg_init_ctx;
+    s->cpu = cpu;
+    s->tbflush_pend = false;
 
     /* Relink mem_base.  */
     for (i = 0, n = tcg_init_ctx.nb_globals; i < n; ++i) {
@@ -1871,6 +1873,21 @@ TranslationBlock *tcg_tb_alloc(TCGContext *s)
     TranslationBlock *tb;
     void *next;
 
+    /*
+     * Lazy realization:
+     * A vCPU that was realized after machine init may have failed its first
+     * code-region allocation (see tcg_region_initial_alloc__locked()) and
+     * requested a deferred TB-cache flush by setting s->tbflush_pend.
+     *
+     * If the flag is set, do not attempt allocation here. Clear the flag and
+     * return NULL so the caller (tb_gen_code()/cpu_exec_loop()) can perform a
+     * safe tb_flush() and then retry TB allocation.
+     */
+    if (s->tbflush_pend) {
+        s->tbflush_pend = false;
+        return NULL;
+    }
+
  retry:
     tb = (void *)ROUND_UP((uintptr_t)s->code_gen_ptr, align);
     next = (void *)ROUND_UP((uintptr_t)(tb + 1), align);
-- 
2.34.1
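
For illustration only (not part of the patch): the deferred-flush
handshake between the initial region allocation and the TB allocator can
be modelled in a few lines of standalone C. Everything below is a toy
under stated assumptions. ModelCPU, ModelCtx, ModelCache and the
model_*() functions are hypothetical stand-ins for CPUState, TCGContext,
the shared region pool and the QEMU functions named in the comments, and
the "flush" simply frees every region, which is far simpler than what
tb_flush() really does.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CPUState and TCGContext; not QEMU's types. */
typedef struct ModelCPU {
    bool lazy_realized;
} ModelCPU;

typedef struct ModelCtx {
    ModelCPU *cpu;
    bool has_region;
    bool tbflush_pend;
} ModelCtx;

/* Toy shared cache: a count of free regions plus a flush counter. */
typedef struct ModelCache {
    int total_regions;
    int free_regions;
    int flushes;
} ModelCache;

/* Returns true on failure, matching the 'err' convention in the patch. */
static bool model_region_alloc(ModelCache *c, ModelCtx *s)
{
    if (c->free_regions == 0) {
        return true;
    }
    c->free_regions--;
    s->has_region = true;
    return false;
}

/* Models tcg_region_initial_alloc__locked(). */
static void model_initial_alloc(ModelCache *c, ModelCtx *s)
{
    bool err = model_region_alloc(c, s);

    if (err && s->cpu && s->cpu->lazy_realized) {
        s->tbflush_pend = true;   /* defer: flushing is not safe here */
        return;
    }
    assert(!err);                 /* cold-boot vCPUs must not fail */
}

/* Models tcg_tb_alloc(); a false return stands for the NULL return. */
static bool model_tb_alloc(ModelCache *c, ModelCtx *s)
{
    if (s->tbflush_pend) {
        s->tbflush_pend = false;
        return false;
    }
    return s->has_region || !model_region_alloc(c, s);
}

/* Simplified flush: releases every region and counts the flush. */
static void model_tb_flush(ModelCache *c)
{
    c->free_regions = c->total_regions;
    c->flushes++;
}

/* Models the caller's flush-and-retry loop; returns allocation attempts. */
static int model_gen_code(ModelCache *c, ModelCtx *s)
{
    int attempts = 1;

    while (!model_tb_alloc(c, s)) {
        model_tb_flush(c);
        attempts++;
    }
    return attempts;
}
```

With a saturated cache (free_regions == 0) and a lazily realized vCPU,
model_initial_alloc() only sets the flag; the first model_tb_alloc()
call then fails once, the caller flushes, and the retry succeeds. The
retry loop assumes total_regions is at least 1, otherwise it would not
terminate.
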