From nobody Fri Apr 19 05:21:21 2024
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:43
 +0800
Message-Id: <20190111153451.14304-2-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 1/9] hmat acpi: Build Memory Subsystem Address Range Structure(s) in ACPI HMAT

From: Liu Jingqi

HMAT is defined in ACPI 6.2, section 5.2.27 "Heterogeneous Memory Attribute Table (HMAT)". The specification is available at:

http://www.uefi.org/sites/default/files/resources/ACPI_6_2.pdf

The table describes memory attributes, such as memory side cache attributes and bandwidth and latency details, related to System Physical Address (SPA) Memory Ranges. Software is expected to use this information as a hint for optimization.

This structure describes the System Physical Address (SPA) range occupied by a memory subsystem, its associativity with a processor proximity domain, and hints for memory usage.
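For reference only (not part of the patch), the 40-byte entry appended for each Memory Subsystem Address Range can be sketched in Python with `struct`; the field order and widths follow the description above, and the helper name is invented for the sketch:

```python
import struct

ACPI_HMAT_SPA = 0          # structure type: Memory Subsystem Address Range
HMAT_SPA_PROC_VALID = 0x1  # processor proximity domain field is valid
HMAT_SPA_MEM_VALID = 0x2   # memory proximity domain field is valid

def build_spa_entry(base, length, node, is_initiator=True, is_target=True):
    """Pack one Memory Subsystem Address Range structure (little-endian)."""
    flags = 0
    if is_initiator:
        flags |= HMAT_SPA_PROC_VALID
    if is_target:
        flags |= HMAT_SPA_MEM_VALID
    return struct.pack(
        "<HHIHHIIIQQ",
        ACPI_HMAT_SPA,  # Type
        0,              # Reserved0
        40,             # Length of this structure in bytes
        flags,          # Flags
        0,              # Reserved1
        node,           # Processor Proximity Domain
        node,           # Memory Proximity Domain
        0,              # Reserved2
        base,           # System Physical Address Range Base
        length,         # System Physical Address Range Length
    )
```

Packing one entry for a node whose RAM starts at 1 MiB yields exactly 40 bytes, which is also the value written into the structure's own Length field.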
Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 default-configs/i386-softmmu.mak |   1 +
 hw/acpi/Makefile.objs            |   1 +
 hw/acpi/hmat.c                   | 139 +++++++++++++++++++++++++++++++
 hw/acpi/hmat.h                   |  73 ++++++++++++++++
 hw/i386/acpi-build.c             | 121 +++++++++++++++++----------
 hw/i386/acpi-build.h             |  10 +++
 include/sysemu/numa.h            |   2 +
 numa.c                           |   6 ++
 8 files changed, 308 insertions(+), 45 deletions(-)
 create mode 100644 hw/acpi/hmat.c
 create mode 100644 hw/acpi/hmat.h

diff --git a/default-configs/i386-softmmu.mak b/default-configs/i386-softmmu.mak
index 64c998c4c8..3b77640f9d 100644
--- a/default-configs/i386-softmmu.mak
+++ b/default-configs/i386-softmmu.mak
@@ -67,3 +67,4 @@ CONFIG_I2C=y
 CONFIG_SEV=$(CONFIG_KVM)
 CONFIG_VTD=y
 CONFIG_AMD_IOMMU=y
+CONFIG_ACPI_HMAT=y
diff --git a/hw/acpi/Makefile.objs b/hw/acpi/Makefile.objs
index 11c35bcb44..21889fd80a 100644
--- a/hw/acpi/Makefile.objs
+++ b/hw/acpi/Makefile.objs
@@ -6,6 +6,7 @@ common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
 common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
 common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
 common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
+common-obj-$(CONFIG_ACPI_HMAT) += hmat.o
 common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
 
 common-obj-y += acpi_interface.o
diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
new file mode 100644
index 0000000000..d4e586d4f5
--- /dev/null
+++ b/hw/acpi/hmat.c
@@ -0,0 +1,139 @@
+/*
+ * HMAT ACPI Implementation
+ *
+ * Copyright(C) 2018 Intel Corporation.
+ *
+ * Author:
+ *  Liu jingqi
+ *
+ * HMAT is defined in ACPI 6.2.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ */
+
+#include "unistd.h"
+#include "fcntl.h"
+#include "qemu/osdep.h"
+#include "sysemu/numa.h"
+#include "hw/i386/pc.h"
+#include "hw/i386/acpi-build.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/hmat.h"
+#include "hw/acpi/aml-build.h"
+#include "hw/nvram/fw_cfg.h"
+#include "hw/acpi/bios-linker-loader.h"
+
+/* Build Memory Subsystem Address Range Structure */
+static void hmat_build_spa_info(GArray *table_data,
+                                uint64_t base, uint64_t length, int node)
+{
+    uint16_t flags = 0;
+
+    if (numa_info[node].is_initiator) {
+        flags |= HMAT_SPA_PROC_VALID;
+    }
+    if (numa_info[node].is_target) {
+        flags |= HMAT_SPA_MEM_VALID;
+    }
+
+    /* Type */
+    build_append_int_noprefix(table_data, ACPI_HMAT_SPA, sizeof(uint16_t));
+    /* Reserved0 */
+    build_append_int_noprefix(table_data, 0, sizeof(uint16_t));
+    /* Length */
+    build_append_int_noprefix(table_data, sizeof(AcpiHmatSpaRange),
+                              sizeof(uint32_t));
+    /* Flags */
+    build_append_int_noprefix(table_data, flags, sizeof(uint16_t));
+    /* Reserved1 */
+    build_append_int_noprefix(table_data, 0, sizeof(uint16_t));
+    /* Process Proximity Domain */
+    build_append_int_noprefix(table_data, node, sizeof(uint32_t));
+    /* Memory Proximity Domain */
+    build_append_int_noprefix(table_data, node, sizeof(uint32_t));
+    /* Reserved2 */
+    build_append_int_noprefix(table_data, 0, sizeof(uint32_t));
+    /* System Physical Address Range Base */
+    build_append_int_noprefix(table_data, base, sizeof(uint64_t));
+    /* System Physical Address Range Length */
+    build_append_int_noprefix(table_data, length, sizeof(uint64_t));
+}
+
+static int pc_dimm_device_list(Object *obj, void *opaque)
+{
+    GSList **list = opaque;
+
+    if (object_dynamic_cast(obj, TYPE_PC_DIMM)) {
+        *list = g_slist_append(*list, DEVICE(obj));
+    }
+
+    object_child_foreach(obj, pc_dimm_device_list, opaque);
+    return 0;
+}
+
+/*
+ * The Proximity Domain of System Physical Address ranges defined
+ * in the HMAT, NFIT and SRAT tables shall match each other.
+ */
+static void hmat_build_spa(GArray *table_data, PCMachineState *pcms)
+{
+    GSList *device_list = NULL;
+    uint64_t mem_base, mem_len;
+    int i;
+
+    if (pcms->numa_nodes && !mem_ranges_number) {
+        build_mem_ranges(pcms);
+    }
+
+    for (i = 0; i < mem_ranges_number; i++) {
+        hmat_build_spa_info(table_data, mem_ranges[i].base,
+                            mem_ranges[i].length, mem_ranges[i].node);
+    }
+
+    /* Build HMAT SPA structures for PC-DIMM devices. */
+    object_child_foreach(qdev_get_machine(), pc_dimm_device_list, &device_list);
+
+    for (; device_list; device_list = device_list->next) {
+        PCDIMMDevice *dimm = device_list->data;
+        mem_base = object_property_get_uint(OBJECT(dimm), PC_DIMM_ADDR_PROP,
+                                            NULL);
+        mem_len = object_property_get_uint(OBJECT(dimm), PC_DIMM_SIZE_PROP,
+                                           NULL);
+        i = object_property_get_uint(OBJECT(dimm), PC_DIMM_NODE_PROP, NULL);
+        hmat_build_spa_info(table_data, mem_base, mem_len, i);
+    }
+}
+
+static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
+{
+    /* Build HMAT Memory Subsystem Address Range. */
+    hmat_build_spa(hma, pcms);
+}
+
+void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
+                     MachineState *machine)
+{
+    PCMachineState *pcms = PC_MACHINE(machine);
+    uint64_t hmat_start, hmat_len;
+
+    hmat_start = table_data->len;
+    acpi_data_push(table_data, sizeof(AcpiHmat));
+
+    hmat_build_hma(table_data, pcms);
+    hmat_len = table_data->len - hmat_start;
+
+    build_header(linker, table_data,
+                 (void *)(table_data->data + hmat_start),
+                 "HMAT", hmat_len, 1, NULL, NULL);
+}
diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
new file mode 100644
index 0000000000..096415df8a
--- /dev/null
+++ b/hw/acpi/hmat.h
@@ -0,0 +1,73 @@
+/*
+ * HMAT ACPI Implementation Header
+ *
+ * Copyright(C) 2018 Intel Corporation.
+ *
+ * Author:
+ *  Liu jingqi
+ *
+ * HMAT is defined in ACPI 6.2.
+ *
+ * This library is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU Lesser General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ *
+ * This library is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * Lesser General Public License for more details.
+ *
+ * You should have received a copy of the GNU Lesser General Public
+ * License along with this library; if not, see
+ */
+
+#ifndef HMAT_H
+#define HMAT_H
+
+#include "qemu/osdep.h"
+#include "hw/acpi/acpi-defs.h"
+#include "hw/acpi/acpi.h"
+#include "hw/acpi/bios-linker-loader.h"
+#include "hw/acpi/aml-build.h"
+
+#define ACPI_HMAT_SPA 0
+
+/* ACPI HMAT sub-structure header */
+#define ACPI_HMAT_SUB_HEADER_DEF \
+    uint16_t type;               \
+    uint16_t reserved0;          \
+    uint32_t length;
+
+/* the values of AcpiHmatSpaRange flag */
+enum {
+    HMAT_SPA_PROC_VALID       = 0x1,
+    HMAT_SPA_MEM_VALID        = 0x2,
+    HMAT_SPA_RESERVATION_HINT = 0x4,
+};
+
+/*
+ * HMAT (Heterogeneous Memory Attributes Table)
+ */
+struct AcpiHmat {
+    ACPI_TABLE_HEADER_DEF
+    uint32_t reserved;
+} QEMU_PACKED;
+typedef struct AcpiHmat AcpiHmat;
+
+struct AcpiHmatSpaRange {
+    ACPI_HMAT_SUB_HEADER_DEF
+    uint16_t flags;
+    uint16_t reserved1;
+    uint32_t proc_proximity;
+    uint32_t mem_proximity;
+    uint32_t reserved2;
+    uint64_t spa_base;
+    uint64_t spa_length;
+} QEMU_PACKED;
+typedef struct AcpiHmatSpaRange AcpiHmatSpaRange;
+
+void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
+                     MachineState *machine);
+
+#endif
diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index 14f757fc36..a93d437175 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -64,6 +64,7 @@
 #include "hw/i386/intel_iommu.h"
 
 #include "hw/acpi/ipmi.h"
+#include "hw/acpi/hmat.h"
 
 /* These are used to size the ACPI tables for -M pc-i440fx-1.7 and
  * -M pc-i440fx-2.0.  Even if the actual amount of AML generated grows
@@ -119,6 +120,14 @@ typedef struct AcpiBuildPciBusHotplugState {
     bool pcihp_bridge_en;
 } AcpiBuildPciBusHotplugState;
 
+/* The memory contains at least one hole
+ * from 640k-1M and possibly another one from 3.5G-4G.
+ * So far, the number of memory ranges is up to 2
+ * more than the number of numa nodes.
+ */
+MemoryRange mem_ranges[MAX_NODES + 2];
+uint32_t mem_ranges_number;
+
 static void init_common_fadt_data(Object *o, AcpiFadtData *data)
 {
     uint32_t io = object_property_get_uint(o, ACPI_PM_PROP_PM_IO_BASE, NULL);
@@ -2251,6 +2260,63 @@ build_tpm2(GArray *table_data, BIOSLinker *linker, GArray *tcpalog)
 #define HOLE_640K_START  (640 * KiB)
 #define HOLE_640K_END    (1 * MiB)
 
+void build_mem_ranges(PCMachineState *pcms)
+{
+    uint64_t mem_len, mem_base, next_base;
+    int i;
+
+    /* the memory map is a bit tricky, it contains at least one hole
+     * from 640k-1M and possibly another one from 3.5G-4G.
+     */
+    mem_ranges_number = 0;
+    next_base = 0;
+
+    for (i = 0; i < pcms->numa_nodes; ++i) {
+        mem_base = next_base;
+        mem_len = pcms->node_mem[i];
+        next_base = mem_base + mem_len;
+
+        /* Cut out the 640K hole */
+        if (mem_base <= HOLE_640K_START &&
+            next_base > HOLE_640K_START) {
+            mem_len -= next_base - HOLE_640K_START;
+            if (mem_len > 0) {
+                mem_ranges[mem_ranges_number].base = mem_base;
+                mem_ranges[mem_ranges_number].length = mem_len;
+                mem_ranges[mem_ranges_number].node = i;
+                mem_ranges_number++;
+            }
+
+            /* Check for the rare case: 640K < RAM < 1M */
+            if (next_base <= HOLE_640K_END) {
+                next_base = HOLE_640K_END;
+                continue;
+            }
+            mem_base = HOLE_640K_END;
+            mem_len = next_base - HOLE_640K_END;
+        }
+
+        /* Cut out the ACPI_PCI hole */
+        if (mem_base <= pcms->below_4g_mem_size &&
+            next_base > pcms->below_4g_mem_size) {
+            mem_len -= next_base - pcms->below_4g_mem_size;
+            if (mem_len > 0) {
+                mem_ranges[mem_ranges_number].base = mem_base;
+                mem_ranges[mem_ranges_number].length = mem_len;
+                mem_ranges[mem_ranges_number].node = i;
+                mem_ranges_number++;
+            }
+            mem_base = 1ULL << 32;
+            mem_len = next_base - pcms->below_4g_mem_size;
+            next_base = mem_base + mem_len;
+        }
+        mem_ranges[mem_ranges_number].base = mem_base;
+        mem_ranges[mem_ranges_number].length = mem_len;
+        mem_ranges[mem_ranges_number].node = i;
+        mem_ranges_number++;
+    }
+}
+
 static void
 build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 {
@@ -2259,7 +2325,6 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
 
     int i;
     int srat_start, numa_start, slots;
-    uint64_t mem_len, mem_base, next_base;
     MachineClass *mc = MACHINE_GET_CLASS(machine);
     const CPUArchIdList *apic_ids = mc->possible_cpu_arch_ids(machine);
     PCMachineState *pcms = PC_MACHINE(machine);
@@ -2299,54 +2364,18 @@ build_srat(GArray *table_data, BIOSLinker *linker, MachineState *machine)
         }
     }
 
+    if (pcms->numa_nodes && !mem_ranges_number) {
+        build_mem_ranges(pcms);
+    }
 
-    /* the memory map is a bit tricky, it contains at least one hole
-     * from 640k-1M and possibly another one from 3.5G-4G.
-     */
-    next_base = 0;
     numa_start = table_data->len;
 
-    for (i = 1; i < pcms->numa_nodes + 1; ++i) {
-        mem_base = next_base;
-        mem_len = pcms->node_mem[i - 1];
-        next_base = mem_base + mem_len;
-
-        /* Cut out the 640K hole */
-        if (mem_base <= HOLE_640K_START &&
-            next_base > HOLE_640K_START) {
-            mem_len -= next_base - HOLE_640K_START;
-            if (mem_len > 0) {
-                numamem = acpi_data_push(table_data, sizeof *numamem);
-                build_srat_memory(numamem, mem_base, mem_len, i - 1,
-                                  MEM_AFFINITY_ENABLED);
-            }
-
-            /* Check for the rare case: 640K < RAM < 1M */
-            if (next_base <= HOLE_640K_END) {
-                next_base = HOLE_640K_END;
-                continue;
-            }
-            mem_base = HOLE_640K_END;
-            mem_len = next_base - HOLE_640K_END;
-        }
-
-        /* Cut out the ACPI_PCI hole */
-        if (mem_base <= pcms->below_4g_mem_size &&
-            next_base > pcms->below_4g_mem_size) {
-            mem_len -= next_base - pcms->below_4g_mem_size;
-            if (mem_len > 0) {
-                numamem = acpi_data_push(table_data, sizeof *numamem);
-                build_srat_memory(numamem, mem_base, mem_len, i - 1,
-                                  MEM_AFFINITY_ENABLED);
-            }
-            mem_base = 1ULL << 32;
-            mem_len = next_base - pcms->below_4g_mem_size;
-            next_base = mem_base + mem_len;
-        }
-
-        if (mem_len > 0) {
+    for (i = 0; i < mem_ranges_number; i++) {
+        if (mem_ranges[i].length > 0) {
             numamem = acpi_data_push(table_data, sizeof *numamem);
-            build_srat_memory(numamem, mem_base, mem_len, i - 1,
+            build_srat_memory(numamem, mem_ranges[i].base,
+                              mem_ranges[i].length,
+                              mem_ranges[i].node,
                               MEM_AFFINITY_ENABLED);
         }
     }
@@ -2669,6 +2698,8 @@ void acpi_build(AcpiBuildTables *tables, MachineState *machine)
             acpi_add_table(table_offsets, tables_blob);
             build_slit(tables_blob, tables->linker);
         }
+        acpi_add_table(table_offsets, tables_blob);
+        hmat_build_acpi(tables_blob, tables->linker, machine);
     }
     if (acpi_get_mcfg(&mcfg)) {
         acpi_add_table(table_offsets, tables_blob);
diff --git a/hw/i386/acpi-build.h b/hw/i386/acpi-build.h
index 007332e51c..f17de6af6a 100644
--- a/hw/i386/acpi-build.h
+++ b/hw/i386/acpi-build.h
@@ -2,6 +2,16 @@
 #ifndef HW_I386_ACPI_BUILD_H
 #define HW_I386_ACPI_BUILD_H
 
+typedef struct memory_range {
+    uint64_t base;
+    uint64_t length;
+    uint32_t node;
+} MemoryRange;
+
+extern MemoryRange mem_ranges[];
+extern uint32_t mem_ranges_number;
+
+void build_mem_ranges(PCMachineState *pcms);
 void acpi_setup(void);
 
 #endif
diff --git a/include/sysemu/numa.h b/include/sysemu/numa.h
index b6ac7de43e..d41be00b92 100644
--- a/include/sysemu/numa.h
+++ b/include/sysemu/numa.h
@@ -13,6 +13,8 @@ struct NodeInfo {
     uint64_t node_mem;
     struct HostMemoryBackend *node_memdev;
     bool present;
+    bool is_initiator;
+    bool is_target;
     uint8_t distance[MAX_NODES];
 };
 
diff --git a/numa.c b/numa.c
index 50ec016013..9ee4f6f258 100644
--- a/numa.c
+++ b/numa.c
@@ -105,6 +105,10 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
         }
     }
 
+    if (node->cpus) {
+        numa_info[nodenr].is_initiator = true;
+    }
+
     if (node->has_mem && node->has_memdev) {
         error_setg(errp, "cannot specify both mem= and memdev=");
         return;
@@ -121,6 +125,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
 
     if (node->has_mem) {
         numa_info[nodenr].node_mem = node->mem;
+        numa_info[nodenr].is_target = true;
     }
     if (node->has_memdev) {
         Object *o;
@@ -133,6 +138,7 @@ static void parse_numa_node(MachineState *ms, NumaNodeOptions *node,
         object_ref(o);
         numa_info[nodenr].node_mem = object_property_get_uint(o, "size", NULL);
         numa_info[nodenr].node_memdev = MEMORY_BACKEND(o);
+        numa_info[nodenr].is_target = true;
     }
     numa_info[nodenr].present = true;
     max_numa_nodeid = MAX(max_numa_nodeid, nodenr + 1);
-- 
2.17.1

From nobody Fri Apr 19 05:21:21 2024
Received: from orsmga006.jf.intel.com ([10.7.209.51])
 by fmsmga103.fm.intel.com with ESMTP; 11 Jan 2019 07:36:32 -0800
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:44 +0800
Message-Id: <20190111153451.14304-3-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 2/9] hmat acpi: Build System Locality Latency and Bandwidth Information Structure(s) in ACPI HMAT

From: Liu Jingqi

This structure describes the memory access latency and bandwidth from various memory access initiator proximity domains. The latency and bandwidth numbers represented in this structure correspond to the rated latency and bandwidth for the platform. Software could use this information as a hint for optimization.
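As an illustration of the entry ordering used by this structure (not part of the patch; the helper and variable names are invented for the sketch), the latency or bandwidth values form a flat row-major array with one 16-bit entry per (initiator, target) pair, at index `i * num_target + j`:

```python
def flatten_lb_entries(initiator_pxm, target_pxm, value):
    """Flatten per-domain values into HMAT entry order: the entry for
    initiator list index i and target list index j is stored at
    index i * num_target + j."""
    num_target = len(target_pxm)
    entries = [0] * (len(initiator_pxm) * num_target)
    for i, m in enumerate(initiator_pxm):      # rows: initiator domains
        for j, n in enumerate(target_pxm):     # columns: target domains
            entries[i * num_target + j] = value[m][n]
    return entries

# Two nodes: only node 0 initiates accesses, both nodes are targets,
# and local access (0 -> 0) is faster than remote (0 -> 1).
latency = {0: {0: 10, 1: 20}}
entries = flatten_lb_entries([0], [0, 1], latency)
# entries == [10, 20]
```

The same flattening applies to both the latency and the bandwidth matrices; only the base unit differs.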
Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c | 98 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/hmat.h | 76 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 174 insertions(+)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index d4e586d4f5..214f150fe6 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -34,6 +34,11 @@
 #include "hw/nvram/fw_cfg.h"
 #include "hw/acpi/bios-linker-loader.h"
 
+struct numa_hmat_lb_info *hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPES] = {0};
+
+static uint32_t initiator_pxm[MAX_NODES], target_pxm[MAX_NODES];
+static uint32_t num_initiator, num_target;
+
 /* Build Memory Subsystem Address Range Structure */
 static void hmat_build_spa_info(GArray *table_data,
                                 uint64_t base, uint64_t length, int node)
@@ -115,10 +120,103 @@ static void hmat_build_spa(GArray *table_data, PCMachineState *pcms)
     }
 }
 
+static void classify_proximity_domains(void)
+{
+    int node;
+
+    for (node = 0; node < nb_numa_nodes; node++) {
+        if (numa_info[node].is_initiator) {
+            initiator_pxm[num_initiator++] = node;
+        }
+        if (numa_info[node].is_target) {
+            target_pxm[num_target++] = node;
+        }
+    }
+}
+
+static void hmat_build_lb(GArray *table_data)
+{
+    AcpiHmatLBInfo *hmat_lb;
+    struct numa_hmat_lb_info *numa_hmat_lb;
+    int i, j, hrchy, type;
+
+    if (!num_initiator && !num_target) {
+        classify_proximity_domains();
+    }
+
+    for (hrchy = HMAT_LB_MEM_MEMORY;
+         hrchy <= HMAT_LB_MEM_CACHE_3RD_LEVEL; hrchy++) {
+        for (type = HMAT_LB_DATA_ACCESS_LATENCY;
+             type <= HMAT_LB_DATA_WRITE_BANDWIDTH; type++) {
+            numa_hmat_lb = hmat_lb_info[hrchy][type];
+
+            if (numa_hmat_lb) {
+                uint64_t start;
+                uint32_t *list_entry;
+                uint16_t *entry, *entry_start;
+                uint32_t size;
+                uint8_t m, n;
+
+                start = table_data->len;
+                hmat_lb = acpi_data_push(table_data, sizeof(*hmat_lb));
+
+                hmat_lb->type = cpu_to_le16(ACPI_HMAT_LB_INFO);
+                hmat_lb->flags = numa_hmat_lb->hierarchy;
+                hmat_lb->data_type = numa_hmat_lb->data_type;
+                hmat_lb->num_initiator = cpu_to_le32(num_initiator);
+                hmat_lb->num_target = cpu_to_le32(num_target);
+
+                if (type <= HMAT_LB_DATA_WRITE_LATENCY) {
+                    hmat_lb->base_unit = cpu_to_le32(numa_hmat_lb->base_lat);
+                } else {
+                    hmat_lb->base_unit = cpu_to_le32(numa_hmat_lb->base_bw);
+                }
+                if (!hmat_lb->base_unit) {
+                    hmat_lb->base_unit = cpu_to_le32(1);
+                }
+
+                /* the initiator proximity domain list */
+                for (i = 0; i < num_initiator; i++) {
+                    list_entry = acpi_data_push(table_data, sizeof(uint32_t));
+                    *list_entry = cpu_to_le32(initiator_pxm[i]);
+                }
+
+                /* the target proximity domain list */
+                for (i = 0; i < num_target; i++) {
+                    list_entry = acpi_data_push(table_data, sizeof(uint32_t));
+                    *list_entry = cpu_to_le32(target_pxm[i]);
+                }
+
+                /* latency or bandwidth entries */
+                size = sizeof(uint16_t) * num_initiator * num_target;
+                entry_start = acpi_data_push(table_data, size);
+
+                for (i = 0; i < num_initiator; i++) {
+                    m = initiator_pxm[i];
+                    for (j = 0; j < num_target; j++) {
+                        n = target_pxm[j];
+                        entry = entry_start + i * num_target + j;
+                        if (type <= HMAT_LB_DATA_WRITE_LATENCY) {
+                            *entry = cpu_to_le16(numa_hmat_lb->latency[m][n]);
+                        } else {
+                            *entry = cpu_to_le16(numa_hmat_lb->bandwidth[m][n]);
+                        }
+                    }
+                }
+                hmat_lb = (AcpiHmatLBInfo *)(table_data->data + start);
+                hmat_lb->length = cpu_to_le16(table_data->len - start);
+            }
+        }
+    }
+}
+
 static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
 {
     /* Build HMAT Memory Subsystem Address Range. */
     hmat_build_spa(hma, pcms);
+
+    /* Build HMAT System Locality Latency and Bandwidth Information. */
+    hmat_build_lb(hma);
 }
 
 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
index 096415df8a..fddd05e0d1 100644
--- a/hw/acpi/hmat.h
+++ b/hw/acpi/hmat.h
@@ -32,6 +32,7 @@
 #include "hw/acpi/aml-build.h"
 
 #define ACPI_HMAT_SPA 0
+#define ACPI_HMAT_LB_INFO 1
 
 /* ACPI HMAT sub-structure header */
 #define ACPI_HMAT_SUB_HEADER_DEF \
@@ -46,6 +47,28 @@ enum {
     HMAT_SPA_RESERVATION_HINT = 0x4,
 };
 
+/* the value of AcpiHmatLBInfo flags */
+enum {
+    HMAT_LB_MEM_MEMORY           = 0,
+    HMAT_LB_MEM_CACHE_LAST_LEVEL = 1,
+    HMAT_LB_MEM_CACHE_1ST_LEVEL  = 2,
+    HMAT_LB_MEM_CACHE_2ND_LEVEL  = 3,
+    HMAT_LB_MEM_CACHE_3RD_LEVEL  = 4,
+};
+
+/* the value of AcpiHmatLBInfo data type */
+enum {
+    HMAT_LB_DATA_ACCESS_LATENCY   = 0,
+    HMAT_LB_DATA_READ_LATENCY     = 1,
+    HMAT_LB_DATA_WRITE_LATENCY    = 2,
+    HMAT_LB_DATA_ACCESS_BANDWIDTH = 3,
+    HMAT_LB_DATA_READ_BANDWIDTH   = 4,
+    HMAT_LB_DATA_WRITE_BANDWIDTH  = 5,
+};
+
+#define HMAT_LB_LEVELS (HMAT_LB_MEM_CACHE_3RD_LEVEL + 1)
+#define HMAT_LB_TYPES  (HMAT_LB_DATA_WRITE_BANDWIDTH + 1)
+
 /*
  * HMAT (Heterogeneous Memory Attributes Table)
  */
@@ -67,6 +90,59 @@ struct AcpiHmatSpaRange {
 } QEMU_PACKED;
 typedef struct AcpiHmatSpaRange AcpiHmatSpaRange;
 
+struct AcpiHmatLBInfo {
+    ACPI_HMAT_SUB_HEADER_DEF
+    uint8_t  flags;
+    uint8_t  data_type;
+    uint16_t reserved1;
+    uint32_t num_initiator;
+    uint32_t num_target;
+    uint32_t reserved2;
+    uint64_t base_unit;
+} QEMU_PACKED;
+typedef struct AcpiHmatLBInfo AcpiHmatLBInfo;
+
+struct numa_hmat_lb_info {
+    /*
+     * Indicates total number of Proximity Domains
+     * that can initiate memory access requests.
+     */
+    uint32_t num_initiator;
+    /*
+     * Indicates total number of Proximity Domains
+     * that can act as target.
+     */
+    uint32_t num_target;
+    /*
+     * Indicates it's memory or
+     * the specified level memory side cache.
+     */
+    uint8_t hierarchy;
+    /*
+     * Present the type of data,
+     * access/read/write latency or bandwidth.
+     */
+    uint8_t data_type;
+    /* The base unit for latency in nanoseconds. */
+    uint64_t base_lat;
+    /* The base unit for bandwidth in megabytes per second(MB/s). */
+    uint64_t base_bw;
+    /*
+     * latency[i][j]:
+     * Indicates the latency based on base_lat
+     * from Initiator Proximity Domain i to Target Proximity Domain j.
+     */
+    uint16_t latency[MAX_NODES][MAX_NODES];
+    /*
+     * bandwidth[i][j]:
+     * Indicates the bandwidth based on base_bw
+     * from Initiator Proximity Domain i to Target Proximity Domain j.
+     */
+    uint16_t bandwidth[MAX_NODES][MAX_NODES];
+};
+
+extern struct numa_hmat_lb_info *hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPES];
+
 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
                      MachineState *machine);
 
-- 
2.17.1

From nobody Fri Apr 19 05:21:21 2024
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:45 +0800
Message-Id: <20190111153451.14304-4-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 3/9] hmat acpi: Build Memory Side Cache Information Structure(s) in ACPI HMAT

From: Liu Jingqi

This structure describes memory side cache information for memory proximity domains if a memory side cache is present and the physical device (SMBIOS handle) forms the memory side cache.
The software could use this information to effectively place data in memory to maximize the performance of the system memory that uses the memory side cache.

Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++
 hw/acpi/hmat.h | 44 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 100 insertions(+)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index 214f150fe6..9d29ef7929 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -35,6 +35,8 @@
 #include "hw/acpi/bios-linker-loader.h"
 
 struct numa_hmat_lb_info *hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPES] = {0};
+struct numa_hmat_cache_info
+    *hmat_cache_info[MAX_NODES][MAX_HMAT_CACHE_LEVEL + 1] = {0};
 
 static uint32_t initiator_pxm[MAX_NODES], target_pxm[MAX_NODES];
 static uint32_t num_initiator, num_target;
@@ -210,6 +212,57 @@ static void hmat_build_lb(GArray *table_data)
     }
 }
 
+static void hmat_build_cache(GArray *table_data)
+{
+    AcpiHmatCacheInfo *hmat_cache;
+    struct numa_hmat_cache_info *numa_hmat_cache;
+    int i, level;
+
+    for (i = 0; i < nb_numa_nodes; i++) {
+        for (level = 0; level <= MAX_HMAT_CACHE_LEVEL; level++) {
+            numa_hmat_cache = hmat_cache_info[i][level];
+            if (numa_hmat_cache) {
+                uint64_t start = table_data->len;
+
+                hmat_cache = acpi_data_push(table_data, sizeof(*hmat_cache));
+                hmat_cache->length = cpu_to_le32(sizeof(*hmat_cache));
+                hmat_cache->type = cpu_to_le16(ACPI_HMAT_CACHE_INFO);
+                hmat_cache->mem_proximity =
+                    cpu_to_le32(numa_hmat_cache->mem_proximity);
+                hmat_cache->cache_size = cpu_to_le64(numa_hmat_cache->size);
+                hmat_cache->cache_attr = HMAT_CACHE_TOTAL_LEVEL(
+                    numa_hmat_cache->total_levels);
+                hmat_cache->cache_attr |= HMAT_CACHE_CURRENT_LEVEL(
+                    numa_hmat_cache->level);
+                hmat_cache->cache_attr |= HMAT_CACHE_ASSOC(
+                    numa_hmat_cache->associativity);
+                hmat_cache->cache_attr |= HMAT_CACHE_WRITE_POLICY(
+                    numa_hmat_cache->write_policy);
+                hmat_cache->cache_attr |= HMAT_CACHE_LINE_SIZE(
+                    numa_hmat_cache->line_size);
+                hmat_cache->cache_attr = cpu_to_le32(hmat_cache->cache_attr);
+
+                if (numa_hmat_cache->num_smbios_handles != 0) {
+                    uint16_t *smbios_handles;
+                    int size;
+
+                    size = hmat_cache->num_smbios_handles * sizeof(uint16_t);
+                    smbios_handles = acpi_data_push(table_data, size);
+
+                    hmat_cache = (AcpiHmatCacheInfo *)
+                                 (table_data->data + start);
+                    hmat_cache->length += size;
+
+                    /* TBD: set smbios handles */
+                    memset(smbios_handles, 0, size);
+                }
+                hmat_cache->num_smbios_handles =
+                    cpu_to_le16(numa_hmat_cache->num_smbios_handles);
+            }
+        }
+    }
+}
+
 static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
 {
     /* Build HMAT Memory Subsystem Address Range. */
@@ -217,6 +270,9 @@ static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
 
     /* Build HMAT System Locality Latency and Bandwidth Information. */
     hmat_build_lb(hma);
+
+    /* Build HMAT Memory Side Cache Information. */
+    hmat_build_cache(hma);
 }
 
 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,

diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
index fddd05e0d1..f9fdcdcd33 100644
--- a/hw/acpi/hmat.h
+++ b/hw/acpi/hmat.h
@@ -33,6 +33,15 @@
 
 #define ACPI_HMAT_SPA 0
 #define ACPI_HMAT_LB_INFO 1
+#define ACPI_HMAT_CACHE_INFO 2
+
+#define MAX_HMAT_CACHE_LEVEL 3
+
+#define HMAT_CACHE_TOTAL_LEVEL(level) (level & 0xF)
+#define HMAT_CACHE_CURRENT_LEVEL(level) ((level & 0xF) << 4)
+#define HMAT_CACHE_ASSOC(assoc) ((assoc & 0xF) << 8)
+#define HMAT_CACHE_WRITE_POLICY(policy) ((policy & 0xF) << 12)
+#define HMAT_CACHE_LINE_SIZE(size) ((size & 0xFFFF) << 16)
 
 /* ACPI HMAT sub-structure header */
 #define ACPI_HMAT_SUB_HEADER_DEF \
@@ -102,6 +111,17 @@ struct AcpiHmatLBInfo {
 } QEMU_PACKED;
 typedef struct AcpiHmatLBInfo AcpiHmatLBInfo;
 
+struct AcpiHmatCacheInfo {
+    ACPI_HMAT_SUB_HEADER_DEF
+    uint32_t mem_proximity;
+    uint32_t reserved;
+    uint64_t cache_size;
+    uint32_t cache_attr;
+    uint16_t reserved2;
+    uint16_t
num_smbios_handles;
+} QEMU_PACKED;
+typedef struct AcpiHmatCacheInfo AcpiHmatCacheInfo;
+
 struct numa_hmat_lb_info {
     /*
      * Indicates total number of Proximity Domains
@@ -141,7 +161,31 @@ struct numa_hmat_lb_info {
     uint16_t bandwidth[MAX_NODES][MAX_NODES];
 };
 
+struct numa_hmat_cache_info {
+    /* The memory proximity domain to which the memory belongs. */
+    uint32_t mem_proximity;
+    /* Size of memory side cache in bytes. */
+    uint64_t size;
+    /* Total cache levels for this memory proximity domain. */
+    uint8_t total_levels;
+    /* Cache level described in this structure. */
+    uint8_t level;
+    /* Cache Associativity: None/Direct Mapped/Complex Cache Indexing */
+    uint8_t associativity;
+    /* Write Policy: None/Write Back (WB)/Write Through (WT) */
+    uint8_t write_policy;
+    /* Cache line size in bytes. */
+    uint16_t line_size;
+    /*
+     * Number of SMBIOS handles that contribute to
+     * the memory side cache physical devices.
+     */
+    uint16_t num_smbios_handles;
+};
+
 extern struct numa_hmat_lb_info *hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPES];
+extern struct numa_hmat_cache_info
+    *hmat_cache_info[MAX_NODES][MAX_HMAT_CACHE_LEVEL + 1];
 
 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
                      MachineState *machine);
-- 
2.17.1

From nobody Fri Apr 19 05:21:21 2024
Delivered-To: importer@patchew.org
Received: from localhost ([127.0.0.1]:56651
helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71)
 for importer@patchew.org; Fri, 11 Jan 2019 10:48:48 -0500
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:46 +0800
Message-Id: <20190111153451.14304-5-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 4/9] Extend the command-line to provide memory latency and bandwidth information
Cc: ehabkost@redhat.com, jingqi.liu@intel.com, tao3.xu@intel.com, qemu-devel@nongnu.org, pbonzini@redhat.com, danmei.wei@intel.com, rth@twiddle.net

From: Liu Jingqi

Add the -numa hmat-lb option to provide System Locality Latency and
Bandwidth Information. These memory attributes help to build
System Locality Latency and Bandwidth Information Structure(s)
in the ACPI Heterogeneous Memory Attribute Table (HMAT).

Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 numa.c          | 124 ++++++++++++++++++++++++++++++++++++++++++++++++
 qapi/misc.json  |  92 ++++++++++++++++++++++++++++++++++-
 qemu-options.hx |  28 ++++++++++-
 3 files changed, 241 insertions(+), 3 deletions(-)

diff --git a/numa.c b/numa.c
index 9ee4f6f258..97b77356ad 100644
--- a/numa.c
+++ b/numa.c
@@ -40,6 +40,7 @@
 #include "qemu/option.h"
 #include "qemu/config-file.h"
 #include "qemu/cutils.h"
+#include "hw/acpi/hmat.h"
 
 QemuOptsList qemu_numa_opts = {
     .name = "numa",
@@ -180,6 +181,123 @@ static void parse_numa_distance(NumaDistOptions *dist, Error **errp)
     have_numa_distance = true;
 }
 
+static void parse_numa_hmat_lb(MachineState *ms, NumaHmatLBOptions *node,
+                               Error **errp)
+{
+    struct numa_hmat_lb_info *hmat_lb = 0;
+
+    if (node->data_type <= HMATLB_DATA_TYPE_WRITE_LATENCY) {
+        if (!node->has_latency) {
+            error_setg(errp, "Please specify the latency.");
+            return;
+        }
+        if (node->has_bandwidth) {
+            error_setg(errp, "Please do not specify the bandwidth "
+                       "since the data type is latency.");
+            return;
+        }
+        if (node->has_base_bw) {
+            error_setg(errp, "Please do not specify the base-bw "
+                       "since the data type is latency.");
+            return;
+        }
+    }
+
+    if (node->data_type >= HMATLB_DATA_TYPE_ACCESS_BANDWIDTH) {
+        if (!node->has_bandwidth) {
+            error_setg(errp, "Please specify the bandwidth.");
+            return;
+        }
+        if (node->has_latency) {
+            error_setg(errp, "Please do not specify the latency "
+                       "since the data type is bandwidth.");
+            return;
+        }
+        if (node->has_base_lat) {
+            error_setg(errp, "Please do not specify the base-lat "
+                       "since the data type is bandwidth.");
+            return;
+        }
+    }
+
+    if (node->initiator >= nb_numa_nodes) {
+        error_setg(errp, "Invalid initiator=%"
+                   PRIu16 ", it should be less than %d.",
+                   node->initiator, nb_numa_nodes);
+        return;
+    }
+    if (!numa_info[node->initiator].is_initiator) {
+        error_setg(errp, "Invalid initiator=%"
+                   PRIu16 ", it isn't an initiator proximity domain.",
+                   node->initiator);
+        return;
+    }
+
+    if (node->target >= nb_numa_nodes) {
+        error_setg(errp, "Invalid target=%"
+                   PRIu16 ", it should be less than %d.",
+                   node->target, nb_numa_nodes);
+        return;
+    }
+    if (!numa_info[node->target].is_target) {
+        error_setg(errp, "Invalid target=%"
+                   PRIu16 ", it isn't a target proximity domain.",
+                   node->target);
+        return;
+    }
+
+    if (node->has_latency) {
+        hmat_lb = hmat_lb_info[node->hierarchy][node->data_type];
+        if (!hmat_lb) {
+            hmat_lb = g_malloc0(sizeof(*hmat_lb));
+            hmat_lb_info[node->hierarchy][node->data_type] = hmat_lb;
+        } else if (hmat_lb->latency[node->initiator][node->target]) {
+            error_setg(errp, "Duplicate configuration of the latency for "
+                       "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+                       node->initiator, node->target);
+            return;
+        }
+
+        /* Only the first time of setting the base unit is valid. */
+        if ((hmat_lb->base_lat == 0) && (node->has_base_lat)) {
+            hmat_lb->base_lat = node->base_lat;
+        }
+
+        hmat_lb->latency[node->initiator][node->target] = node->latency;
+    }
+
+    if (node->has_bandwidth) {
+        hmat_lb = hmat_lb_info[node->hierarchy][node->data_type];
+
+        if (!hmat_lb) {
+            hmat_lb = g_malloc0(sizeof(*hmat_lb));
+            hmat_lb_info[node->hierarchy][node->data_type] = hmat_lb;
+        } else if (hmat_lb->bandwidth[node->initiator][node->target]) {
+            error_setg(errp, "Duplicate configuration of the bandwidth for "
+                       "initiator=%" PRIu16 " and target=%" PRIu16 ".",
+                       node->initiator, node->target);
+            return;
+        }
+
+        /* Only the first time of setting the base unit is valid. */
+        if (hmat_lb->base_bw == 0) {
+            if (!node->has_base_bw) {
+                error_setg(errp, "Please provide the base-bw!");
+                return;
+            } else {
+                hmat_lb->base_bw = node->base_bw;
+            }
+        }
+
+        hmat_lb->bandwidth[node->initiator][node->target] = node->bandwidth;
+    }
+
+    if (hmat_lb) {
+        hmat_lb->hierarchy = node->hierarchy;
+        hmat_lb->data_type = node->data_type;
+    }
+}
+
 static void set_numa_options(MachineState *ms, NumaOptions *object,
                              Error **errp)
 {
@@ -213,6 +331,12 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
         machine_set_cpu_numa_node(ms, qapi_NumaCpuOptions_base(&object->u.cpu),
                                   &err);
         break;
+    case NUMA_OPTIONS_TYPE_HMAT_LB:
+        parse_numa_hmat_lb(ms, &object->u.hmat_lb, &err);
+        if (err) {
+            goto end;
+        }
+        break;
     default:
         abort();
     }

diff --git a/qapi/misc.json b/qapi/misc.json
index 24d20a880a..b18eb28459 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2746,10 +2746,12 @@
 #
 # @cpu: property based CPU(s) to node mapping (Since: 2.10)
 #
+# @hmat-lb: memory latency and bandwidth information (Since: 2.13)
+#
 # Since: 2.1
 ##
 { 'enum': 'NumaOptionsType',
-  'data': [ 'node', 'dist', 'cpu' ] }
+  'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }
 
 ##
 # @NumaOptions:
@@ -2764,7 +2766,8 @@
   'data': {
     'node': 'NumaNodeOptions',
     'dist':
'NumaDistOptions',
-    'cpu': 'NumaCpuOptions' }}
+    'cpu': 'NumaCpuOptions',
+    'hmat-lb': 'NumaHmatLBOptions' }}
 
 ##
 # @NumaNodeOptions:
@@ -2827,6 +2830,91 @@
   'base': 'CpuInstanceProperties',
   'data' : {} }
 
+##
+# @HmatLBMemoryHierarchy:
+#
+# The memory hierarchy in the System Locality Latency
+# and Bandwidth Information Structure of HMAT
+#
+# @memory: the structure represents the memory performance
+#
+# @last-level: last level memory of memory side cached memory
+#
+# @1st-level: first level memory of memory side cached memory
+#
+# @2nd-level: second level memory of memory side cached memory
+#
+# @3rd-level: third level memory of memory side cached memory
+#
+# Since: 2.13
+##
+{ 'enum': 'HmatLBMemoryHierarchy',
+  'data': [ 'memory', 'last-level', '1st-level',
+            '2nd-level', '3rd-level' ] }
+
+##
+# @HmatLBDataType:
+#
+# Data type in the System Locality Latency
+# and Bandwidth Information Structure of HMAT
+#
+# @access-latency: access latency
+#
+# @read-latency: read latency
+#
+# @write-latency: write latency
+#
+# @access-bandwidth: access bandwidth
+#
+# @read-bandwidth: read bandwidth
+#
+# @write-bandwidth: write bandwidth
+#
+# Since: 2.13
+##
+{ 'enum': 'HmatLBDataType',
+  'data': [ 'access-latency', 'read-latency', 'write-latency',
+            'access-bandwidth', 'read-bandwidth', 'write-bandwidth' ] }
+
+##
+# @NumaHmatLBOptions:
+#
+# Set the system locality latency and bandwidth information
+# between Initiator and Target proximity Domains.
+#
+# @initiator: the Initiator Proximity Domain.
+#
+# @target: the Target Proximity Domain.
+#
+# @hierarchy: the Memory Hierarchy. Indicates the performance
+#             of memory or memory side cache.
+#
+# @data-type: the type of data: access/read/write
+#             latency or bandwidth.
+#
+# @base-lat: the base unit for latency in nanoseconds.
+#
+# @base-bw: the base unit for bandwidth in megabytes per second (MB/s).
+#
+# @latency: the value of latency based on the base unit from @initiator
+#           to @target proximity domain.
+#
+# @bandwidth: the value of bandwidth based on the base unit between
+#             @initiator and @target proximity domain.
+#
+# Since: 2.13
+##
+{ 'struct': 'NumaHmatLBOptions',
+  'data': {
+    'initiator': 'uint16',
+    'target': 'uint16',
+    'hierarchy': 'HmatLBMemoryHierarchy',
+    'data-type': 'HmatLBDataType',
+    '*base-lat': 'uint64',
+    '*base-bw': 'uint64',
+    '*latency': 'uint16',
+    '*bandwidth': 'uint16' }}
+
 ##
 # @HostMemPolicy:
 #

diff --git a/qemu-options.hx b/qemu-options.hx
index d4f3564b78..88f078c846 100644
--- a/qemu-options.hx
+++ b/qemu-options.hx
@@ -163,16 +163,19 @@ DEF("numa", HAS_ARG, QEMU_OPTION_numa,
     "-numa node[,mem=size][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
     "-numa node[,memdev=id][,cpus=firstcpu[-lastcpu]][,nodeid=node]\n"
     "-numa dist,src=source,dst=destination,val=distance\n"
-    "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n",
+    "-numa cpu,node-id=node[,socket-id=x][,core-id=y][,thread-id=z]\n"
+    "-numa hmat-lb,initiator=node,target=node,hierarchy=memory|last-level,data-type=access-latency|read-latency|write-latency[,base-lat=blat][,base-bw=bbw][,latency=lat][,bandwidth=bw]\n",
     QEMU_ARCH_ALL)
STEXI
@item -numa node[,mem=@var{size}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
@itemx -numa node[,memdev=@var{id}][,cpus=@var{firstcpu}[-@var{lastcpu}]][,nodeid=@var{node}]
@itemx -numa dist,src=@var{source},dst=@var{destination},val=@var{distance}
@itemx -numa cpu,node-id=@var{node}[,socket-id=@var{x}][,core-id=@var{y}][,thread-id=@var{z}]
+@itemx -numa hmat-lb,initiator=@var{node},target=@var{node},hierarchy=@var{str},data-type=@var{str}[,base-lat=@var{blat}][,base-bw=@var{bbw}][,latency=@var{lat}][,bandwidth=@var{bw}]
@findex -numa
Define a NUMA node and assign RAM and VCPUs to it.
Set the NUMA distance from a source node to a destination node.
+Set the ACPI Heterogeneous Memory Attributes for the given nodes.
 
Legacy VCPU assignment uses @samp{cpus} option where
@var{firstcpu} and @var{lastcpu} are CPU indexes. Each
@@ -230,6 +233,29 @@ specified resources, it just assigns existing resources to NUMA nodes. This
means that one still has to use the @option{-m}, @option{-smp} options
to allocate RAM and VCPUs respectively.
 
+Use 'hmat-lb' to set the System Locality Latency and Bandwidth Information
+between an initiator NUMA node and a target NUMA node, to build the ACPI
+Heterogeneous Memory Attribute Table (HMAT).
+An initiator NUMA node can create memory requests, and usually includes
+one or more processors. A target NUMA node contains addressable memory.
+
+For example:
+@example
+-m 2G \
+-smp 3,sockets=2,maxcpus=3 \
+-numa node,cpus=0-1,nodeid=0 \
+-numa node,mem=1G,cpus=2,nodeid=1 \
+-numa node,mem=1G,nodeid=2 \
+-numa hmat-lb,initiator=0,target=1,hierarchy=memory,data-type=access-latency,base-lat=10,base-bw=20,latency=10,bandwidth=10 \
+-numa hmat-lb,initiator=1,target=2,hierarchy=1st-level,data-type=access-latency,base-bw=10,bandwidth=20
+@end example
+
+When the processors in NUMA node 0 access memory in NUMA node 1,
+the first line containing 'hmat-lb' sets the latency and bandwidth information:
+the latency is @var{lat} multiplied by @var{blat}, and the bandwidth is
+@var{bw} multiplied by @var{bbw}.
+
+When the processors in NUMA node 1 access memory in NUMA node 2, which acts
+as a memory side cache, the second line containing 'hmat-lb' sets the access
+hit bandwidth information.
+
ETEXI
 
DEF("add-fd", HAS_ARG, QEMU_OPTION_add_fd,
-- 
2.17.1

From nobody Fri Apr 19 05:21:21 2024
Delivered-To: importer@patchew.org
From: Tao Xu
To:
mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:47 +0800
Message-Id: <20190111153451.14304-6-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 5/9] numa: Extend the command-line to provide memory side cache information

From: Liu Jingqi

Add the -numa hmat-cache option to provide Memory Side Cache Information.
These memory attributes help to build Memory Side Cache Information
Structure(s) in the ACPI Heterogeneous Memory Attribute Table (HMAT).
Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 numa.c         | 72 ++++++++++++++++++++++++++++++++++++++++++++++++
 qapi/misc.json | 72 ++++++++++++++++++++++++++++++++++++++++++++++++--
 2 files changed, 142 insertions(+), 2 deletions(-)

diff --git a/numa.c b/numa.c
index 97b77356ad..c2f4049689 100644
--- a/numa.c
+++ b/numa.c
@@ -298,6 +298,72 @@ static void parse_numa_hmat_lb(MachineState *ms, NumaHmatLBOptions *node,
     }
 }
 
+static void parse_numa_hmat_cache(MachineState *ms, NumaHmatCacheOptions *node,
+                                  Error **errp)
+{
+    struct numa_hmat_cache_info *hmat_cache;
+
+    if (node->node_id >= nb_numa_nodes) {
+        error_setg(errp, "Invalid node-id=%" PRIu32
+                   ", it should be less than %d.",
+                   node->node_id, nb_numa_nodes);
+        return;
+    }
+    if (!numa_info[node->node_id].is_target) {
+        error_setg(errp, "Invalid node-id=%" PRIu32
+                   ", it isn't a target proximity domain.",
+                   node->node_id);
+        return;
+    }
+
+    if (node->total > MAX_HMAT_CACHE_LEVEL) {
+        error_setg(errp, "Invalid total=%" PRIu8
+                   ", it should be less than or equal to %d.",
+                   node->total, MAX_HMAT_CACHE_LEVEL);
+        return;
+    }
+    if (node->level > node->total) {
+        error_setg(errp, "Invalid level=%" PRIu8
+                   ", it should be less than or equal to"
+                   " total=%" PRIu8 ".",
+                   node->level, node->total);
+        return;
+    }
+    if (hmat_cache_info[node->node_id][node->level]) {
+        error_setg(errp, "Duplicate configuration of the side cache for "
+                   "node-id=%" PRIu32 " and level=%" PRIu8 ".",
+                   node->node_id, node->level);
+        return;
+    }
+
+    if ((node->level > 1) &&
+        hmat_cache_info[node->node_id][node->level - 1] &&
+        (node->size >=
+            hmat_cache_info[node->node_id][node->level - 1]->size)) {
+        error_setg(errp, "Invalid size=0x%" PRIx64
+                   ", the size of level=%" PRIu8
+                   " should be less than the size(0x%" PRIx64
+                   ") of level=%" PRIu8 ".",
+                   node->size, node->level,
+                   hmat_cache_info[node->node_id][node->level - 1]->size,
+                   node->level - 1);
+        return;
+    }
+
+    hmat_cache = g_malloc0(sizeof(*hmat_cache));
+
+    hmat_cache->mem_proximity = node->node_id;
+    hmat_cache->size = node->size;
+    hmat_cache->total_levels = node->total;
+    hmat_cache->level = node->level;
+    hmat_cache->associativity = node->assoc;
+    hmat_cache->write_policy = node->policy;
+    hmat_cache->line_size = node->line;
+    hmat_cache->num_smbios_handles = 0;
+
+    hmat_cache_info[node->node_id][node->level] = hmat_cache;
+}
+
 static void set_numa_options(MachineState *ms, NumaOptions *object,
                              Error **errp)
 {
@@ -337,6 +403,12 @@ void set_numa_options(MachineState *ms, NumaOptions *object, Error **errp)
             goto end;
         }
         break;
+    case NUMA_OPTIONS_TYPE_HMAT_CACHE:
+        parse_numa_hmat_cache(ms, &object->u.hmat_cache, &err);
+        if (err) {
+            goto end;
+        }
+        break;
     default:
         abort();
     }

diff --git a/qapi/misc.json b/qapi/misc.json
index b18eb28459..0887a3791a 100644
--- a/qapi/misc.json
+++ b/qapi/misc.json
@@ -2748,10 +2748,12 @@
 #
 # @hmat-lb: memory latency and bandwidth information (Since: 2.13)
 #
+# @hmat-cache: memory side cache information (Since: 2.13)
+#
 # Since: 2.1
 ##
 { 'enum': 'NumaOptionsType',
-  'data': [ 'node', 'dist', 'cpu', 'hmat-lb' ] }
+  'data': [ 'node', 'dist', 'cpu', 'hmat-lb', 'hmat-cache' ] }
 
 ##
 # @NumaOptions:
@@ -2767,7 +2769,8 @@
     'node': 'NumaNodeOptions',
     'dist': 'NumaDistOptions',
     'cpu': 'NumaCpuOptions',
-    'hmat-lb': 'NumaHmatLBOptions' }}
+    'hmat-lb': 'NumaHmatLBOptions',
+    'hmat-cache': 'NumaHmatCacheOptions' }}
 
 ##
 # @NumaNodeOptions:
@@ -2915,6 +2918,71 @@
     '*latency': 'uint16',
     '*bandwidth': 'uint16' }}
 
+##
+# @HmatCacheAssociativity:
+#
+# Cache associativity in the Memory Side Cache
+# Information Structure of HMAT
+#
+# @none: None
+#
+# @direct: Direct Mapped
+#
+# @complex: Complex Cache Indexing (implementation specific)
+#
+# Since: 2.13
+##
+{ 'enum': 'HmatCacheAssociativity',
+  'data': [ 'none', 'direct', 'complex' ] }
+
+##
+# @HmatCacheWritePolicy:
+#
+# Cache write policy in the Memory Side Cache
+# Information Structure of HMAT
+#
+# @none: None
+#
+# @write-back: Write Back (WB)
+#
+# @write-through: Write Through (WT)
+#
+# Since: 2.13
+##
+{ 'enum': 'HmatCacheWritePolicy',
+  'data': [ 'none', 'write-back', 'write-through' ] }
+
+##
+# @NumaHmatCacheOptions:
+#
+# Set the memory side cache information for a given memory domain.
+#
+# @node-id: the memory proximity domain to which the memory belongs.
+#
+# @size: the size of the memory side cache in bytes.
+#
+# @total: the total cache levels for this memory proximity domain.
+#
+# @level: the cache level described in this structure.
+#
+# @assoc: the cache associativity,
+#         none/direct-mapped/complex (complex cache indexing).
+#
+# @policy: the write policy, none/write-back/write-through.
+#
+# @line: the cache line size in bytes.
+#
+# Since: 2.13
+##
+{ 'struct': 'NumaHmatCacheOptions',
+  'data': {
+    'node-id': 'uint32',
+    'size': 'size',
+    'total': 'uint8',
+    'level': 'uint8',
+    'assoc': 'HmatCacheAssociativity',
+    'policy': 'HmatCacheWritePolicy',
+    'line': 'uint16' }}
+
 ##
 # @HostMemPolicy:
 #
-- 
2.17.1

From nobody Fri Apr 19 05:21:21 2024
Delivered-To: importer@patchew.org
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:48 +0800
Message-Id: <20190111153451.14304-7-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 6/9] hmat acpi: Implement _HMA method to update HMAT at runtime

From: Liu Jingqi

OSPM evaluates HMAT only during system initialization. Any changes to the
HMAT state at runtime, or information regarding HMAT for hot plug, are
communicated using the _HMA method. _HMA is an optional object that enables
the platform to provide the OS with updated Heterogeneous Memory Attributes
information at runtime. _HMA provides OSPM with the latest HMAT in its
entirety, overriding the existing HMAT.

Signed-off-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c       | 356 +++++++++++++++++++++++++++++++++++++++
 hw/acpi/hmat.h       |  71 +++++++++
 hw/i386/acpi-build.c |   2 +
 hw/i386/pc.c         |   2 +
 hw/i386/pc_piix.c    |   3 +
 hw/i386/pc_q35.c     |   3 +
 include/hw/i386/pc.h |   2 +
 7 files changed, 439 insertions(+)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index 9d29ef7929..cf17c0ae4f 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -275,6 +275,267 @@ static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
     hmat_build_cache(hma);
 }
 
+static uint64_t
+hmat_hma_method_read(void *opaque, hwaddr addr, unsigned size)
+{
+    printf("BUG: we never read _HMA IO Port.\n");
+    return 0;
+}
+
+/* _HMA Method: read HMA data.
*/
+static void hmat_handle_hma_method(AcpiHmaState *state,
+                                   HmatHmamIn *in, hwaddr hmam_mem_addr)
+{
+    HmatHmaBuffer *hma_buf = &state->hma_buf;
+    HmatHmamOut *read_hma_out;
+    GArray *hma;
+    uint32_t read_len = 0, ret_status;
+    int size;
+
+    le32_to_cpus(&in->offset);
+
+    hma = hma_buf->hma;
+    if (in->offset > hma->len) {
+        ret_status = HMAM_RET_STATUS_INVALID;
+        goto exit;
+    }
+
+    /* It is the first time to read HMA. */
+    if (!in->offset) {
+        hma_buf->dirty = false;
+    } else if (hma_buf->dirty) { /* HMA has been changed during reading HMA. */
+        ret_status = HMAM_RET_STATUS_HMA_CHANGED;
+        goto exit;
+    }
+
+    ret_status = HMAM_RET_STATUS_SUCCESS;
+    read_len = MIN(hma->len - in->offset,
+                   HMAM_MEMORY_SIZE - 2 * sizeof(uint32_t));
+exit:
+    size = sizeof(HmatHmamOut) + read_len;
+    read_hma_out = g_malloc(size);
+
+    read_hma_out->len = cpu_to_le32(size);
+    read_hma_out->ret_status = cpu_to_le32(ret_status);
+    memcpy(read_hma_out->data, hma->data + in->offset, read_len);
+
+    cpu_physical_memory_write(hmam_mem_addr, read_hma_out, size);
+
+    g_free(read_hma_out);
+}
+
+static void
+hmat_hma_method_write(void *opaque, hwaddr addr, uint64_t val, unsigned size)
+{
+    AcpiHmaState *state = opaque;
+    hwaddr hmam_mem_addr = val;
+    HmatHmamIn *in;
+
+    in = g_new(HmatHmamIn, 1);
+    cpu_physical_memory_read(hmam_mem_addr, in, sizeof(*in));
+
+    hmat_handle_hma_method(state, in, hmam_mem_addr);
+}
+
+static const MemoryRegionOps hmat_hma_method_ops = {
+    .read = hmat_hma_method_read,
+    .write = hmat_hma_method_write,
+    .endianness = DEVICE_LITTLE_ENDIAN,
+    .valid = {
+        .min_access_size = 4,
+        .max_access_size = 4,
+    },
+};
+
+static void hmat_init_hma_buffer(HmatHmaBuffer *hma_buf)
+{
+    hma_buf->hma = g_array_new(false, true /* clear */, 1);
+}
+
+static uint8_t hmat_acpi_table_checksum(uint8_t *buffer, uint32_t length)
+{
+    uint8_t sum = 0;
+    uint8_t *end = buffer + length;
+
+    while (buffer < end) {
+        sum = (uint8_t) (sum + *(buffer++));
+    }
+    return (uint8_t)(0 - sum);
+}
+
+static void hmat_build_header(AcpiTableHeader *h,
+                              const char *sig, int len, uint8_t rev,
+                              const char *oem_id, const char *oem_table_id)
+{
+    memcpy(&h->signature, sig, 4);
+    h->length = cpu_to_le32(len);
+    h->revision = rev;
+
+    if (oem_id) {
+        strncpy((char *)h->oem_id, oem_id, sizeof h->oem_id);
+    } else {
+        memcpy(h->oem_id, ACPI_BUILD_APPNAME6, 6);
+    }
+
+    if (oem_table_id) {
+        strncpy((char *)h->oem_table_id, oem_table_id, sizeof(h->oem_table_id));
+    } else {
+        memcpy(h->oem_table_id, ACPI_BUILD_APPNAME4, 4);
+        memcpy(h->oem_table_id + 4, sig, 4);
+    }
+
+    h->oem_revision = cpu_to_le32(1);
+    memcpy(h->asl_compiler_id, ACPI_BUILD_APPNAME4, 4);
+    h->asl_compiler_revision = cpu_to_le32(1);
+
+    /* Calculate the checksum of the ACPI table. */
+    h->checksum = 0;
+    h->checksum = hmat_acpi_table_checksum((uint8_t *)h, len);
+}
+
+static void hmat_build_hma_buffer(PCMachineState *pcms)
+{
+    HmatHmaBuffer *hma_buf = &(pcms->acpi_hma_state.hma_buf);
+
+    /* Free the old hma buffer before the new allocation. */
+    g_array_free(hma_buf->hma, true);
+
+    hma_buf->hma = g_array_new(false, true /* clear */, 1);
+    acpi_data_push(hma_buf->hma, sizeof(AcpiHmat));
+
+    /* Build HMAT in the given buffer. */
+    hmat_build_hma(hma_buf->hma, pcms);
+    hmat_build_header((void *)hma_buf->hma->data,
+                      "HMAT", hma_buf->hma->len, 1, NULL, NULL);
+    hma_buf->dirty = true;
+}
+
+static void hmat_build_common_aml(Aml *dev)
+{
+    Aml *method, *ifctx, *hmam_mem;
+    Aml *unsupport;
+    Aml *pckg, *pckg_index, *pckg_buf, *field;
+    Aml *hmam_out_buf, *hmam_out_buf_size;
+    uint8_t byte_list[1];
+
+    method = aml_method(HMA_COMMON_METHOD, 1, AML_SERIALIZED);
+    hmam_mem = aml_local(6);
+    hmam_out_buf = aml_local(7);
+
+    aml_append(method, aml_store(aml_name(HMAM_ACPI_MEM_ADDR), hmam_mem));
+
+    /* Map _HMA memory and IO into the ACPI namespace. */
+    aml_append(method, aml_operation_region(HMAM_IOPORT, AML_SYSTEM_IO,
+               aml_int(HMAM_ACPI_IO_BASE), HMAM_ACPI_IO_LEN));
+    aml_append(method, aml_operation_region(HMAM_MEMORY,
+               AML_SYSTEM_MEMORY, hmam_mem, HMAM_MEMORY_SIZE));
+
+    /*
+     * _HMAC notifier:
+     * HMAM_NOTIFY: write the address of DSM memory and notify QEMU to
+     *              emulate the access.
+     *
+     * It is an IO port, so accessing it causes a VM-exit and control
+     * is transferred to QEMU.
+     */
+    field = aml_field(HMAM_IOPORT, AML_DWORD_ACC, AML_NOLOCK,
+                      AML_PRESERVE);
+    aml_append(field, aml_named_field(HMAM_NOTIFY,
+               sizeof(uint32_t) * BITS_PER_BYTE));
+    aml_append(method, field);
+
+    /*
+     * _HMAC input:
+     * HMAM_OFFSET: store the current offset of the _HMA buffer.
+     *
+     * It is a RAM mapping on the host, so these accesses never cause
+     * a VM-exit.
+     */
+    field = aml_field(HMAM_MEMORY, AML_DWORD_ACC, AML_NOLOCK,
+                      AML_PRESERVE);
+    aml_append(field, aml_named_field(HMAM_OFFSET,
+               sizeof(typeof_field(HmatHmamIn, offset)) * BITS_PER_BYTE));
+    aml_append(method, field);
+
+    /*
+     * _HMAC output:
+     * HMAM_OUT_BUF_SIZE: the size of the buffer filled by QEMU.
+     * HMAM_OUT_BUF: the buffer QEMU uses to store the result.
+     *
+     * Since the page is reused for both input and output, the input data
+     * will be lost after storing a new result into ODAT, so we should fetch
+     * all the input data before writing the result.
+     */
+    field = aml_field(HMAM_MEMORY, AML_DWORD_ACC, AML_NOLOCK,
+                      AML_PRESERVE);
+    aml_append(field, aml_named_field(HMAM_OUT_BUF_SIZE,
+               sizeof(typeof_field(HmatHmamOut, len)) * BITS_PER_BYTE));
+    aml_append(field, aml_named_field(HMAM_OUT_BUF,
+               (sizeof(HmatHmamOut) - sizeof(uint32_t)) * BITS_PER_BYTE));
+    aml_append(method, field);
+
+    /*
+     * Do not support any method if the HMA memory address has not been
+     * patched.
+     */
+    unsupport = aml_if(aml_equal(hmam_mem, aml_int(0x0)));
+    byte_list[0] = HMAM_RET_STATUS_UNSUPPORT;
+    aml_append(unsupport, aml_return(aml_buffer(1, byte_list)));
+    aml_append(method, unsupport);
+
+    /* The parameter (Arg0) of _HMAC is a package which contains a buffer. */
+    pckg = aml_arg(0);
+    ifctx = aml_if(aml_and(aml_equal(aml_object_type(pckg),
+                   aml_int(4 /* Package */)) /* It is a Package? */,
+                   aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element */,
+                   NULL));
+
+    pckg_index = aml_local(2);
+    pckg_buf = aml_local(3);
+    aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_index));
+    aml_append(ifctx, aml_store(aml_derefof(pckg_index), pckg_buf));
+    aml_append(ifctx, aml_store(pckg_buf, aml_name(HMAM_OFFSET)));
+    aml_append(method, ifctx);
+
+    /*
+     * Tell QEMU about the real address of HMA memory, then QEMU
+     * gets the control and fills the result in _HMAC memory.
+     */
+    aml_append(method, aml_store(hmam_mem, aml_name(HMAM_NOTIFY)));
+
+    hmam_out_buf_size = aml_local(1);
+    /* RLEN is not included in the payload returned to guest.
*/ + aml_append(method, aml_subtract(aml_name(HMAM_OUT_BUF_SIZE), + aml_int(4), hmam_out_buf_size)); + aml_append(method, aml_store(aml_shiftleft(hmam_out_buf_size, aml_int(= 3)), + hmam_out_buf_size)); + aml_append(method, aml_create_field(aml_name(HMAM_OUT_BUF), + aml_int(0), hmam_out_buf_size, "OBUF")); + aml_append(method, aml_concatenate(aml_buffer(0, NULL), aml_name("OBUF= "), + hmam_out_buf)); + aml_append(method, aml_return(hmam_out_buf)); + aml_append(dev, method); +} + +void hmat_init_acpi_state(AcpiHmaState *state, MemoryRegion *io, + FWCfgState *fw_cfg, Object *owner) +{ + memory_region_init_io(&state->io_mr, owner, &hmat_hma_method_ops, stat= e, + "hma-acpi-io", HMAM_ACPI_IO_LEN); + memory_region_add_subregion(io, HMAM_ACPI_IO_BASE, &state->io_mr); + + state->hmam_mem =3D g_array_new(false, true /* clear */, 1); + fw_cfg_add_file(fw_cfg, HMAM_MEM_FILE, state->hmam_mem->data, + state->hmam_mem->len); + + hmat_init_hma_buffer(&state->hma_buf); +} + +void hmat_update(PCMachineState *pcms) +{ + /* build HMAT in a given buffer. */ + hmat_build_hma_buffer(pcms); +} + void hmat_build_acpi(GArray *table_data, BIOSLinker *linker, MachineState *machine) { @@ -291,3 +552,98 @@ void hmat_build_acpi(GArray *table_data, BIOSLinker *l= inker, (void *)(table_data->data + hmat_start), "HMAT", hmat_len, 1, NULL, NULL); } + +void hmat_build_aml(Aml *dev) +{ + Aml *method, *pkg, *buf, *buf_size, *offset, *call_result; + Aml *whilectx, *ifcond, *ifctx, *elsectx, *hma; + + hmat_build_common_aml(dev); + + buf =3D aml_local(0); + buf_size =3D aml_local(1); + hma =3D aml_local(2); + + aml_append(dev, aml_name_decl(HMAM_RHMA_STATUS, aml_int(0))); + + /* build helper function, RHMA. */ + method =3D aml_method("RHMA", 1, AML_SERIALIZED); + aml_append(method, aml_name_decl("OFST", aml_int(0))); + + /* prepare input package. */ + pkg =3D aml_package(1); + aml_append(method, aml_store(aml_arg(0), aml_name("OFST"))); + aml_append(pkg, aml_name("OFST")); + + /* call Read HMA function. 
*/ + call_result =3D aml_call1(HMA_COMMON_METHOD, pkg); + aml_append(method, aml_store(call_result, buf)); + + /* handle _HMAC result. */ + aml_append(method, aml_create_dword_field(buf, + aml_int(0) /* offset at byte 0 */, "STAU")); + + aml_append(method, aml_store(aml_name("STAU"), + aml_name(HMAM_RHMA_STATUS))); + + /* if something is wrong during _HMAC. */ + ifcond =3D aml_equal(aml_int(HMAM_RET_STATUS_SUCCESS), + aml_name("STAU")); + ifctx =3D aml_if(aml_lnot(ifcond)); + aml_append(ifctx, aml_return(aml_buffer(0, NULL))); + aml_append(method, ifctx); + + aml_append(method, aml_store(aml_sizeof(buf), buf_size)); + aml_append(method, aml_subtract(buf_size, + aml_int(4) /* the size of "STAU" */, + buf_size)); + + /* if we read the end of hma. */ + ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); + aml_append(ifctx, aml_return(aml_buffer(0, NULL))); + aml_append(method, ifctx); + + aml_append(method, aml_create_field(buf, + aml_int(4 * BITS_PER_BYTE), /* offset at byte = 4.*/ + aml_shiftleft(buf_size, aml_int(3)), "BUFF")); + aml_append(method, aml_return(aml_name("BUFF"))); + aml_append(dev, method); + + /* build _HMA. */ + method =3D aml_method("_HMA", 0, AML_SERIALIZED); + offset =3D aml_local(3); + + aml_append(method, aml_store(aml_buffer(0, NULL), hma)); + aml_append(method, aml_store(aml_int(0), offset)); + + whilectx =3D aml_while(aml_int(1)); + aml_append(whilectx, aml_store(aml_call1("RHMA", offset), buf)); + aml_append(whilectx, aml_store(aml_sizeof(buf), buf_size)); + + /* + * if hma buffer was changed during RHMA, read from the beginning + * again. + */ + ifctx =3D aml_if(aml_equal(aml_name(HMAM_RHMA_STATUS), + aml_int(HMAM_RET_STATUS_HMA_CHANGED))); + aml_append(ifctx, aml_store(aml_buffer(0, NULL), hma)); + aml_append(ifctx, aml_store(aml_int(0), offset)); + aml_append(whilectx, ifctx); + + elsectx =3D aml_else(); + + /* finish hma read if no data is read out. 
*/ + ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); + aml_append(ifctx, aml_return(hma)); + aml_append(elsectx, ifctx); + + /* update the offset. */ + aml_append(elsectx, aml_add(offset, buf_size, offset)); + /* append the data we read out to the hma buffer. */ + aml_append(elsectx, aml_concatenate(hma, buf, hma)); + aml_append(whilectx, elsectx); + aml_append(method, whilectx); + + aml_append(dev, method); +} + diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h index f9fdcdcd33..dd6948f738 100644 --- a/hw/acpi/hmat.h +++ b/hw/acpi/hmat.h @@ -183,11 +183,82 @@ struct numa_hmat_cache_info { uint16_t num_smbios_handles; }; =20 +#define HMAM_MEMORY_SIZE 4096 +#define HMAM_MEM_FILE "etc/acpi/hma-mem" + +/* + * 32 bits IO port starting from 0x0a19 in guest is reserved for + * HMA ACPI emulation. + */ +#define HMAM_ACPI_IO_BASE 0x0a19 +#define HMAM_ACPI_IO_LEN 4 + +#define HMAM_ACPI_MEM_ADDR "HMTA" +#define HMAM_MEMORY "HRAM" +#define HMAM_IOPORT "HPIO" + +#define HMAM_NOTIFY "NTFI" +#define HMAM_OUT_BUF_SIZE "RLEN" +#define HMAM_OUT_BUF "ODAT" + +#define HMAM_RHMA_STATUS "RSTA" +#define HMA_COMMON_METHOD "HMAC" +#define HMAM_OFFSET "OFFT" + +#define HMAM_RET_STATUS_SUCCESS 0 /* Success */ +#define HMAM_RET_STATUS_UNSUPPORT 1 /* Not Supported */ +#define HMAM_RET_STATUS_INVALID 2 /* Invalid Input Parameters */ +#define HMAM_RET_STATUS_HMA_CHANGED 0x100 /* HMA Changed */ + +/* + * HmatHmaBuffer: + * @hma: HMA buffer with the updated HMAT. It is updated when + * the memory device is plugged or unplugged. + * @dirty: It allows OSPM to detect changes and restart read if there is a= ny. + */ +struct HmatHmaBuffer { + GArray *hma; + bool dirty; +}; +typedef struct HmatHmaBuffer HmatHmaBuffer; + +struct AcpiHmaState { + /* detect if HMA support is enabled. */ + bool is_enabled; + + /* the data of the fw_cfg file HMAM_MEM_FILE. */ + GArray *hmam_mem; + + HmatHmaBuffer hma_buf; + + /* the IO region used by OSPM to transfer control to QEMU. 
*/ + MemoryRegion io_mr; +}; +typedef struct AcpiHmaState AcpiHmaState; + +struct HmatHmamIn { + /* the offset in the _HMA buffer */ + uint32_t offset; +} QEMU_PACKED; +typedef struct HmatHmamIn HmatHmamIn; + +struct HmatHmamOut { + /* the size of buffer filled by QEMU. */ + uint32_t len; + uint32_t ret_status; /* return status code. */ + uint8_t data[4088]; +} QEMU_PACKED; +typedef struct HmatHmamOut HmatHmamOut; + extern struct numa_hmat_lb_info *hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPE= S]; extern struct numa_hmat_cache_info *hmat_cache_info[MAX_NODES][MAX_HMAT_CACHE_LEVEL + 1]; =20 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker, MachineState *machine); +void hmat_build_aml(Aml *dsdt); +void hmat_init_acpi_state(AcpiHmaState *state, MemoryRegion *io, + FWCfgState *fw_cfg, Object *owner); +void hmat_update(PCMachineState *pcms); =20 #endif diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index a93d437175..569132f3ab 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1845,6 +1845,8 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, build_q35_pci0_int(dsdt); } =20 + hmat_build_aml(dsdt); + if (pcmc->legacy_cpu_hotplug) { build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base); } else { diff --git a/hw/i386/pc.c b/hw/i386/pc.c index 4952feb476..9afed44139 100644 --- a/hw/i386/pc.c +++ b/hw/i386/pc.c @@ -2401,6 +2401,8 @@ static void pc_memory_plug(HotplugHandler *hotplug_de= v, nvdimm_plug(&pcms->acpi_nvdimm_state); } =20 + hmat_update(pcms); + hhc =3D HOTPLUG_HANDLER_GET_CLASS(pcms->acpi_dev); hhc->plug(HOTPLUG_HANDLER(pcms->acpi_dev), dev, &error_abort); out: diff --git a/hw/i386/pc_piix.c b/hw/i386/pc_piix.c index ed6984638e..38d7a758ef 100644 --- a/hw/i386/pc_piix.c +++ b/hw/i386/pc_piix.c @@ -301,6 +301,9 @@ static void pc_init1(MachineState *machine, nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io, pcms->fw_cfg, OBJECT(pcms)); } + + hmat_init_acpi_state(&pcms->acpi_hma_state, system_io, + pcms->fw_cfg, 
OBJECT(pcms));
     }
 
 /* Looking for a pc_compat_2_4() function? It doesn't exist.
diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index b7b7959934..e819c3b2f6 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -333,6 +333,9 @@ static void pc_q35_init(MachineState *machine)
         nvdimm_init_acpi_state(&pcms->acpi_nvdimm_state, system_io,
                                pcms->fw_cfg, OBJECT(pcms));
     }
+
+    hmat_init_acpi_state(&pcms->acpi_hma_state, system_io,
+                         pcms->fw_cfg, OBJECT(pcms));
 }
 
 #define DEFINE_Q35_MACHINE(suffix, name, compatfn, optionfn) \
diff --git a/include/hw/i386/pc.h b/include/hw/i386/pc.h
index 84720bede9..800e9cac1d 100644
--- a/include/hw/i386/pc.h
+++ b/include/hw/i386/pc.h
@@ -16,6 +16,7 @@
 #include "hw/mem/pc-dimm.h"
 #include "hw/mem/nvdimm.h"
 #include "hw/acpi/acpi_dev_interface.h"
+#include "hw/acpi/hmat.h"
 
 #define HPET_INTCAP "hpet-intcap"
 
@@ -46,6 +47,7 @@ struct PCMachineState {
     OnOffAuto smm;
 
     AcpiNVDIMMState acpi_nvdimm_state;
+    AcpiHmaState acpi_hma_state;
 
     bool acpi_build_enabled;
     bool smbus_enabled;
-- 
2.17.1

From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:49 +0800
Message-Id: <20190111153451.14304-8-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 7/9] hmat acpi: fix some coding style and small issues

Per Igor's and Eric's comments, fix some small issues in the v1 patch:
 - update the version number in qapi/misc.json
 - include the expansion of the acronym HMAT in qapi/misc.json
 - correct spelling mistakes in qapi/misc.json and qemu-options.hx
 - fix the comment style in hw/i386/acpi-build.c and hw/acpi/hmat.h
 - remove some unnecessary header files in hw/acpi/hmat.c
 - use hardcoded numbers from the spec to generate the Memory Subsystem
   Address Range Structure in hw/acpi/hmat.c
 - drop the structs AcpiHmat and AcpiHmatSpaRange in hw/acpi/hmat.h

Reviewed-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c       | 39 +++++++++++++++++----------------------
 hw/acpi/hmat.h       | 26 ++++----------------------
 hw/i386/acpi-build.c |  6 ++++--
 qapi/misc.json       | 44 +++++++++++++++++++++++---------------------
 qemu-options.hx      |  2 +-
 5 files changed, 49 insertions(+), 68 deletions(-)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index cf17c0ae4f..e8ba9250e9 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -22,17 +22,12 @@
  * License along with this library; if not, see
  */
 
-#include "unistd.h"
-#include "fcntl.h"
 #include "qemu/osdep.h"
 #include "sysemu/numa.h"
 #include "hw/i386/pc.h"
 #include "hw/i386/acpi-build.h"
-#include "hw/acpi/acpi.h"
 #include "hw/acpi/hmat.h"
-#include "hw/acpi/aml-build.h"
 #include "hw/nvram/fw_cfg.h"
-#include "hw/acpi/bios-linker-loader.h"
 
 struct numa_hmat_lb_info
*hmat_lb_info[HMAT_LB_LEVELS][HMAT_LB_TYPES] =3D = {0}; struct numa_hmat_cache_info @@ -42,7 +37,7 @@ static uint32_t initiator_pxm[MAX_NODES], target_pxm[MAX_= NODES]; static uint32_t num_initiator, num_target; =20 /* Build Memory Subsystem Address Range Structure */ -static void hmat_build_spa_info(GArray *table_data, +static void build_hmat_spa(GArray *table_data, uint64_t base, uint64_t length, int node) { uint16_t flags =3D 0; @@ -54,27 +49,27 @@ static void hmat_build_spa_info(GArray *table_data, flags |=3D HMAT_SPA_MEM_VALID; } =20 + /* Memory Subsystem Address Range Structure */ /* Type */ - build_append_int_noprefix(table_data, ACPI_HMAT_SPA, sizeof(uint16_t)); - /* Reserved0 */ - build_append_int_noprefix(table_data, 0, sizeof(uint16_t)); + build_append_int_noprefix(table_data, 0, 2); + /* Reserved */ + build_append_int_noprefix(table_data, 0, 2); /* Length */ - build_append_int_noprefix(table_data, sizeof(AcpiHmatSpaRange), - sizeof(uint32_t)); + build_append_int_noprefix(table_data, 40, 4); /* Flags */ - build_append_int_noprefix(table_data, flags, sizeof(uint16_t)); - /* Reserved1 */ - build_append_int_noprefix(table_data, 0, sizeof(uint16_t)); + build_append_int_noprefix(table_data, flags, 2); + /* Reserved */ + build_append_int_noprefix(table_data, 0, 2); /* Process Proximity Domain */ - build_append_int_noprefix(table_data, node, sizeof(uint32_t)); + build_append_int_noprefix(table_data, node, 4); /* Memory Proximity Domain */ - build_append_int_noprefix(table_data, node, sizeof(uint32_t)); - /* Reserved2 */ - build_append_int_noprefix(table_data, 0, sizeof(uint32_t)); + build_append_int_noprefix(table_data, node, 4); + /* Reserved */ + build_append_int_noprefix(table_data, 0, 4); /* System Physical Address Range Base */ - build_append_int_noprefix(table_data, base, sizeof(uint64_t)); + build_append_int_noprefix(table_data, base, 8); /* System Physical Address Range Length */ - build_append_int_noprefix(table_data, length, sizeof(uint64_t)); + 
    build_append_int_noprefix(table_data, length, 8);
 }
 
 static int pc_dimm_device_list(Object *obj, void *opaque)
@@ -401,7 +396,7 @@ static void hmat_build_hma_buffer(PCMachineState *pcms)
     g_array_free(hma_buf->hma, true);
 
     hma_buf->hma = g_array_new(false, true /* clear */, 1);
-    acpi_data_push(hma_buf->hma, sizeof(AcpiHmat));
+    acpi_data_push(hma_buf->hma, 40);
 
     /* build HMAT in a given buffer. */
     hmat_build_hma(hma_buf->hma, pcms);
@@ -543,7 +538,7 @@ void hmat_build_acpi(GArray *table_data, BIOSLinker *linker,
     uint64_t hmat_start, hmat_len;
 
     hmat_start = table_data->len;
-    acpi_data_push(table_data, sizeof(AcpiHmat));
+    acpi_data_push(table_data, 40);
 
     hmat_build_hma(table_data, pcms);
     hmat_len = table_data->len - hmat_start;
diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h
index dd6948f738..2c080a51b8 100644
--- a/hw/acpi/hmat.h
+++ b/hw/acpi/hmat.h
@@ -78,27 +78,6 @@ enum {
 #define HMAT_LB_LEVELS (HMAT_LB_MEM_CACHE_3RD_LEVEL + 1)
 #define HMAT_LB_TYPES (HMAT_LB_DATA_WRITE_BANDWIDTH + 1)
 
-/*
- * HMAT (Heterogeneous Memory Attributes Table)
- */
-struct AcpiHmat {
-    ACPI_TABLE_HEADER_DEF
-    uint32_t reserved;
-} QEMU_PACKED;
-typedef struct AcpiHmat AcpiHmat;
-
-struct AcpiHmatSpaRange {
-    ACPI_HMAT_SUB_HEADER_DEF
-    uint16_t flags;
-    uint16_t reserved1;
-    uint32_t proc_proximity;
-    uint32_t mem_proximity;
-    uint32_t reserved2;
-    uint64_t spa_base;
-    uint64_t spa_length;
-} QEMU_PACKED;
-typedef struct AcpiHmatSpaRange AcpiHmatSpaRange;
-
 struct AcpiHmatLBInfo {
     ACPI_HMAT_SUB_HEADER_DEF
     uint8_t flags;
@@ -166,7 +145,10 @@ struct numa_hmat_cache_info {
     uint32_t mem_proximity;
     /* Size of memory side cache in bytes. */
     uint64_t size;
-    /* Total cache levels for this memory proximity domain. */
+    /*
+     * Total cache levels for this memory
+     * proximity domain.
+     */
     uint8_t total_levels;
     /* Cache level described in this structure.
*/ uint8_t level; diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 569132f3ab..729e67e829 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -120,7 +120,8 @@ typedef struct AcpiBuildPciBusHotplugState { bool pcihp_bridge_en; } AcpiBuildPciBusHotplugState; =20 -/* The memory contains at least one hole +/* + * The memory contains at least one hole * from 640k-1M and possibly another one from 3.5G-4G. * So far, the number of memory ranges is up to 2 * more than the number of numa nodes. @@ -2267,7 +2268,8 @@ void build_mem_ranges(PCMachineState *pcms) uint64_t mem_len, mem_base, next_base; int i; =20 - /* the memory map is a bit tricky, it contains at least one hole + /* + * the memory map is a bit tricky, it contains at least one hole * from 640k-1M and possibly another one from 3.5G-4G. */ mem_ranges_number =3D 0; diff --git a/qapi/misc.json b/qapi/misc.json index 0887a3791a..dc06190168 100644 --- a/qapi/misc.json +++ b/qapi/misc.json @@ -2746,9 +2746,9 @@ # # @cpu: property based CPU(s) to node mapping (Since: 2.10) # -# @hmat-lb: memory latency and bandwidth information (Since: 2.13) +# @hmat-lb: memory latency and bandwidth information (Since: 3.10) # -# @hmat-cache: memory side cache information (Since: 2.13) +# @hmat-cache: memory side cache information (Since: 3.10) # # Since: 2.1 ## @@ -2837,43 +2837,45 @@ # @HmatLBMemoryHierarchy: # # The memory hierarchy in the System Locality Latency -# and Bandwidth Information Structure of HMAT +# and Bandwidth Information Structure of HMAT (Heterogeneous +# Memory Attribute Table) # # @memory: the structure represents the memory performance # # @last-level: last level memory of memory side cached memory # -# @1st-level: first level memory of memory side cached memory +# @first-level: first level memory of memory side cached memory # -# @2nd-level: second level memory of memory side cached memory +# @second-level: second level memory of memory side cached memory # -# @3rd-level: third level 
memory of memory side cached memory +# @third-level: third level memory of memory side cached memory # -# Since: 2.13 +# Since: 3.10 ## { 'enum': 'HmatLBMemoryHierarchy', - 'data': [ 'memory', 'last-level', '1st-level', - '2nd-level', '3rd-level' ] } + 'data': [ 'memory', 'last-level', 'first-level', + 'second-level', 'third-level' ] } =20 ## # @HmatLBDataType: # # Data type in the System Locality Latency -# and Bandwidth Information Structure of HMAT +# and Bandwidth Information Structure of HMAT (Heterogeneous +# Memory Attribute Table) # -# @access-latency: access latency +# @access-latency: access latency (nanoseconds) # -# @read-latency: read latency +# @read-latency: read latency (nanoseconds) # -# @write-latency: write latency +# @write-latency: write latency (nanoseconds) # -# @access-bandwidth: access bandwitch +# @access-bandwidth: access bandwidth (MB/s) # -# @read-bandwidth: read bandwidth +# @read-bandwidth: read bandwidth (MB/s) # -# @write-bandwidth: write bandwidth +# @write-bandwidth: write bandwidth (MB/s) # -# Since: 2.13 +# Since: 3.10 ## { 'enum': 'HmatLBDataType', 'data': [ 'access-latency', 'read-latency', 'write-latency', @@ -2905,7 +2907,7 @@ # @bandwidth: the value of bandwidth based on Base Unit between # @initiator and @target proximity domain. # -# Since: 2.13 +# Since: 3.10 ## { 'struct': 'NumaHmatLBOptions', 'data': { @@ -2930,7 +2932,7 @@ # # @complex: Complex Cache Indexing (implementation specific) # -# Since: 2.13 +# Since: 3.10 ## { 'enum': 'HmatCacheAssociativity', 'data': [ 'none', 'direct', 'complex' ] } @@ -2947,7 +2949,7 @@ # # @write-through: Write Through (WT) # -# Since: 2.13 +# Since: 3.10 ## { 'enum': 'HmatCacheWritePolicy', 'data': [ 'none', 'write-back', 'write-through' ] } @@ -2971,7 +2973,7 @@ # # @line: the cache Line size in bytes. 
# -# Since: 2.13 +# Since: 3.10 ## { 'struct': 'NumaHmatCacheOptions', 'data': { diff --git a/qemu-options.hx b/qemu-options.hx index 88f078c846..99363b7144 100644 --- a/qemu-options.hx +++ b/qemu-options.hx @@ -246,7 +246,7 @@ For example: -numa node,mem=3D1G,cpus=3D2,nodeid=3D1 \ -numa node,mem=3D1G,nodeid=3D2 \ -numa hmat-lb,initiator=3D0,target=3D1,hierarchy=3Dmemory,data-type=3Dacce= ss-latency,base-lat=3D10,base-bw=3D20,latency=3D10,bandwidth=3D10 \ --numa hmat-lb,initiator=3D1,target=3D2,hierarchy=3D1st-level,data-type=3Da= ccess-latency,base-bw=3D10,bandwidth=3D20 +-numa hmat-lb,initiator=3D1,target=3D2,hierarchy=3Dfirst-level,data-type= =3Daccess-latency,base-bw=3D10,bandwidth=3D20 @end example =20 When the processors in NUMA node 0 access memory in NUMA node 1, --=20 2.17.1 From nobody Fri Apr 19 05:21:21 2024 Delivered-To: importer@patchew.org Received-SPF: pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) client-ip=209.51.188.17; envelope-from=qemu-devel-bounces+importer=patchew.org@nongnu.org; helo=lists.gnu.org; Authentication-Results: mx.zohomail.com; spf=pass (zoho.com: domain of gnu.org designates 209.51.188.17 as permitted sender) smtp.mailfrom=qemu-devel-bounces+importer=patchew.org@nongnu.org; dmarc=fail(p=none dis=none) header.from=intel.com Return-Path: Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) by mx.zohomail.com with SMTPS id 1547221325554393.50134005480766; Fri, 11 Jan 2019 07:42:05 -0800 (PST) Received: from localhost ([127.0.0.1]:54928 helo=lists.gnu.org) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ghywY-0004VW-TN for importer@patchew.org; Fri, 11 Jan 2019 10:41:58 -0500 Received: from eggs.gnu.org ([209.51.188.92]:38231) by lists.gnu.org with esmtp (Exim 4.71) (envelope-from ) id 1ghyrZ-0000fW-PJ for qemu-devel@nongnu.org; Fri, 11 Jan 2019 10:36:52 -0500 Received: from Debian-exim by eggs.gnu.org with spam-scanned (Exim 4.71) (envelope-from ) id 1ghyrX-0003tv-Sx for 
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:50 +0800
Message-Id: <20190111153451.14304-9-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 8/9] hmat acpi: move some function inside of the caller

Per Igor's comments, these functions are used only once, so move their
bodies into the callers.

Reviewed-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c | 93 ++++++++++++++++++++------------------------------
 1 file changed, 37 insertions(+), 56 deletions(-)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c
index e8ba9250e9..4523e98ef1 100644
--- a/hw/acpi/hmat.c
+++ b/hw/acpi/hmat.c
@@ -84,22 +84,41 @@ static int pc_dimm_device_list(Object *obj, void *opaque)
     return 0;
 }
 
+static void classify_proximity_domains(void)
+{
+    int node;
+
+    for (node = 0; node < nb_numa_nodes; node++) {
+        if (numa_info[node].is_initiator) {
+            initiator_pxm[num_initiator++] = node;
+        }
+        if (numa_info[node].is_target) {
+            target_pxm[num_target++] = node;
+        }
+    }
+}
+
+static void hmat_build_hma(GArray *hma, PCMachineState *pcms)
+{
 /*
  * The Proximity Domain of System Physical Address ranges defined
  * in the HMAT, NFIT and SRAT tables shall match each other.
*/ -static void hmat_build_spa(GArray *table_data, PCMachineState *pcms) -{ + GSList *device_list =3D NULL; + AcpiHmatLBInfo *hmat_lb; + AcpiHmatCacheInfo *hmat_cache; + struct numa_hmat_lb_info *numa_hmat_lb; + struct numa_hmat_cache_info *numa_hmat_cache; uint64_t mem_base, mem_len; - int i; + int i, j, hrchy, type, level; =20 if (pcms->numa_nodes && !mem_ranges_number) { build_mem_ranges(pcms); } =20 for (i =3D 0; i < mem_ranges_number; i++) { - hmat_build_spa_info(table_data, mem_ranges[i].base, + build_hmat_spa(hma, mem_ranges[i].base, mem_ranges[i].length, mem_ranges[i].node); } =20 @@ -113,30 +132,10 @@ static void hmat_build_spa(GArray *table_data, PCMach= ineState *pcms) mem_len =3D object_property_get_uint(OBJECT(dimm), PC_DIMM_SIZE_PR= OP, NULL); i =3D object_property_get_uint(OBJECT(dimm), PC_DIMM_NODE_PROP, NU= LL); - hmat_build_spa_info(table_data, mem_base, mem_len, i); - } -} - -static void classify_proximity_domains(void) -{ - int node; - - for (node =3D 0; node < nb_numa_nodes; node++) { - if (numa_info[node].is_initiator) { - initiator_pxm[num_initiator++] =3D node; - } - if (numa_info[node].is_target) { - target_pxm[num_target++] =3D node; - } + build_hmat_spa(hma, mem_base, mem_len, i); } -} - -static void hmat_build_lb(GArray *table_data) -{ - AcpiHmatLBInfo *hmat_lb; - struct numa_hmat_lb_info *numa_hmat_lb; - int i, j, hrchy, type; =20 + /* Build HMAT System Locality Latency and Bandwidth Information. 
*/ if (!num_initiator && !num_target) { classify_proximity_domains(); } @@ -154,8 +153,8 @@ static void hmat_build_lb(GArray *table_data) uint32_t size; uint8_t m, n; =20 - start =3D table_data->len; - hmat_lb =3D acpi_data_push(table_data, sizeof(*hmat_lb)); + start =3D hma->len; + hmat_lb =3D acpi_data_push(hma, sizeof(*hmat_lb)); =20 hmat_lb->type =3D cpu_to_le16(ACPI_HMAT_LB_INFO); hmat_lb->flags =3D numa_hmat_lb->hierarchy; @@ -174,19 +173,19 @@ static void hmat_build_lb(GArray *table_data) =20 /* the initiator proximity domain list */ for (i =3D 0; i < num_initiator; i++) { - list_entry =3D acpi_data_push(table_data, sizeof(uint3= 2_t)); + list_entry =3D acpi_data_push(hma, sizeof(uint32_t)); *list_entry =3D cpu_to_le32(initiator_pxm[i]); } =20 /* the target proximity domain list */ for (i =3D 0; i < num_target; i++) { - list_entry =3D acpi_data_push(table_data, sizeof(uint3= 2_t)); + list_entry =3D acpi_data_push(hma, sizeof(uint32_t)); *list_entry =3D cpu_to_le32(target_pxm[i]); } =20 /* latency or bandwidth entries */ size =3D sizeof(uint16_t) * num_initiator * num_target; - entry_start =3D acpi_data_push(table_data, size); + entry_start =3D acpi_data_push(hma, size); =20 for (i =3D 0; i < num_initiator; i++) { m =3D initiator_pxm[i]; @@ -200,26 +199,20 @@ static void hmat_build_lb(GArray *table_data) } } } - hmat_lb =3D (AcpiHmatLBInfo *)(table_data->data + start); - hmat_lb->length =3D cpu_to_le16(table_data->len - start); + hmat_lb =3D (AcpiHmatLBInfo *)(hma->data + start); + hmat_lb->length =3D cpu_to_le16(hma->len - start); } } } -} - -static void hmat_build_cache(GArray *table_data) -{ - AcpiHmatCacheInfo *hmat_cache; - struct numa_hmat_cache_info *numa_hmat_cache; - int i, level; =20 + /* Build HMAT Memory Side Cache Information. 
*/ for (i =3D 0; i < nb_numa_nodes; i++) { for (level =3D 0; level <=3D MAX_HMAT_CACHE_LEVEL; level++) { numa_hmat_cache =3D hmat_cache_info[i][level]; if (numa_hmat_cache) { - uint64_t start =3D table_data->len; + uint64_t start =3D hma->len; =20 - hmat_cache =3D acpi_data_push(table_data, sizeof(*hmat_cac= he)); + hmat_cache =3D acpi_data_push(hma, sizeof(*hmat_cache)); hmat_cache->length =3D cpu_to_le32(sizeof(*hmat_cache)); hmat_cache->type =3D cpu_to_le16(ACPI_HMAT_CACHE_INFO); hmat_cache->mem_proximity =3D @@ -242,10 +235,10 @@ static void hmat_build_cache(GArray *table_data) int size; =20 size =3D hmat_cache->num_smbios_handles * sizeof(uint1= 6_t); - smbios_handles =3D acpi_data_push(table_data, size); + smbios_handles =3D acpi_data_push(hma, size); =20 hmat_cache =3D (AcpiHmatCacheInfo *) - (table_data->data + start); + (hma->data + start); hmat_cache->length +=3D size; =20 /* TBD: set smbios handles */ @@ -258,18 +251,6 @@ static void hmat_build_cache(GArray *table_data) } } =20 -static void hmat_build_hma(GArray *hma, PCMachineState *pcms) -{ - /* Build HMAT Memory Subsystem Address Range. */ - hmat_build_spa(hma, pcms); - - /* Build HMAT System Locality Latency and Bandwidth Information. */ - hmat_build_lb(hma); - - /* Build HMAT Memory Side Cache Information. 
*/ - hmat_build_cache(hma); -} - static uint64_t hmat_hma_method_read(void *opaque, hwaddr addr, unsigned size) { --=20 2.17.1
From nobody Fri Apr 19 05:21:21 2024
From: Tao Xu
To: mst@redhat.com, imammedo@redhat.com
Date: Fri, 11 Jan 2019 23:34:51 +0800
Message-Id: <20190111153451.14304-10-tao3.xu@intel.com>
In-Reply-To: <20190111153451.14304-1-tao3.xu@intel.com>
References: <20190111153451.14304-1-tao3.xu@intel.com>
Subject: [Qemu-devel] [PATCH v2 9/9] acpi: rewrite the _FIT method to use it in _HMA method
Cc: ehabkost@redhat.com, jingqi.liu@intel.com, tao3.xu@intel.com, qemu-devel@nongnu.org, pbonzini@redhat.com, danmei.wei@intel.com, rth@twiddle.net
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"

Per Igor's comment, rewrite the NFIT code to also build the _HMA method. We rewrite the functions nvdimm_build_common_dsm(Aml *dev) and nvdimm_build_fit(Aml *dev) in hw/acpi/nvdimm.c so that method_number can be passed in to decide whether the _FIT or the _HMA method is generated.

Reviewed-by: Liu Jingqi
Signed-off-by: Tao Xu
---
 hw/acpi/hmat.c          | 209 +--------------------
 hw/acpi/hmat.h          |   2 -
 hw/acpi/nvdimm.c        | 389 ++++++++++++++++++++++++++--------------
 hw/i386/acpi-build.c    |   2 +-
 include/hw/mem/nvdimm.h |  11 ++
 5 files changed, 270 insertions(+), 343 deletions(-)

diff --git a/hw/acpi/hmat.c b/hw/acpi/hmat.c index 4523e98ef1..37ece80101 100644 --- a/hw/acpi/hmat.c +++ b/hw/acpi/hmat.c @@ -100,10 +100,10 @@ static void classify_proximity_domains(void) =20 static void hmat_build_hma(GArray *hma, PCMachineState *pcms) { -/* - * The Proximity Domain of System Physical Address ranges defined - * in the HMAT, NFIT and SRAT tables shall match each other.
- */ + /* + * The Proximity Domain of System Physical Address ranges defined + * in the HMAT, NFIT and SRAT tables shall match each other. + */ =20 GSList *device_list =3D NULL; AcpiHmatLBInfo *hmat_lb; @@ -386,112 +386,6 @@ static void hmat_build_hma_buffer(PCMachineState *pcm= s) hma_buf->dirty =3D true; } =20 -static void hmat_build_common_aml(Aml *dev) -{ - Aml *method, *ifctx, *hmam_mem; - Aml *unsupport; - Aml *pckg, *pckg_index, *pckg_buf, *field; - Aml *hmam_out_buf, *hmam_out_buf_size; - uint8_t byte_list[1]; - - method =3D aml_method(HMA_COMMON_METHOD, 1, AML_SERIALIZED); - hmam_mem =3D aml_local(6); - hmam_out_buf =3D aml_local(7); - - aml_append(method, aml_store(aml_name(HMAM_ACPI_MEM_ADDR), hmam_mem)); - - /* map _HMA memory and IO into ACPI namespace. */ - aml_append(method, aml_operation_region(HMAM_IOPORT, AML_SYSTEM_IO, - aml_int(HMAM_ACPI_IO_BASE), HMAM_ACPI_IO_LEN)); - aml_append(method, aml_operation_region(HMAM_MEMORY, - AML_SYSTEM_MEMORY, hmam_mem, HMAM_MEMORY_SIZE)); - - /* - * _HMAC notifier: - * HMAM_NOTIFY: write the address of DSM memory and notify QEMU to - * emulate the access. - * - * It is the IO port so that accessing them will cause VM-exit, the - * control will be transferred to QEMU. - */ - field =3D aml_field(HMAM_IOPORT, AML_DWORD_ACC, AML_NOLOCK, - AML_PRESERVE); - aml_append(field, aml_named_field(HMAM_NOTIFY, - sizeof(uint32_t) * BITS_PER_BYTE)); - aml_append(method, field); - - /* - * _HMAC input: - * HMAM_OFFSET: store the current offset of _HMA buffer. - * - * They are RAM mapping on host so that these accesses never cause VME= xit. - */ - field =3D aml_field(HMAM_MEMORY, AML_DWORD_ACC, AML_NOLOCK, - AML_PRESERVE); - aml_append(field, aml_named_field(HMAM_OFFSET, - sizeof(typeof_field(HmatHmamIn, offset)) * BITS_PER_BYTE)); - aml_append(method, field); - - /* - * _HMAC output: - * HMAM_OUT_BUF_SIZE: the size of the buffer filled by QEMU. - * HMAM_OUT_BUF: the buffer QEMU uses to store the result. 
- * - * Since the page is reused by both input and out, the input data - * will be lost after storing new result into ODAT so we should fetch - * all the input data before writing the result. - */ - field =3D aml_field(HMAM_MEMORY, AML_DWORD_ACC, AML_NOLOCK, - AML_PRESERVE); - aml_append(field, aml_named_field(HMAM_OUT_BUF_SIZE, - sizeof(typeof_field(HmatHmamOut, len)) * BITS_PER_BYTE)); - aml_append(field, aml_named_field(HMAM_OUT_BUF, - (sizeof(HmatHmamOut) - sizeof(uint32_t)) * BITS_PER_BYTE)); - aml_append(method, field); - - /* - * do not support any method if HMA memory address has not been - * patched. - */ - unsupport =3D aml_if(aml_equal(hmam_mem, aml_int(0x0))); - byte_list[0] =3D HMAM_RET_STATUS_UNSUPPORT; - aml_append(unsupport, aml_return(aml_buffer(1, byte_list))); - aml_append(method, unsupport); - - /* The parameter (Arg0) of _HMAC is a package which contains a buffer.= */ - pckg =3D aml_arg(0); - ifctx =3D aml_if(aml_and(aml_equal(aml_object_type(pckg), - aml_int(4 /* Package */)) /* It is a Package? */, - aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element */, - NULL)); - - pckg_index =3D aml_local(2); - pckg_buf =3D aml_local(3); - aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_index)); - aml_append(ifctx, aml_store(aml_derefof(pckg_index), pckg_buf)); - aml_append(ifctx, aml_store(pckg_buf, aml_name(HMAM_OFFSET))); - aml_append(method, ifctx); - - /* - * tell QEMU about the real address of HMA memory, then QEMU - * gets the control and fills the result in _HMAC memory. - */ - aml_append(method, aml_store(hmam_mem, aml_name(HMAM_NOTIFY))); - - hmam_out_buf_size =3D aml_local(1); - /* RLEN is not included in the payload returned to guest. 
*/ - aml_append(method, aml_subtract(aml_name(HMAM_OUT_BUF_SIZE), - aml_int(4), hmam_out_buf_size)); - aml_append(method, aml_store(aml_shiftleft(hmam_out_buf_size, aml_int(= 3)), - hmam_out_buf_size)); - aml_append(method, aml_create_field(aml_name(HMAM_OUT_BUF), - aml_int(0), hmam_out_buf_size, "OBUF")); - aml_append(method, aml_concatenate(aml_buffer(0, NULL), aml_name("OBUF= "), - hmam_out_buf)); - aml_append(method, aml_return(hmam_out_buf)); - aml_append(dev, method); -} - void hmat_init_acpi_state(AcpiHmaState *state, MemoryRegion *io, FWCfgState *fw_cfg, Object *owner) { @@ -528,98 +422,3 @@ void hmat_build_acpi(GArray *table_data, BIOSLinker *l= inker, (void *)(table_data->data + hmat_start), "HMAT", hmat_len, 1, NULL, NULL); } - -void hmat_build_aml(Aml *dev) -{ - Aml *method, *pkg, *buf, *buf_size, *offset, *call_result; - Aml *whilectx, *ifcond, *ifctx, *elsectx, *hma; - - hmat_build_common_aml(dev); - - buf =3D aml_local(0); - buf_size =3D aml_local(1); - hma =3D aml_local(2); - - aml_append(dev, aml_name_decl(HMAM_RHMA_STATUS, aml_int(0))); - - /* build helper function, RHMA. */ - method =3D aml_method("RHMA", 1, AML_SERIALIZED); - aml_append(method, aml_name_decl("OFST", aml_int(0))); - - /* prepare input package. */ - pkg =3D aml_package(1); - aml_append(method, aml_store(aml_arg(0), aml_name("OFST"))); - aml_append(pkg, aml_name("OFST")); - - /* call Read HMA function. */ - call_result =3D aml_call1(HMA_COMMON_METHOD, pkg); - aml_append(method, aml_store(call_result, buf)); - - /* handle _HMAC result. */ - aml_append(method, aml_create_dword_field(buf, - aml_int(0) /* offset at byte 0 */, "STAU")); - - aml_append(method, aml_store(aml_name("STAU"), - aml_name(HMAM_RHMA_STATUS))); - - /* if something is wrong during _HMAC. 
*/ - ifcond =3D aml_equal(aml_int(HMAM_RET_STATUS_SUCCESS), - aml_name("STAU")); - ifctx =3D aml_if(aml_lnot(ifcond)); - aml_append(ifctx, aml_return(aml_buffer(0, NULL))); - aml_append(method, ifctx); - - aml_append(method, aml_store(aml_sizeof(buf), buf_size)); - aml_append(method, aml_subtract(buf_size, - aml_int(4) /* the size of "STAU" */, - buf_size)); - - /* if we read the end of hma. */ - ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); - aml_append(ifctx, aml_return(aml_buffer(0, NULL))); - aml_append(method, ifctx); - - aml_append(method, aml_create_field(buf, - aml_int(4 * BITS_PER_BYTE), /* offset at byte = 4.*/ - aml_shiftleft(buf_size, aml_int(3)), "BUFF")); - aml_append(method, aml_return(aml_name("BUFF"))); - aml_append(dev, method); - - /* build _HMA. */ - method =3D aml_method("_HMA", 0, AML_SERIALIZED); - offset =3D aml_local(3); - - aml_append(method, aml_store(aml_buffer(0, NULL), hma)); - aml_append(method, aml_store(aml_int(0), offset)); - - whilectx =3D aml_while(aml_int(1)); - aml_append(whilectx, aml_store(aml_call1("RHMA", offset), buf)); - aml_append(whilectx, aml_store(aml_sizeof(buf), buf_size)); - - /* - * if hma buffer was changed during RHMA, read from the beginning - * again. - */ - ifctx =3D aml_if(aml_equal(aml_name(HMAM_RHMA_STATUS), - aml_int(HMAM_RET_STATUS_HMA_CHANGED))); - aml_append(ifctx, aml_store(aml_buffer(0, NULL), hma)); - aml_append(ifctx, aml_store(aml_int(0), offset)); - aml_append(whilectx, ifctx); - - elsectx =3D aml_else(); - - /* finish hma read if no data is read out. */ - ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); - aml_append(ifctx, aml_return(hma)); - aml_append(elsectx, ifctx); - - /* update the offset. */ - aml_append(elsectx, aml_add(offset, buf_size, offset)); - /* append the data we read out to the hma buffer. 
*/ - aml_append(elsectx, aml_concatenate(hma, buf, hma)); - aml_append(whilectx, elsectx); - aml_append(method, whilectx); - - aml_append(dev, method); -} - diff --git a/hw/acpi/hmat.h b/hw/acpi/hmat.h index 2c080a51b8..dd57678816 100644 --- a/hw/acpi/hmat.h +++ b/hw/acpi/hmat.h @@ -31,7 +31,6 @@ #include "hw/acpi/bios-linker-loader.h" #include "hw/acpi/aml-build.h" =20 -#define ACPI_HMAT_SPA 0 #define ACPI_HMAT_LB_INFO 1 #define ACPI_HMAT_CACHE_INFO 2 =20 @@ -238,7 +237,6 @@ extern struct numa_hmat_cache_info =20 void hmat_build_acpi(GArray *table_data, BIOSLinker *linker, MachineState *machine); -void hmat_build_aml(Aml *dsdt); void hmat_init_acpi_state(AcpiHmaState *state, MemoryRegion *io, FWCfgState *fw_cfg, Object *owner); void hmat_update(PCMachineState *pcms); diff --git a/hw/acpi/nvdimm.c b/hw/acpi/nvdimm.c index e53b2cb681..795236bb1b 100644 --- a/hw/acpi/nvdimm.c +++ b/hw/acpi/nvdimm.c @@ -32,6 +32,7 @@ #include "hw/acpi/bios-linker-loader.h" #include "hw/nvram/fw_cfg.h" #include "hw/mem/nvdimm.h" +#include "hw/acpi/hmat.h" =20 static int nvdimm_device_list(Object *obj, void *opaque) { @@ -959,26 +960,49 @@ void nvdimm_init_acpi_state(AcpiNVDIMMState *state, M= emoryRegion *io, =20 #define NVDIMM_QEMU_RSVD_UUID "648B9CF2-CDA1-4312-8AD9-49C4AF32BD62" =20 -static void nvdimm_build_common_dsm(Aml *dev) +static void nvdimm_build_common_dsm(Aml *dev, uint16_t method_number) { - Aml *method, *ifctx, *function, *handle, *uuid, *dsm_mem, *elsectx2; + Aml *method =3D NULL, *ifctx =3D NULL, *function =3D NULL; + Aml *handle =3D NULL, *uuid =3D NULL, *dsm_mem, *elsectx2; Aml *elsectx, *unsupport, *unpatched, *expected_uuid, *uuid_invalid; - Aml *pckg, *pckg_index, *pckg_buf, *field, *dsm_out_buf, *dsm_out_buf_= size; + Aml *pckg =3D NULL, *pckg_index, *pckg_buf, *field; + Aml *dsm_out_buf, *dsm_out_buf_size; uint8_t byte_list[1]; + uint16_t acpi_io_base =3D 0; + const char *acpi_mem_addr =3D NULL, *ioport =3D NULL, *memory =3D NULL; + const char *aml_offset =3D 
NULL; + + switch (method_number) { + case 0: /* build common dsm in _FIT method */ + acpi_mem_addr =3D NVDIMM_ACPI_MEM_ADDR; + ioport =3D NVDIMM_DSM_IOPORT; + acpi_io_base =3D NVDIMM_ACPI_IO_BASE; + memory =3D NVDIMM_DSM_MEMORY; + aml_offset =3D NVDIMM_DSM_ARG3; + method =3D aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED); + uuid =3D aml_arg(0); + function =3D aml_arg(2); + handle =3D aml_arg(4); + break; + case 1: /* build common dsm in _HMA method */ + acpi_mem_addr =3D HMAM_ACPI_MEM_ADDR; + ioport =3D HMAM_IOPORT; + acpi_io_base =3D HMAM_ACPI_IO_BASE; + memory =3D HMAM_MEMORY; + aml_offset =3D HMAM_OFFSET; + method =3D aml_method(HMA_COMMON_METHOD, 1, AML_SERIALIZED); + break; + } =20 - method =3D aml_method(NVDIMM_COMMON_DSM, 5, AML_SERIALIZED); - uuid =3D aml_arg(0); - function =3D aml_arg(2); - handle =3D aml_arg(4); dsm_mem =3D aml_local(6); dsm_out_buf =3D aml_local(7); =20 - aml_append(method, aml_store(aml_name(NVDIMM_ACPI_MEM_ADDR), dsm_mem)); + aml_append(method, aml_store(aml_name("%s", acpi_mem_addr), dsm_mem)); =20 /* map DSM memory and IO into ACPI namespace. */ - aml_append(method, aml_operation_region(NVDIMM_DSM_IOPORT, AML_SYSTEM_= IO, - aml_int(NVDIMM_ACPI_IO_BASE), NVDIMM_ACPI_IO_LEN)); - aml_append(method, aml_operation_region(NVDIMM_DSM_MEMORY, + aml_append(method, aml_operation_region(ioport, AML_SYSTEM_IO, + aml_int(acpi_io_base), NVDIMM_ACPI_IO_LEN)); + aml_append(method, aml_operation_region(memory, AML_SYSTEM_MEMORY, dsm_mem, sizeof(NvdimmDsmIn))); =20 /* @@ -989,122 +1013,191 @@ static void nvdimm_build_common_dsm(Aml *dev) * It is the IO port so that accessing them will cause VM-exit, the * control will be transferred to QEMU. 
*/ - field =3D aml_field(NVDIMM_DSM_IOPORT, AML_DWORD_ACC, AML_NOLOCK, + field =3D aml_field(ioport, AML_DWORD_ACC, AML_NOLOCK, AML_PRESERVE); aml_append(field, aml_named_field(NVDIMM_DSM_NOTIFY, sizeof(uint32_t) * BITS_PER_BYTE)); aml_append(method, field); =20 - /* - * DSM input: - * NVDIMM_DSM_HANDLE: store device's handle, it's zero if the _DSM call - * happens on NVDIMM Root Device. - * NVDIMM_DSM_REVISION: store the Arg1 of _DSM call. - * NVDIMM_DSM_FUNCTION: store the Arg2 of _DSM call. - * NVDIMM_DSM_ARG3: store the Arg3 of _DSM call which is a Package - * containing function-specific arguments. - * - * They are RAM mapping on host so that these accesses never cause - * VM-EXIT. - */ - field =3D aml_field(NVDIMM_DSM_MEMORY, AML_DWORD_ACC, AML_NOLOCK, - AML_PRESERVE); - aml_append(field, aml_named_field(NVDIMM_DSM_HANDLE, - sizeof(typeof_field(NvdimmDsmIn, handle)) * BITS_PER_BYTE)); - aml_append(field, aml_named_field(NVDIMM_DSM_REVISION, - sizeof(typeof_field(NvdimmDsmIn, revision)) * BITS_PER_BYTE= )); - aml_append(field, aml_named_field(NVDIMM_DSM_FUNCTION, - sizeof(typeof_field(NvdimmDsmIn, function)) * BITS_PER_BYTE= )); - aml_append(field, aml_named_field(NVDIMM_DSM_ARG3, - (sizeof(NvdimmDsmIn) - offsetof(NvdimmDsmIn, arg3)) * BITS_PER_BY= TE)); - aml_append(method, field); + field =3D aml_field(memory, AML_DWORD_ACC, AML_NOLOCK, AML_PRESERVE); + switch (method_number) { + case 0: /* build common dsm in _FIT method */ + /* + * DSM input: + * NVDIMM_DSM_HANDLE: store device's handle, it's zero if + * the _DSM call happens on NVDIMM Root + * Device. + * NVDIMM_DSM_REVISION: store the Arg1 of _DSM call. + * NVDIMM_DSM_FUNCTION: store the Arg2 of _DSM call. + * NVDIMM_DSM_ARG3: store the Arg3 of _DSM call which is a + * Package containing function-specific + * arguments. + * + * They are RAM mapping on host so that these accesses + * never cause VM-EXIT. 
+ */ + aml_append(field, aml_named_field(NVDIMM_DSM_HANDLE, + sizeof(typeof_field(NvdimmDsmIn, handle)) * + BITS_PER_BYTE)); + aml_append(field, aml_named_field(NVDIMM_DSM_REVISION, + sizeof(typeof_field(NvdimmDsmIn, revision)) * + BITS_PER_BYTE)); + aml_append(field, aml_named_field(NVDIMM_DSM_FUNCTION, + sizeof(typeof_field(NvdimmDsmIn, function)) * + BITS_PER_BYTE)); + aml_append(field, aml_named_field(NVDIMM_DSM_ARG3, + (sizeof(NvdimmDsmIn) - offsetof(NvdimmDsmIn, arg3)) * + BITS_PER_BYTE)); + aml_append(method, field); =20 - /* - * DSM output: - * NVDIMM_DSM_OUT_BUF_SIZE: the size of the buffer filled by QEMU. - * NVDIMM_DSM_OUT_BUF: the buffer QEMU uses to store the result. - * - * Since the page is reused by both input and out, the input data - * will be lost after storing new result into ODAT so we should fetch - * all the input data before writing the result. - */ - field =3D aml_field(NVDIMM_DSM_MEMORY, AML_DWORD_ACC, AML_NOLOCK, - AML_PRESERVE); - aml_append(field, aml_named_field(NVDIMM_DSM_OUT_BUF_SIZE, - sizeof(typeof_field(NvdimmDsmOut, len)) * BITS_PER_BYTE)); - aml_append(field, aml_named_field(NVDIMM_DSM_OUT_BUF, - (sizeof(NvdimmDsmOut) - offsetof(NvdimmDsmOut, data)) * BITS_PER_BY= TE)); - aml_append(method, field); + /* + * DSM output: + * NVDIMM_DSM_OUT_BUF_SIZE: the size of the buffer + * filled by QEMU. + * NVDIMM_DSM_OUT_BUF: the buffer QEMU uses to + * store the result. + * + * Since the page is reused by both input and out, + * the input data will be lost after storing new + * result into ODAT so we should fetch all the input + * data before writing the result. 
+ */ + field =3D aml_field(memory, AML_DWORD_ACC, AML_NOLOCK, + AML_PRESERVE); + aml_append(field, aml_named_field(NVDIMM_DSM_OUT_BUF_SIZE, + sizeof(typeof_field(NvdimmDsmOut, len)) * + BITS_PER_BYTE)); + aml_append(field, aml_named_field(NVDIMM_DSM_OUT_BUF, + (sizeof(NvdimmDsmOut) - offsetof(NvdimmDsmOut, data)) * + BITS_PER_BYTE)); + aml_append(method, field); =20 - /* - * do not support any method if DSM memory address has not been - * patched. - */ - unpatched =3D aml_equal(dsm_mem, aml_int(0x0)); + /* + * do not support any method if DSM memory address has not been + * patched. + */ + unpatched =3D aml_equal(dsm_mem, aml_int(0x0)); + + expected_uuid =3D aml_local(0); + + ifctx =3D aml_if(aml_equal(handle, aml_int(0x0))); + aml_append(ifctx, aml_store( + aml_touuid("2F10E7A4-9E91-11E4-89D3-123B93F75CBA") + /* UUID for NVDIMM Root Device */, expected_uuid)); + aml_append(method, ifctx); + elsectx =3D aml_else(); + ifctx =3D aml_if(aml_equal(handle, + aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT))); + aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID + /* UUID for QEMU internal use */), expected_uuid)); + aml_append(elsectx, ifctx); + elsectx2 =3D aml_else(); + aml_append(elsectx2, aml_store( + aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66") + /* UUID for NVDIMM Devices */, expected_uuid)); + aml_append(elsectx, elsectx2); + aml_append(method, elsectx); + + uuid_invalid =3D aml_lnot(aml_equal(uuid, expected_uuid)); + + unsupport =3D aml_if(aml_or(unpatched, uuid_invalid, NULL)); =20 - expected_uuid =3D aml_local(0); + /* + * function 0 is called to inquire what functions are supported by + * OSPM + */ + ifctx =3D aml_if(aml_equal(function, aml_int(0))); + byte_list[0] =3D 0 /* No function Supported */; + aml_append(ifctx, aml_return(aml_buffer(1, byte_list))); + aml_append(unsupport, ifctx); =20 - ifctx =3D aml_if(aml_equal(handle, aml_int(0x0))); - aml_append(ifctx, aml_store( - aml_touuid("2F10E7A4-9E91-11E4-89D3-123B93F75CBA") - /* UUID for NVDIMM Root 
Device */, expected_uuid)); - aml_append(method, ifctx); - elsectx =3D aml_else(); - ifctx =3D aml_if(aml_equal(handle, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROO= T))); - aml_append(ifctx, aml_store(aml_touuid(NVDIMM_QEMU_RSVD_UUID - /* UUID for QEMU internal use */), expected_uuid)); - aml_append(elsectx, ifctx); - elsectx2 =3D aml_else(); - aml_append(elsectx2, aml_store( - aml_touuid("4309AC30-0D11-11E4-9191-0800200C9A66") - /* UUID for NVDIMM Devices */, expected_uuid)); - aml_append(elsectx, elsectx2); - aml_append(method, elsectx); + /* No function is supported yet. */ + byte_list[0] =3D NVDIMM_DSM_RET_STATUS_UNSUPPORT; + aml_append(unsupport, aml_return(aml_buffer(1, byte_list))); + aml_append(method, unsupport); + + /* + * The HDLE indicates the DSM function is issued from which device, + * it reserves 0 for root device and is the handle for + * NVDIMM devices. See the comments in nvdimm_slot_to_handle(). + */ + aml_append(method, aml_store(handle, + aml_name(NVDIMM_DSM_HANDLE))); + aml_append(method, aml_store(aml_arg(1), + aml_name(NVDIMM_DSM_REVISION))); + aml_append(method, aml_store(aml_arg(2), + aml_name(NVDIMM_DSM_FUNCTION))); =20 - uuid_invalid =3D aml_lnot(aml_equal(uuid, expected_uuid)); + /* + * The fourth parameter (Arg3) of _DSM is a package which contains + * a buffer, the layout of the buffer is specified by UUID (Arg0), + * Revision ID (Arg1) and Function Index (Arg2) which are document= ed + * in the DSM Spec. + */ + pckg =3D aml_arg(3); + ifctx =3D aml_if(aml_and(aml_equal(aml_object_type(pckg), + aml_int(4 /* Package */)) /* It is a Package? */, + aml_equal(aml_sizeof(pckg), + aml_int(1)) /* 1 element? */, + NULL)); =20 - unsupport =3D aml_if(aml_or(unpatched, uuid_invalid, NULL)); + break; + case 1: /* build common dsm in _HMA method */ + /* + * _HMAC input: + * HMAM_OFFSET: store the current offset of _HMA buffer. + * + * They are RAM mapping on host so that + * these accesses never cause VMExit. 
+ */ + aml_append(field, aml_named_field(HMAM_OFFSET, + sizeof(typeof_field(HmatHmamIn, offset)) * BITS_PER_BYTE)); + aml_append(method, field); =20 - /* - * function 0 is called to inquire what functions are supported by - * OSPM - */ - ifctx =3D aml_if(aml_equal(function, aml_int(0))); - byte_list[0] =3D 0 /* No function Supported */; - aml_append(ifctx, aml_return(aml_buffer(1, byte_list))); - aml_append(unsupport, ifctx); + /* + * _HMAC output: + * HMAM_OUT_BUF_SIZE: the size of the buffer filled by QEMU. + * HMAM_OUT_BUF: the buffer QEMU uses to store the result. + * + * Since the page is reused by both input and out, the input data + * will be lost after storing new result into ODAT so we should + * fetch all the input data before writing the result. + */ + field =3D aml_field(HMAM_MEMORY, AML_DWORD_ACC, AML_NOLOCK, + AML_PRESERVE); + aml_append(field, aml_named_field(HMAM_OUT_BUF_SIZE, + sizeof(typeof_field(HmatHmamOut, len)) * BITS_PER_BYTE)); + aml_append(field, aml_named_field(HMAM_OUT_BUF, + (sizeof(HmatHmamOut) - sizeof(uint32_t)) * BITS_PER_BYTE)); + aml_append(method, field); =20 - /* No function is supported yet. */ - byte_list[0] =3D NVDIMM_DSM_RET_STATUS_UNSUPPORT; - aml_append(unsupport, aml_return(aml_buffer(1, byte_list))); - aml_append(method, unsupport); + /* + * do not support any method if HMA memory address has not been + * patched. + */ + unsupport =3D aml_if(aml_equal(dsm_mem, aml_int(0x0))); + byte_list[0] =3D HMAM_RET_STATUS_UNSUPPORT; + aml_append(unsupport, aml_return(aml_buffer(1, byte_list))); + aml_append(method, unsupport); =20 - /* - * The HDLE indicates the DSM function is issued from which device, - * it reserves 0 for root device and is the handle for NVDIMM devices. - * See the comments in nvdimm_slot_to_handle(). 
- */ - aml_append(method, aml_store(handle, aml_name(NVDIMM_DSM_HANDLE))); - aml_append(method, aml_store(aml_arg(1), aml_name(NVDIMM_DSM_REVISION)= )); - aml_append(method, aml_store(aml_arg(2), aml_name(NVDIMM_DSM_FUNCTION)= )); + /* + * The parameter (Arg0) of _HMAC is + * a package which contains a buffer. + */ + pckg =3D aml_arg(0); + ifctx =3D aml_if(aml_and(aml_equal(aml_object_type(pckg), + aml_int(4 /* Package */)) /* It is a Package? */, + aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element *= /, + NULL)); =20 - /* - * The fourth parameter (Arg3) of _DSM is a package which contains - * a buffer, the layout of the buffer is specified by UUID (Arg0), - * Revision ID (Arg1) and Function Index (Arg2) which are documented - * in the DSM Spec. - */ - pckg =3D aml_arg(3); - ifctx =3D aml_if(aml_and(aml_equal(aml_object_type(pckg), - aml_int(4 /* Package */)) /* It is a Package? */, - aml_equal(aml_sizeof(pckg), aml_int(1)) /* 1 element? *= /, - NULL)); + break; + } =20 pckg_index =3D aml_local(2); pckg_buf =3D aml_local(3); aml_append(ifctx, aml_store(aml_index(pckg, aml_int(0)), pckg_index)); aml_append(ifctx, aml_store(aml_derefof(pckg_index), pckg_buf)); - aml_append(ifctx, aml_store(pckg_buf, aml_name(NVDIMM_DSM_ARG3))); + aml_append(ifctx, aml_store(pckg_buf, aml_name("%s", aml_offset))); aml_append(method, ifctx); =20 /* @@ -1138,19 +1231,37 @@ static void nvdimm_build_device_dsm(Aml *dev, uint3= 2_t handle) aml_append(dev, method); } =20 -static void nvdimm_build_fit(Aml *dev) +void nvdimm_build_fit(Aml *dev, uint16_t method_number) { - Aml *method, *pkg, *buf, *buf_size, *offset, *call_result; - Aml *whilectx, *ifcond, *ifctx, *elsectx, *fit; + Aml *method, *pkg, *buf, *buf_size, *offset, *call_result =3D NULL; + Aml *whilectx, *ifcond, *ifctx, *elsectx, *buf_name; + const char *help_function =3D NULL, *method_name =3D NULL; + int ret_status_success, ret_status_changed; + + switch (method_number) { + case 0: /* _FIT method */ + method_name =3D "_FIT"; + 
help_function =3D "RFIT"; + ret_status_success =3D NVDIMM_DSM_RET_STATUS_SUCCESS; + ret_status_changed =3D NVDIMM_DSM_RET_STATUS_FIT_CHANGED; + break; + case 1: /* _HMA method */ + method_name =3D "_HMA"; + nvdimm_build_common_dsm(dev, METHOD_NAME_HMA); + help_function =3D "RHMA"; + ret_status_success =3D HMAM_RET_STATUS_SUCCESS; + ret_status_changed =3D HMAM_RET_STATUS_HMA_CHANGED; + break; + } =20 buf =3D aml_local(0); buf_size =3D aml_local(1); - fit =3D aml_local(2); + buf_name =3D aml_local(2); =20 aml_append(dev, aml_name_decl(NVDIMM_DSM_RFIT_STATUS, aml_int(0))); =20 - /* build helper function, RFIT. */ - method =3D aml_method("RFIT", 1, AML_SERIALIZED); + /* build helper function. */ + method =3D aml_method(help_function, 1, AML_SERIALIZED); aml_append(method, aml_name_decl("OFST", aml_int(0))); =20 /* prepare input package. */ @@ -1158,12 +1269,20 @@ static void nvdimm_build_fit(Aml *dev) aml_append(method, aml_store(aml_arg(0), aml_name("OFST"))); aml_append(pkg, aml_name("OFST")); =20 - /* call Read_FIT function. */ - call_result =3D aml_call5(NVDIMM_COMMON_DSM, - aml_touuid(NVDIMM_QEMU_RSVD_UUID), - aml_int(1) /* Revision 1 */, - aml_int(0x1) /* Read FIT */, - pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)); + /* call Read function. */ + switch (method_number) { + case 0: /* build common dsm in _FIT method */ + call_result =3D aml_call5(NVDIMM_COMMON_DSM, + aml_touuid(NVDIMM_QEMU_RSVD_UUID), + aml_int(1) /* Revision 1 */, + aml_int(0x1) /* Read FIT */, + pkg, aml_int(NVDIMM_QEMU_RSVD_HANDLE_ROOT)= ); + break; + case 1: /* build common dsm in _HMA method */ + call_result =3D aml_call1(HMA_COMMON_METHOD, pkg); + break; + } + aml_append(method, aml_store(call_result, buf)); =20 /* handle _DSM result. */ @@ -1174,7 +1293,7 @@ static void nvdimm_build_fit(Aml *dev) aml_name(NVDIMM_DSM_RFIT_STATUS))); =20 /* if something is wrong during _DSM. 
*/ - ifcond =3D aml_equal(aml_int(NVDIMM_DSM_RET_STATUS_SUCCESS), + ifcond =3D aml_equal(aml_int(ret_status_success), aml_name("STAU")); ifctx =3D aml_if(aml_lnot(ifcond)); aml_append(ifctx, aml_return(aml_buffer(0, NULL))); @@ -1185,7 +1304,7 @@ static void nvdimm_build_fit(Aml *dev) aml_int(4) /* the size of "STAU" */, buf_size)); =20 - /* if we read the end of fit. */ + /* if we read the end of fit or hma. */ ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); aml_append(ifctx, aml_return(aml_buffer(0, NULL))); aml_append(method, ifctx); @@ -1196,38 +1315,38 @@ static void nvdimm_build_fit(Aml *dev) aml_append(method, aml_return(aml_name("BUFF"))); aml_append(dev, method); =20 - /* build _FIT. */ - method =3D aml_method("_FIT", 0, AML_SERIALIZED); + /* build _FIT or _HMA. */ + method =3D aml_method(method_name, 0, AML_SERIALIZED); offset =3D aml_local(3); =20 - aml_append(method, aml_store(aml_buffer(0, NULL), fit)); + aml_append(method, aml_store(aml_buffer(0, NULL), buf_name)); aml_append(method, aml_store(aml_int(0), offset)); =20 whilectx =3D aml_while(aml_int(1)); - aml_append(whilectx, aml_store(aml_call1("RFIT", offset), buf)); + aml_append(whilectx, aml_store(aml_call1(help_function, offset), buf)); aml_append(whilectx, aml_store(aml_sizeof(buf), buf_size)); =20 /* - * if fit buffer was changed during RFIT, read from the beginning - * again. + * if buffer was changed during RFIT or RHMA, + * read from the beginning again. */ ifctx =3D aml_if(aml_equal(aml_name(NVDIMM_DSM_RFIT_STATUS), - aml_int(NVDIMM_DSM_RET_STATUS_FIT_CHANGED))); - aml_append(ifctx, aml_store(aml_buffer(0, NULL), fit)); + aml_int(ret_status_changed))); + aml_append(ifctx, aml_store(aml_buffer(0, NULL), buf_name)); aml_append(ifctx, aml_store(aml_int(0), offset)); aml_append(whilectx, ifctx); =20 elsectx =3D aml_else(); =20 - /* finish fit read if no data is read out. */ + /* finish fit or hma read if no data is read out. 
*/ ifctx =3D aml_if(aml_equal(buf_size, aml_int(0))); - aml_append(ifctx, aml_return(fit)); + aml_append(ifctx, aml_return(buf_name)); aml_append(elsectx, ifctx); =20 /* update the offset. */ aml_append(elsectx, aml_add(offset, buf_size, offset)); - /* append the data we read out to the fit buffer. */ - aml_append(elsectx, aml_concatenate(fit, buf, fit)); + /* append the data we read out to the fit or hma buffer. */ + aml_append(elsectx, aml_concatenate(buf_name, buf, buf_name)); aml_append(whilectx, elsectx); aml_append(method, whilectx); =20 @@ -1288,11 +1407,11 @@ static void nvdimm_build_ssdt(GArray *table_offsets= , GArray *table_data, */ aml_append(dev, aml_name_decl("_HID", aml_string("ACPI0012"))); =20 - nvdimm_build_common_dsm(dev); + nvdimm_build_common_dsm(dev, METHOD_NAME_FIT); =20 /* 0 is reserved for root device. */ nvdimm_build_device_dsm(dev, 0); - nvdimm_build_fit(dev); + nvdimm_build_fit(dev, METHOD_NAME_FIT); =20 nvdimm_build_nvdimm_devices(dev, ram_slots); =20 diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c index 729e67e829..3e014b1ead 100644 --- a/hw/i386/acpi-build.c +++ b/hw/i386/acpi-build.c @@ -1846,7 +1846,7 @@ build_dsdt(GArray *table_data, BIOSLinker *linker, build_q35_pci0_int(dsdt); } =20 - hmat_build_aml(dsdt); + nvdimm_build_fit(dsdt, METHOD_NAME_HMA); =20 if (pcmc->legacy_cpu_hotplug) { build_legacy_cpu_hotplug_aml(dsdt, machine, pm->cpu_hp_io_base); diff --git a/include/hw/mem/nvdimm.h b/include/hw/mem/nvdimm.h index c5c9b3c7f8..3bc38735e4 100644 --- a/include/hw/mem/nvdimm.h +++ b/include/hw/mem/nvdimm.h @@ -25,6 +25,7 @@ =20 #include "hw/mem/pc-dimm.h" #include "hw/acpi/bios-linker-loader.h" +#include "hw/acpi/aml-build.h" =20 #define NVDIMM_DEBUG 0 #define nvdimm_debug(fmt, ...) 
\ @@ -110,6 +111,15 @@ typedef struct NVDIMMClass NVDIMMClass; #define NVDIMM_ACPI_IO_BASE 0x0a18 #define NVDIMM_ACPI_IO_LEN 4 =20 +/* + * The ACPI Device Configuration method name used in nvdimm_build_fit; + * a number selects the name: + * 0 means "_FIT" + * 1 means "_HMA" + */ +#define METHOD_NAME_FIT 0 +#define METHOD_NAME_HMA 1 + /* * NvdimmFitBuffer: * @fit: FIT structures for present NVDIMMs. It is updated when @@ -150,4 +160,5 @@ void nvdimm_build_acpi(GArray *table_offsets, GArray *t= able_data, uint32_t ram_slots); void nvdimm_plug(AcpiNVDIMMState *state); void nvdimm_acpi_plug_cb(HotplugHandler *hotplug_dev, DeviceState *dev); +void nvdimm_build_fit(Aml *dev, uint16_t method_number); #endif --=20 2.17.1