default-configs/i386-softmmu.mak | 1 + default-configs/x86_64-softmmu.mak | 1 + docs/specs/vmgenid.txt | 245 +++++++++++++++++++++++++++++++++ hmp-commands-info.hx | 13 ++ hmp.c | 9 ++ hmp.h | 1 + hw/acpi/Makefile.objs | 1 + hw/acpi/aml-build.c | 2 + hw/acpi/bios-linker-loader.c | 58 +++++++- hw/acpi/vmgenid.c | 253 +++++++++++++++++++++++++++++++++++ hw/i386/acpi-build.c | 16 +++ include/hw/acpi/acpi_dev_interface.h | 1 + include/hw/acpi/aml-build.h | 1 + include/hw/acpi/bios-linker-loader.h | 6 + include/hw/acpi/vmgenid.h | 35 +++++ qapi-schema.json | 20 +++ stubs/Makefile.objs | 1 + stubs/vmgenid.c | 8 ++ tests/Makefile.include | 2 + tests/acpi-utils.h | 75 +++++++++++ tests/bios-tables-test.c | 72 +--------- tests/vmgenid-test.c | 195 +++++++++++++++++++++++++++ 22 files changed, 942 insertions(+), 74 deletions(-) create mode 100644 docs/specs/vmgenid.txt create mode 100644 hw/acpi/vmgenid.c create mode 100644 include/hw/acpi/vmgenid.h create mode 100644 stubs/vmgenid.c create mode 100644 tests/acpi-utils.h create mode 100644 tests/vmgenid-test.c
From: Ben Warren <ben@skyportsystems.com> This patch set adds support for passing a GUID to Windows guests. It is a re-implementation of previous patch sets written by Igor Mammedov et al, but this time passing the GUID data as a fw_cfg blob. This patch set has dependencies on new guest functionality, in particular the support for a new linker-loader command and the ability to write back data to QEMU over a DMA link. Work is in flight in both SeaBIOS and OVMF to support this. v5->v6: - Rebased to top of tree. - Changed device from sysbus to a simple device. This removed the need for adding dynamic sysbus support to pc_piix boards. - Removed patch that introduced QWORD patching of AML. - Removed ability to set GUID via QMP/HMP. - Improved comments/documentation in code. v4->v5: - Added significantly more detail to the documentation. - Replaced the previously-implemented linker-loader command with a new one: "write pointer". This allows writing the guest address of a fw_cfg blob back to an arbitrary offset in a writeable fw_cfg file visible to QEMU. This will require support in SeaBIOS and OVMF (ongoing). - Fixed endianness issues throughout. - Several styling cleanups. v3->v4: - Rebased to top of tree. - Re-added document patch that was accidentally dropped from the last revision. - Added VMState functionality so that VGIA is restored properly. - Added Unit tests v2->v3: - Added second writeable fw_cfg for storing the VM Generaiton ID address. This uses a new linker-loader command for instructing the guest to write back the allocated address. A patch for SeaBIOS has been submitted (https://www.seabios.org/pipermail/seabios/2017-January/011079.html) and the resulting binary will need to be pulled into QEMU once accepted. - Setting VM Generation ID by command line or qmp/hmp now accepts an "auto" value, whereby QEMU generates a random GUID. - Incorporated review comments from v2 mainly around code styling and AML syntax - Changed to use the E05 ACPI event instead of E00 v1->v2: - Removed "changed" boolean parameter as it is unneeded - Added ACPI Notify logic - Style changes to pass checkpatch.pl - Added support for dynamic sysbus to pc_piix boards This patch set adds support for passing a GUID to Windows guests. It is a re-implementation of previous patch sets written by Igor Mammedov et al, but this time passing the GUID data as a fw_cfg blob. This patch set has dependencies on new guest functionality, in particular the support for a new linker-loader command and the ability to write back data to QEMU over a DMA link. Work is in flight in both SeaBIOS and OVMF to support this. v5->v6: - Rebased to top of tree. - Changed device from sysbus to a simple device. This removed the need for adding dynamic sysbus support to pc_piix boards. - Removed patch that introduced QWORD patching of AML. - Removed ability to set GUID via QMP/HMP. - Improved comments/documentation in code. v4->v5: - Added significantly more detail to the documentation. - Replaced the previously-implemented linker-loader command with a new one: "write pointer". This allows writing the guest address of a fw_cfg blob back to an arbitrary offset in a writeable fw_cfg file visible to QEMU. This will require support in SeaBIOS and OVMF (ongoing). - Fixed endianness issues throughout. - Several styling cleanups. v3->v4: - Rebased to top of tree. - Re-added document patch that was accidentally dropped from the last revision. - Added VMState functionality so that VGIA is restored properly. - Added Unit tests v2->v3: - Added second writeable fw_cfg for storing the VM Generaiton ID address. This uses a new linker-loader command for instructing the guest to write back the allocated address. A patch for SeaBIOS has been submitted (https://www.seabios.org/pipermail/seabios/2017-January/011079.html) and the resulting binary will need to be pulled into QEMU once accepted. - Setting VM Generation ID by command line or qmp/hmp now accepts an "auto" value, whereby QEMU generates a random GUID. - Incorporated review comments from v2 mainly around code styling and AML syntax - Changed to use the E05 ACPI event instead of E00 v1->v2: - Removed "changed" boolean parameter as it is unneeded - Added ACPI Notify logic - Style changes to pass checkpatch.pl - Added support for dynamic sysbus to pc_piix boards Ben Warren (6): linker-loader: Add new 'write pointer' command docs: VM Generation ID device description ACPI: Add vmgenid blob storage to the build tables ACPI: Add Virtual Machine Generation ID support tests: Move reusable ACPI macros into a new header file tests: Add unit tests for the VM Generation ID feature Igor Mammedov (1): qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands default-configs/i386-softmmu.mak | 1 + default-configs/x86_64-softmmu.mak | 1 + docs/specs/vmgenid.txt | 245 +++++++++++++++++++++++++++++++++ hmp-commands-info.hx | 13 ++ hmp.c | 9 ++ hmp.h | 1 + hw/acpi/Makefile.objs | 1 + hw/acpi/aml-build.c | 2 + hw/acpi/bios-linker-loader.c | 58 +++++++- hw/acpi/vmgenid.c | 253 +++++++++++++++++++++++++++++++++++ hw/i386/acpi-build.c | 16 +++ include/hw/acpi/acpi_dev_interface.h | 1 + include/hw/acpi/aml-build.h | 1 + include/hw/acpi/bios-linker-loader.h | 6 + include/hw/acpi/vmgenid.h | 35 +++++ qapi-schema.json | 20 +++ stubs/Makefile.objs | 1 + stubs/vmgenid.c | 8 ++ tests/Makefile.include | 2 + tests/acpi-utils.h | 75 +++++++++++ tests/bios-tables-test.c | 72 +--------- tests/vmgenid-test.c | 195 +++++++++++++++++++++++++++ 22 files changed, 942 insertions(+), 74 deletions(-) create mode 100644 docs/specs/vmgenid.txt create mode 100644 hw/acpi/vmgenid.c create mode 100644 include/hw/acpi/vmgenid.h create mode 100644 stubs/vmgenid.c create mode 100644 tests/acpi-utils.h create mode 100644 tests/vmgenid-test.c -- 2.7.4
On 02/15/17 07:15, ben@skyportsystems.com wrote: > From: Ben Warren <ben@skyportsystems.com> > > This patch set adds support for passing a GUID to Windows guests. It > is a re-implementation of previous patch sets written by Igor Mammedov > et al, but this time passing the GUID data as a fw_cfg blob. > > This patch set has dependencies on new guest functionality, in > particular the support for a new linker-loader command and the ability > to write back data to QEMU over a DMA link. Work is in flight in both > SeaBIOS and OVMF to support this. > > v5->v6: > - Rebased to top of tree. > - Changed device from sysbus to a simple device. This removed the need for > adding dynamic sysbus support to pc_piix boards. > - Removed patch that introduced QWORD patching of AML. > - Removed ability to set GUID via QMP/HMP. > - Improved comments/documentation in code. So here's my testing with a RHEL-7 guest: (1) The command line option passed to QEMU is -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134), section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it codifies the fact that Microsoft consumes UUID in little-endian order.) The expected representation, according to the SMBIOS spec, is 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF (2) Here's an excerpt from the OVMF log: > ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000 This is where "etc/vmgenid_guid" is allocated and downloaded, the allocation address is 0x7FE5C000. > Select Item: 0x19 > Select Item: 0x22 > ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000 > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403 > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4 > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4 > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74 > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80 > Select Item: 0x19 > SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0) This is where OVMF stashes the WRITE_POINTER command in "condensed" form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to be rewritten), the pointer is located at offset 0, has size 0, and the value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed WRITE_POINTER commands. > Select Item: 0x2B > ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8 This is where the WRITE_POINTER command is actually executed, during normal boot. > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4 This is where we link "etc/vmgenid_guid" into VGIA. > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4 > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4 > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4 > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30 > ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4 > ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > InstallQemuFwCfgTables: unknown loader command: 0x0 > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40 > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403 > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque This is where the OVMF SDT Header Probe Suppressor does its job. (NB, the "opaque marking" has happened already in ProcessCmdWritePointer() too, above.) > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74 > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80 > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30 > TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018 This is where the WRITE_POINTER commands, stashed earlier in condensed form, are translated to S3 Boot Script opcodes. > InstallQemuFwCfgTables: installed 5 tables Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores it (it's handled by edk2 automatically). > InstallQemuFwCfgTables: freeing "etc/acpi/rsdp" > InstallQemuFwCfgTables: freeing "etc/acpi/tables" OVMF sees that the above two blobs have not been marked as "opaque" -- they only contained ACPI tables, judged from the ADD_POINTER commands that pointed into them. So these two blobs are freed. Note that "etc/vmgenid_guid" is not freed. So, from the firmware log, everything looks OK. (3) I dumped the SSDT in the RHEL-7 guest: > /* > * Intel ACPI Component Architecture > * AML/ASL+ Disassembler version 20160527-64 > * Copyright (c) 2000 - 2016 Intel Corporation > * > * Disassembling to symbolic ASL+ operators > * > * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017 > * > * Original Table Header: > * Signature "SSDT" > * Length 0x000000CA (202) > * Revision 0x01 > * Checksum 0x1D > * OEM ID "BOCHS " > * OEM Table ID "VMGENID" > * OEM Revision 0x00000001 (1) > * Compiler ID "BXPC" > * Compiler Version 0x00000001 (1) > */ > DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001) > { > Name (VGIA, 0x7FE5C000) Note that the value matches the value logged by the firmware in (2). > Scope (\_SB) > { > Device (VGEN) > { > Name (_HID, "QEMUVGID") // _HID: Hardware ID > Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID > Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name > Method (_STA, 0, NotSerialized) // _STA: Status > { > Local0 = 0x0F > If (VGIA == Zero) > { > Local0 = Zero > } > > Return (Local0) > } > > Method (ADDR, 0, NotSerialized) > { > Local0 = Package (0x02) {} > Local0 [Zero] = (VGIA + 0x28) > Local0 [One] = Zero > Return (Local0) > } > } > } > > Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE > { > Notify (\_SB.VGEN, 0x80) // Status Change > } > } Looks good and matches the documentation. (4) To be sure, I checked the address against the guest dmesg, which contains a dump of the UEFI memory map: > [ 0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB) The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means EfiACPIMemoryNVS. (5) At this point I dumped the guest RAM with the dump-guest-memory monitor command, opened it with "crash", and listed it: > crash> rd -p -8 0x7FE5C000 0x40 > 7fe5c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 7fe5c010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > 7fe5c020: 00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66 ........3"..UDwf > 7fe5c030: 88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00 ................ We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the byte-level representation matches the little endian one given in (1). This proves that the initial blob download worked fine. (6) Here I attached "gdb" to QEMU, set a breakpoint on vmgenid_handle_reset(), allowed the inferior process to continue execution. Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit during resume: > Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205 > 205 VmGenIdState *vms = VMGENID(opaque); First of all, before allowing QEMU to zero out the address blob, I listed the address and the contents of the address blob (here exploiting that my host is also little endian): > (gdb) print (void*)vms->vmgenid_addr_le > $2 = (void *) 0x7f2bd03c37b0 > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le > $4 = 0x7fe5c000 This proves that QEMU has the right address, matching the firmware log from (2), and the ACPI dump from (3). (7) At this point I allowed the inferior to proceed a bit: > (gdb) n > 207 memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le)); > (gdb) n > 208 } I verified that the blob was zeroed: > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le > $5 = 0x0 then allowed the inferior to run free. > (gdb) cont > Continuing. (8) New messages appeared in the firmware log: > S3ResumeExecuteBootScript() > PeiS3ResumeState - 7FF92B18 > transfer control to Standalone Boot Script Executor > S3BootScriptExecute: > TableHeader - 0x7E7A7000 > TableHeader.Version - 0x0001 > TableHeader.TableLength - 0x000000ED > ExecuteBootScript - 7E7A700D > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000 Here the ACPI S3 Boot Script, prepared in TransferS3ContextToBootScript() -- see (2) -- creates a DMA access command for fw_cfg. The DMA access command is written to pre-reserved memory (see "ScratchBuffer" above). > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B) The fw_cfg selector is 0x2B. (See under (2).) > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C) This is a combined select+skip operation. > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01F (0x00) The skip size is 0 bytes. > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F024 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F025 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F026 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F027 (0x00) The address is irrelevant for skip, so it's just nuleld. > ExecuteBootScript - 7E7A7030 > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 > S3BootScriptWidthUint32 - 0x00000514 (0x00000000) > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) The Boot Script passes the DMA command to QEMU, by writing the address of the command buffer to IO ports 0x514 and 0x518, in BE byte order. > ExecuteBootScript - 7E7A704B > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 > S3BootScriptWidthUint32 - 0x7FE4F018 > ExecuteBootScript - 7E7A7072 This waits until the DMA command succeeds (reading back the Control field continuously until it reads as zero). > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000 This is another DMA access command for fw_cfg, prepared in the same pre-reserved buffer. This time > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F019 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01B (0x10) we request a write operation, > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) > S3BootScriptWidthUint8 - 0x7FE4F01F (0x08) with a length of 8 bytes (big endian), matching the pointer size, > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F) > S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4) > S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0) > S3BootScriptWidthUint8 - 0x7FE4F027 (0x28) the data to transfer is located at 0x7FE4F028 (just below, tacked to the command buffer itself), > S3BootScriptWidthUint8 - 0x7FE4F028 (0x00) > S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0) > S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5) > S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F) > S3BootScriptWidthUint8 - 0x7FE4F02C (0x00) > S3BootScriptWidthUint8 - 0x7FE4F02D (0x00) > S3BootScriptWidthUint8 - 0x7FE4F02E (0x00) > S3BootScriptWidthUint8 - 0x7FE4F02F (0x00) and the data to write is the original allocation address of the blob (0x7fe5c000). > ExecuteBootScript - 7E7A709D > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 > S3BootScriptWidthUint32 - 0x00000514 (0x00000000) > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) > ExecuteBootScript - 7E7A70B8 > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 > S3BootScriptWidthUint32 - 0x7FE4F018 > ExecuteBootScript - 7E7A70DF Same story as above: fire off the transfer and wait until it completes. > EFI_BOOT_SCRIPT_INFORMATION_OPCODE > BootScriptExecuteInformation - 0x7E7A70E6 > BootScriptInformation: DE AD BE EF > ExecuteBootScript - 7E7A70EA > S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE > S3BootScriptDone - Success > [...] The DEADBEEF informational (no-op) opcode is something that OVMF appends to the very end for hysterical raisins. (9) Okay, so the guest is now resumed and running, let's interrupt it in gdb again, and check the contents of address blob again (we know the address of the address blob from step (6)): > ^C > Program received signal SIGINT, Interrupt. > 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6 > (gdb) print /x *(uint64_t*)0x7f2bd03c37b0 > $6 = 0x7fe5c000 Et voila. (10) I detached gdb from QEMU, and issued the following monitor command: > $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id' > 00112233-4455-6677-8899-aabbccddeeff (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI enabled) with a similar vmgenid device/parameter. According to Device Manager | System devices, "Microsoft Hyper-V Generation Counter" is working properly. I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast above because for that, OVMF generates an independent S3 Boot Script fragment.) I'll let someone else test live migration. For patches #1, #3, #4 and #5: Tested-by: Laszlo Ersek <lersek@redhat.com> I'll soon post the OVMF patches. Thanks! Laszlo
On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: > On 02/15/17 07:15, ben@skyportsystems.com wrote: > > From: Ben Warren <ben@skyportsystems.com> > > > > This patch set adds support for passing a GUID to Windows guests. It > > is a re-implementation of previous patch sets written by Igor Mammedov > > et al, but this time passing the GUID data as a fw_cfg blob. > > > > This patch set has dependencies on new guest functionality, in > > particular the support for a new linker-loader command and the ability > > to write back data to QEMU over a DMA link. Work is in flight in both > > SeaBIOS and OVMF to support this. > > > > v5->v6: > > - Rebased to top of tree. > > - Changed device from sysbus to a simple device. This removed the need for > > adding dynamic sysbus support to pc_piix boards. > > - Removed patch that introduced QWORD patching of AML. > > - Removed ability to set GUID via QMP/HMP. > > - Improved comments/documentation in code. > > So here's my testing with a RHEL-7 guest: > > (1) The command line option passed to QEMU is > > -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF > > This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134), > section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it > codifies the fact that Microsoft consumes UUID in little-endian order.) > The expected representation, according to the SMBIOS spec, is > > 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF > > (2) Here's an excerpt from the OVMF log: > > > ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000 > > This is where "etc/vmgenid_guid" is allocated and downloaded, the > allocation address is 0x7FE5C000. > > > Select Item: 0x19 > > Select Item: 0x22 > > ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000 > > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403 > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4 > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4 > > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74 > > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80 > > Select Item: 0x19 > > SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0) > > This is where OVMF stashes the WRITE_POINTER command in "condensed" > form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to > be rewritten), the pointer is located at offset 0, has size 0, and the > value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed > WRITE_POINTER commands. > > > Select Item: 0x2B > > ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8 > > This is where the WRITE_POINTER command is actually executed, during > normal boot. > > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4 > > This is where we link "etc/vmgenid_guid" into VGIA. > > > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4 > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4 > > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4 > > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30 > > ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4 > > ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > InstallQemuFwCfgTables: unknown loader command: 0x0 > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40 > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403 > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque > > This is where the OVMF SDT Header Probe Suppressor does its job. (NB, > the "opaque marking" has happened already in ProcessCmdWritePointer() > too, above.) > > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74 > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80 > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA > > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30 > > TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018 > > This is where the WRITE_POINTER commands, stashed earlier in condensed > form, are translated to S3 Boot Script opcodes. > > > InstallQemuFwCfgTables: installed 5 tables > > Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores > it (it's handled by edk2 automatically). > > > InstallQemuFwCfgTables: freeing "etc/acpi/rsdp" > > InstallQemuFwCfgTables: freeing "etc/acpi/tables" > > OVMF sees that the above two blobs have not been marked as "opaque" -- > they only contained ACPI tables, judged from the ADD_POINTER commands > that pointed into them. So these two blobs are freed. > > Note that "etc/vmgenid_guid" is not freed. > > So, from the firmware log, everything looks OK. > > (3) I dumped the SSDT in the RHEL-7 guest: > > > /* > > * Intel ACPI Component Architecture > > * AML/ASL+ Disassembler version 20160527-64 > > * Copyright (c) 2000 - 2016 Intel Corporation > > * > > * Disassembling to symbolic ASL+ operators > > * > > * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017 > > * > > * Original Table Header: > > * Signature "SSDT" > > * Length 0x000000CA (202) > > * Revision 0x01 > > * Checksum 0x1D > > * OEM ID "BOCHS " > > * OEM Table ID "VMGENID" > > * OEM Revision 0x00000001 (1) > > * Compiler ID "BXPC" > > * Compiler Version 0x00000001 (1) > > */ > > DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001) > > { > > Name (VGIA, 0x7FE5C000) > > Note that the value matches the value logged by the firmware in (2). > > > Scope (\_SB) > > { > > Device (VGEN) > > { > > Name (_HID, "QEMUVGID") // _HID: Hardware ID > > Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID > > Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name > > Method (_STA, 0, NotSerialized) // _STA: Status > > { > > Local0 = 0x0F > > If (VGIA == Zero) > > { > > Local0 = Zero > > } > > > > Return (Local0) > > } > > > > Method (ADDR, 0, NotSerialized) > > { > > Local0 = Package (0x02) {} > > Local0 [Zero] = (VGIA + 0x28) > > Local0 [One] = Zero > > Return (Local0) > > } > > } > > } > > > > Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE > > { > > Notify (\_SB.VGEN, 0x80) // Status Change > > } > > } > > Looks good and matches the documentation. > > (4) To be sure, I checked the address against the guest dmesg, which > contains a dump of the UEFI memory map: > > > [ 0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB) > > The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means > EfiACPIMemoryNVS. > > (5) At this point I dumped the guest RAM with the dump-guest-memory > monitor command, opened it with "crash", and listed it: > > > crash> rd -p -8 0x7FE5C000 0x40 > > 7fe5c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > 7fe5c010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ > > 7fe5c020: 00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66 ........3"..UDwf > > 7fe5c030: 88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00 ................ > > We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the > byte-level representation matches the little endian one given in (1). > > This proves that the initial blob download worked fine. > > (6) Here I attached "gdb" to QEMU, set a breakpoint on > vmgenid_handle_reset(), allowed the inferior process to continue > execution. > > Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit > during resume: > > > Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205 > > 205 VmGenIdState *vms = VMGENID(opaque); > > First of all, before allowing QEMU to zero out the address blob, I > listed the address and the contents of the address blob (here exploiting > that my host is also little endian): > > > (gdb) print (void*)vms->vmgenid_addr_le > > $2 = (void *) 0x7f2bd03c37b0 > > > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le > > $4 = 0x7fe5c000 > > This proves that QEMU has the right address, matching the firmware log > from (2), and the ACPI dump from (3). > > (7) At this point I allowed the inferior to proceed a bit: > > > (gdb) n > > 207 memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le)); > > (gdb) n > > 208 } > > I verified that the blob was zeroed: > > > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le > > $5 = 0x0 > > then allowed the inferior to run free. > > > (gdb) cont > > Continuing. > > (8) New messages appeared in the firmware log: > > > S3ResumeExecuteBootScript() > > PeiS3ResumeState - 7FF92B18 > > transfer control to Standalone Boot Script Executor > > S3BootScriptExecute: > > TableHeader - 0x7E7A7000 > > TableHeader.Version - 0x0001 > > TableHeader.TableLength - 0x000000ED > > ExecuteBootScript - 7E7A700D > > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE > > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000 > > Here the ACPI S3 Boot Script, prepared in > TransferS3ContextToBootScript() -- see (2) -- creates a DMA access > command for fw_cfg. The DMA access command is written to pre-reserved > memory (see "ScratchBuffer" above). > > > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B) > > The fw_cfg selector is 0x2B. (See under (2).) > > > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C) > > This is a combined select+skip operation. > > > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01F (0x00) > > The skip size is 0 bytes. > > > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F024 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F025 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F026 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F027 (0x00) > > The address is irrelevant for skip, so it's just nuleld. > > > ExecuteBootScript - 7E7A7030 > > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE > > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 > > S3BootScriptWidthUint32 - 0x00000514 (0x00000000) > > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) > > The Boot Script passes the DMA command to QEMU, by writing the address > of the command buffer to IO ports 0x514 and 0x518, in BE byte order. > > > ExecuteBootScript - 7E7A704B > > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE > > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 > > S3BootScriptWidthUint32 - 0x7FE4F018 > > ExecuteBootScript - 7E7A7072 > > This waits until the DMA command succeeds (reading back the Control > field continuously until it reads as zero). > > > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE > > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000 > > This is another DMA access command for fw_cfg, prepared in the same > pre-reserved buffer. This time > > > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F019 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01B (0x10) > > we request a write operation, > > > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F01F (0x08) > > with a length of 8 bytes (big endian), matching the pointer size, > > > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F) > > S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4) > > S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0) > > S3BootScriptWidthUint8 - 0x7FE4F027 (0x28) > > the data to transfer is located at 0x7FE4F028 (just below, tacked to the > command buffer itself), > > > S3BootScriptWidthUint8 - 0x7FE4F028 (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0) > > S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5) > > S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F) > > S3BootScriptWidthUint8 - 0x7FE4F02C (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F02D (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F02E (0x00) > > S3BootScriptWidthUint8 - 0x7FE4F02F (0x00) > > and the data to write is the original allocation address of the blob > (0x7fe5c000). > > > ExecuteBootScript - 7E7A709D > > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE > > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 > > S3BootScriptWidthUint32 - 0x00000514 (0x00000000) > > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) > > ExecuteBootScript - 7E7A70B8 > > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE > > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 > > S3BootScriptWidthUint32 - 0x7FE4F018 > > ExecuteBootScript - 7E7A70DF > > Same story as above: fire off the transfer and wait until it completes. > > > EFI_BOOT_SCRIPT_INFORMATION_OPCODE > > BootScriptExecuteInformation - 0x7E7A70E6 > > BootScriptInformation: DE AD BE EF > > ExecuteBootScript - 7E7A70EA > > S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE > > S3BootScriptDone - Success > > [...] > > The DEADBEEF informational (no-op) opcode is something that OVMF appends > to the very end for hysterical raisins. > > (9) Okay, so the guest is now resumed and running, let's interrupt it in > gdb again, and check the contents of address blob again (we know the > address of the address blob from step (6)): > > > ^C > > Program received signal SIGINT, Interrupt. > > 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6 > > (gdb) print /x *(uint64_t*)0x7f2bd03c37b0 > > $6 = 0x7fe5c000 > > Et voila. > > (10) I detached gdb from QEMU, and issued the following monitor command: > > > $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id' > > 00112233-4455-6677-8899-aabbccddeeff > > (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI > enabled) with a similar vmgenid device/parameter. According to Device > Manager | System devices, "Microsoft Hyper-V Generation Counter" is > working properly. > > I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast > above because for that, OVMF generates an independent S3 Boot Script > fragment.) > > > I'll let someone else test live migration. > > For patches #1, #3, #4 and #5: > > Tested-by: Laszlo Ersek <lersek@redhat.com> > > I'll soon post the OVMF patches. > > Thanks! > Laszlo How do you feel about Igor's request to change WRITE_POINTER to add offset in there, so guest can pass in the address of GUID and not start of table? Would that be a lot of work to add? -- MST
> On Feb 15, 2017, at 12:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote: > > On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: >> On 02/15/17 07:15, ben@skyportsystems.com wrote: >>> From: Ben Warren <ben@skyportsystems.com> >>> >>> This patch set adds support for passing a GUID to Windows guests. It >>> is a re-implementation of previous patch sets written by Igor Mammedov >>> et al, but this time passing the GUID data as a fw_cfg blob. >>> >>> This patch set has dependencies on new guest functionality, in >>> particular the support for a new linker-loader command and the ability >>> to write back data to QEMU over a DMA link. Work is in flight in both >>> SeaBIOS and OVMF to support this. >>> >>> v5->v6: >>> - Rebased to top of tree. >>> - Changed device from sysbus to a simple device. This removed the need for >>> adding dynamic sysbus support to pc_piix boards. >>> - Removed patch that introduced QWORD patching of AML. >>> - Removed ability to set GUID via QMP/HMP. >>> - Improved comments/documentation in code. >> >> So here's my testing with a RHEL-7 guest: >> >> (1) The command line option passed to QEMU is >> >> -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF >> >> This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134), >> section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it >> codifies the fact that Microsoft consumes UUID in little-endian order.) >> The expected representation, according to the SMBIOS spec, is >> >> 33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF >> >> (2) Here's an excerpt from the OVMF log: >> >>> ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000 >> >> This is where "etc/vmgenid_guid" is allocated and downloaded, the >> allocation address is 0x7FE5C000. >> >>> Select Item: 0x19 >>> Select Item: 0x22 >>> ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000 >>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403 >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4 >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4 >>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74 >>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80 >>> Select Item: 0x19 >>> SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0) >> >> This is where OVMF stashes the WRITE_POINTER command in "condensed" >> form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to >> be rewritten), the pointer is located at offset 0, has size 0, and the >> value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed >> WRITE_POINTER commands. >> >>> Select Item: 0x2B >>> ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8 >> >> This is where the WRITE_POINTER command is actually executed, during >> normal boot. >> >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4 >> >> This is where we link "etc/vmgenid_guid" into VGIA. >> >>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4 >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4 >>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4 >>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30 >>> ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4 >>> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> InstallQemuFwCfgTables: unknown loader command: 0x0 >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40 >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403 >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque >> >> This is where the OVMF SDT Header Probe Suppressor does its job. (NB, >> the "opaque marking" has happened already in ProcessCmdWritePointer() >> too, above.) >> >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74 >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80 >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA >>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30 >>> TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018 >> >> This is where the WRITE_POINTER commands, stashed earlier in condensed >> form, are translated to S3 Boot Script opcodes. >> >>> InstallQemuFwCfgTables: installed 5 tables >> >> Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores >> it (it's handled by edk2 automatically). >> >>> InstallQemuFwCfgTables: freeing "etc/acpi/rsdp" >>> InstallQemuFwCfgTables: freeing "etc/acpi/tables" >> >> OVMF sees that the above two blobs have not been marked as "opaque" -- >> they only contained ACPI tables, judged from the ADD_POINTER commands >> that pointed into them. So these two blobs are freed. >> >> Note that "etc/vmgenid_guid" is not freed. >> >> So, from the firmware log, everything looks OK. >> >> (3) I dumped the SSDT in the RHEL-7 guest: >> >>> /* >>> * Intel ACPI Component Architecture >>> * AML/ASL+ Disassembler version 20160527-64 >>> * Copyright (c) 2000 - 2016 Intel Corporation >>> * >>> * Disassembling to symbolic ASL+ operators >>> * >>> * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017 >>> * >>> * Original Table Header: >>> * Signature "SSDT" >>> * Length 0x000000CA (202) >>> * Revision 0x01 >>> * Checksum 0x1D >>> * OEM ID "BOCHS " >>> * OEM Table ID "VMGENID" >>> * OEM Revision 0x00000001 (1) >>> * Compiler ID "BXPC" >>> * Compiler Version 0x00000001 (1) >>> */ >>> DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001) >>> { >>> Name (VGIA, 0x7FE5C000) >> >> Note that the value matches the value logged by the firmware in (2). >> >>> Scope (\_SB) >>> { >>> Device (VGEN) >>> { >>> Name (_HID, "QEMUVGID") // _HID: Hardware ID >>> Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID >>> Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name >>> Method (_STA, 0, NotSerialized) // _STA: Status >>> { >>> Local0 = 0x0F >>> If (VGIA == Zero) >>> { >>> Local0 = Zero >>> } >>> >>> Return (Local0) >>> } >>> >>> Method (ADDR, 0, NotSerialized) >>> { >>> Local0 = Package (0x02) {} >>> Local0 [Zero] = (VGIA + 0x28) >>> Local0 [One] = Zero >>> Return (Local0) >>> } >>> } >>> } >>> >>> Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE >>> { >>> Notify (\_SB.VGEN, 0x80) // Status Change >>> } >>> } >> >> Looks good and matches the documentation. >> >> (4) To be sure, I checked the address against the guest dmesg, which >> contains a dump of the UEFI memory map: >> >>> [ 0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB) >> >> The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means >> EfiACPIMemoryNVS. >> >> (5) At this point I dumped the guest RAM with the dump-guest-memory >> monitor command, opened it with "crash", and listed it: >> >>> crash> rd -p -8 0x7FE5C000 0x40 >>> 7fe5c000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>> 7fe5c010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................ >>> 7fe5c020: 00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66 ........3"..UDwf >>> 7fe5c030: 88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00 ................ >> >> We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the >> byte-level representation matches the little endian one given in (1). >> >> This proves that the initial blob download worked fine. >> >> (6) Here I attached "gdb" to QEMU, set a breakpoint on >> vmgenid_handle_reset(), allowed the inferior process to continue >> execution. >> >> Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit >> during resume: >> >>> Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205 >>> 205 VmGenIdState *vms = VMGENID(opaque); >> >> First of all, before allowing QEMU to zero out the address blob, I >> listed the address and the contents of the address blob (here exploiting >> that my host is also little endian): >> >>> (gdb) print (void*)vms->vmgenid_addr_le >>> $2 = (void *) 0x7f2bd03c37b0 >> >>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le >>> $4 = 0x7fe5c000 >> >> This proves that QEMU has the right address, matching the firmware log >> from (2), and the ACPI dump from (3). >> >> (7) At this point I allowed the inferior to proceed a bit: >> >>> (gdb) n >>> 207 memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le)); >>> (gdb) n >>> 208 } >> >> I verified that the blob was zeroed: >> >>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le >>> $5 = 0x0 >> >> then allowed the inferior to run free. >> >>> (gdb) cont >>> Continuing. >> >> (8) New messages appeared in the firmware log: >> >>> S3ResumeExecuteBootScript() >>> PeiS3ResumeState - 7FF92B18 >>> transfer control to Standalone Boot Script Executor >>> S3BootScriptExecute: >>> TableHeader - 0x7E7A7000 >>> TableHeader.Version - 0x0001 >>> TableHeader.TableLength - 0x000000ED >>> ExecuteBootScript - 7E7A700D >>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE >>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000 >> >> Here the ACPI S3 Boot Script, prepared in >> TransferS3ContextToBootScript() -- see (2) -- creates a DMA access >> command for fw_cfg. The DMA access command is written to pre-reserved >> memory (see "ScratchBuffer" above). >> >>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B) >> >> The fw_cfg selector is 0x2B. (See under (2).) >> >>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C) >> >> This is a combined select+skip operation. >> >>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x00) >> >> The skip size is 0 bytes. >> >>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F025 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F026 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x00) >> >> The address is irrelevant for skip, so it's just nuleld. >> >>> ExecuteBootScript - 7E7A7030 >>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE >>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 >>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000) >>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) >> >> The Boot Script passes the DMA command to QEMU, by writing the address >> of the command buffer to IO ports 0x514 and 0x518, in BE byte order. >> >>> ExecuteBootScript - 7E7A704B >>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE >>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 >>> S3BootScriptWidthUint32 - 0x7FE4F018 >>> ExecuteBootScript - 7E7A7072 >> >> This waits until the DMA command succeeds (reading back the Control >> field continuously until it reads as zero). >> >>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE >>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000 >> >> This is another DMA access command for fw_cfg, prepared in the same >> pre-reserved buffer. This time >> >>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x10) >> >> we request a write operation, >> >>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x08) >> >> with a length of 8 bytes (big endian), matching the pointer size, >> >>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F) >>> S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4) >>> S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0) >>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x28) >> >> the data to transfer is located at 0x7FE4F028 (just below, tacked to the >> command buffer itself), >> >>> S3BootScriptWidthUint8 - 0x7FE4F028 (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0) >>> S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5) >>> S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F) >>> S3BootScriptWidthUint8 - 0x7FE4F02C (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F02D (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F02E (0x00) >>> S3BootScriptWidthUint8 - 0x7FE4F02F (0x00) >> >> and the data to write is the original allocation address of the blob >> (0x7fe5c000). >> >>> ExecuteBootScript - 7E7A709D >>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE >>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002 >>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000) >>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F) >>> ExecuteBootScript - 7E7A70B8 >>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE >>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000 >>> S3BootScriptWidthUint32 - 0x7FE4F018 >>> ExecuteBootScript - 7E7A70DF >> >> Same story as above: fire off the transfer and wait until it completes. >> >>> EFI_BOOT_SCRIPT_INFORMATION_OPCODE >>> BootScriptExecuteInformation - 0x7E7A70E6 >>> BootScriptInformation: DE AD BE EF >>> ExecuteBootScript - 7E7A70EA >>> S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE >>> S3BootScriptDone - Success >>> [...] >> >> The DEADBEEF informational (no-op) opcode is something that OVMF appends >> to the very end for hysterical raisins. >> >> (9) Okay, so the guest is now resumed and running, let's interrupt it in >> gdb again, and check the contents of address blob again (we know the >> address of the address blob from step (6)): >> >>> ^C >>> Program received signal SIGINT, Interrupt. >>> 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6 >>> (gdb) print /x *(uint64_t*)0x7f2bd03c37b0 >>> $6 = 0x7fe5c000 >> >> Et voila. >> >> (10) I detached gdb from QEMU, and issued the following monitor command: >> >>> $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id' >>> 00112233-4455-6677-8899-aabbccddeeff >> >> (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI >> enabled) with a similar vmgenid device/parameter. According to Device >> Manager | System devices, "Microsoft Hyper-V Generation Counter" is >> working properly. >> >> I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast >> above because for that, OVMF generates an independent S3 Boot Script >> fragment.) >> >> >> I'll let someone else test live migration. >> >> For patches #1, #3, #4 and #5: >> >> Tested-by: Laszlo Ersek <lersek@redhat.com> >> >> I'll soon post the OVMF patches. >> >> Thanks! >> Laszlo > > > How do you feel about Igor's request to change WRITE_POINTER to add > offset in there, so guest can pass in the address of GUID and > not start of table? Would that be a lot of work to add? > I know you’re asking Laszlo, but hopefully the answer is “OK”. I’m finished making those changes to QEMU and am coding up SeaBIOS right now. > -- > MST
On 02/15/17 21:09, Michael S. Tsirkin wrote: > On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: [snip] >> For patches #1, #3, #4 and #5: >> >> Tested-by: Laszlo Ersek <lersek@redhat.com> >> >> I'll soon post the OVMF patches. >> >> Thanks! >> Laszlo > > > How do you feel about Igor's request to change WRITE_POINTER to add > offset in there, so guest can pass in the address of GUID and > not start of table? Would that be a lot of work to add? I think it's doable in practice: simply add a constant from the command itself, for passing the value back to QEMU, and also for saving the fw_cfg write commend for S3 resume time. But, I disagree with it from a design POV. Igor's point is: > Math complicates QEMU code though and not only QMEMU but AML code as > well. As I understand it, the goal is to push the addition to the firmware (which is "one place"), rather than having to implement it twice in QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). Here's my counter-argument: (a) As I mentioned earlier, assume a complex communication structure between the guest OS and QEMU. Currently our shared structure consists of a single field (the GUID), but next time it might contain several fields. For such a multi-field shared structure, QEMU will have to do manual offsetting into the guest RAM anyway, for accessing fields F1, F2, and F3. We will not create three separate WRITE_POINTER commands and let the firmware calculate and return the absolute GPAs of the fields F1, F2 and F3. Instead, there will be one WRITE_POINTER command, and QEMU will do the offsetting manually, minimally for fields F2 and F3. "src_offset" looks tempting now only because we have a shared structure with only one field, the GUID at offset 40 decimal. (b) Regarding the runtime addition in the AML code: As discussed before, the main reason *now*, for not pointing VGIA (and other named integer objects) with ADD_POINTER commands directly to "meaningful" fields, is that OVMF probes the targets of ADD_POINTER commands for patterns that look like ACPI table headers. And, for the time being, we want to suppress any mis-recognitions by prepending some padding. Igor was right to dislike this, and we agreed that *down the road* we should add allocation flags, or further allocation commands, to supplant this kind of heuristics in OVMF. But: - we don't have time to do it now, plus - please observe that the runtime addition in AML relates to the ADD_POINTER and the ALLOCATE commands. It does not relate to WRITE_POINTER at all. Whatever we change on WRITE_POINTER will do nothing for suppressing OVMF's table header probing -- because that is tied to ADD_POINTER --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" in AML. In summary, I think the proposed WRITE_POINTER modification is implementable, but I think it will not pay off, because: (a) for QEMU logic, it will not prove useful as soon as we have a multi-field shared structure (QEMU will have to add field offsets anyway), (b) and for eliminating the AML addition (which is a consequence of the current ADD_POINTER handling in OVMF), it does nothing. Thanks Laszlo
> On Feb 15, 2017, at 12:52 PM, Laszlo Ersek <lersek@redhat.com> wrote: > > On 02/15/17 21:09, Michael S. Tsirkin wrote: >> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: > > [snip] > >>> For patches #1, #3, #4 and #5: >>> >>> Tested-by: Laszlo Ersek <lersek@redhat.com> >>> >>> I'll soon post the OVMF patches. >>> >>> Thanks! >>> Laszlo >> >> >> How do you feel about Igor's request to change WRITE_POINTER to add >> offset in there, so guest can pass in the address of GUID and >> not start of table? Would that be a lot of work to add? > > I think it's doable in practice: simply add a constant from the command > itself, for passing the value back to QEMU, and also for saving the > fw_cfg write commend for S3 resume time. > > But, I disagree with it from a design POV. > > Igor's point is: > >> Math complicates QEMU code though and not only QMEMU but AML code as >> well. > > As I understand it, the goal is to push the addition to the firmware > (which is "one place"), rather than having to implement it twice in > QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). > > Here's my counter-argument: > > (a) As I mentioned earlier, assume a complex communication structure > between the guest OS and QEMU. Currently our shared structure consists > of a single field (the GUID), but next time it might contain several fields. > > For such a multi-field shared structure, QEMU will have to do manual > offsetting into the guest RAM anyway, for accessing fields F1, F2, and > F3. We will not create three separate WRITE_POINTER commands and let the > firmware calculate and return the absolute GPAs of the fields F1, F2 and > F3. Instead, there will be one WRITE_POINTER command, and QEMU will do > the offsetting manually, minimally for fields F2 and F3. > > "src_offset" looks tempting now only because we have a shared structure > with only one field, the GUID at offset 40 decimal. > > (b) Regarding the runtime addition in the AML code: > > As discussed before, the main reason *now*, for not pointing VGIA (and > other named integer objects) with ADD_POINTER commands directly to > "meaningful" fields, is that OVMF probes the targets of ADD_POINTER > commands for patterns that look like ACPI table headers. And, for the > time being, we want to suppress any mis-recognitions by prepending some > padding. > > Igor was right to dislike this, and we agreed that *down the road* we > should add allocation flags, or further allocation commands, to supplant > this kind of heuristics in OVMF. But: > > - we don't have time to do it now, plus > > - please observe that the runtime addition in AML relates to the > ADD_POINTER and the ALLOCATE commands. It does not relate to > WRITE_POINTER at all. > > Whatever we change on WRITE_POINTER will do nothing for suppressing > OVMF's table header probing -- because that is tied to ADD_POINTER > --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" > in AML. > > > In summary, I think the proposed WRITE_POINTER modification is > implementable, but I think it will not pay off, because: > > (a) for QEMU logic, it will not prove useful as soon as we have a > multi-field shared structure (QEMU will have to add field offsets anyway), > > (b) and for eliminating the AML addition (which is a consequence of the > current ADD_POINTER handling in OVMF), it does nothing. > OK Laszlo, in v7 (imminent) I went ahead and implemented this src_offset. If you are truly dead-set against it, it’s not very hard to remove. To me it seems pretty harmless. > Thanks > Laszlo
On 02/16/17 07:10, Ben Warren wrote: > >> On Feb 15, 2017, at 12:52 PM, Laszlo Ersek <lersek@redhat.com> wrote: >> >> On 02/15/17 21:09, Michael S. Tsirkin wrote: >>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: >> >> [snip] >> >>>> For patches #1, #3, #4 and #5: >>>> >>>> Tested-by: Laszlo Ersek <lersek@redhat.com> >>>> >>>> I'll soon post the OVMF patches. >>>> >>>> Thanks! >>>> Laszlo >>> >>> >>> How do you feel about Igor's request to change WRITE_POINTER to add >>> offset in there, so guest can pass in the address of GUID and >>> not start of table? Would that be a lot of work to add? >> >> I think it's doable in practice: simply add a constant from the command >> itself, for passing the value back to QEMU, and also for saving the >> fw_cfg write commend for S3 resume time. >> >> But, I disagree with it from a design POV. >> >> Igor's point is: >> >>> Math complicates QEMU code though and not only QMEMU but AML code as >>> well. >> >> As I understand it, the goal is to push the addition to the firmware >> (which is "one place"), rather than having to implement it twice in >> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). >> >> Here's my counter-argument: >> >> (a) As I mentioned earlier, assume a complex communication structure >> between the guest OS and QEMU. Currently our shared structure consists >> of a single field (the GUID), but next time it might contain several fields. >> >> For such a multi-field shared structure, QEMU will have to do manual >> offsetting into the guest RAM anyway, for accessing fields F1, F2, and >> F3. We will not create three separate WRITE_POINTER commands and let the >> firmware calculate and return the absolute GPAs of the fields F1, F2 and >> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do >> the offsetting manually, minimally for fields F2 and F3. >> >> "src_offset" looks tempting now only because we have a shared structure >> with only one field, the GUID at offset 40 decimal. >> >> (b) Regarding the runtime addition in the AML code: >> >> As discussed before, the main reason *now*, for not pointing VGIA (and >> other named integer objects) with ADD_POINTER commands directly to >> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER >> commands for patterns that look like ACPI table headers. And, for the >> time being, we want to suppress any mis-recognitions by prepending some >> padding. >> >> Igor was right to dislike this, and we agreed that *down the road* we >> should add allocation flags, or further allocation commands, to supplant >> this kind of heuristics in OVMF. But: >> >> - we don't have time to do it now, plus >> >> - please observe that the runtime addition in AML relates to the >> ADD_POINTER and the ALLOCATE commands. It does not relate to >> WRITE_POINTER at all. >> >> Whatever we change on WRITE_POINTER will do nothing for suppressing >> OVMF's table header probing -- because that is tied to ADD_POINTER >> --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" >> in AML. >> >> >> In summary, I think the proposed WRITE_POINTER modification is >> implementable, but I think it will not pay off, because: >> >> (a) for QEMU logic, it will not prove useful as soon as we have a >> multi-field shared structure (QEMU will have to add field offsets anyway), >> >> (b) and for eliminating the AML addition (which is a consequence of the >> current ADD_POINTER handling in OVMF), it does nothing. >> > OK Laszlo, in v7 (imminent) I went ahead and implemented this > src_offset. If you are truly dead-set against it, it’s not very hard > to remove. To me it seems pretty harmless. I'd like to hear Igor's opinion about my counter-argument above. It would be a fair bit of work in OVMF: - it introduces a new addition which has to be range checked (the sum cannot overflow either the pointer object or the size of the pointed-to blob), - if the range check fails, that's a new error condition that needs handling, - in edk2 we document error conditions carefully, so it affects documentation too, for the function that processes WRITE_POINTER, - I'd have to redo all the testing. After I post the OVMF patches, we can count on about one week of review time (assuming I don't have to post a v2, that is!). That comes pretty close to the end of February, which is when I want to do a downstream OVMF rebase. (I can't do it in March due to a conference, and then it's too late.) The downstream rebase will take a few days too. I would have preferred to post the OVMF patches near the beginning of this week. So, as I said, it is implementable, I'm not "dead-set" against it, but the price for me to pay is definitely not zero. And, as it stands, I disagree with the design -- there don't seem to be any general benefits. I'd like to hear back from Igor. If he convincingly refutes my argument, I'm game. Thanks, Laszlo
On Wed, 15 Feb 2017 21:52:55 +0100 Laszlo Ersek <lersek@redhat.com> wrote: > On 02/15/17 21:09, Michael S. Tsirkin wrote: > > On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: > > [snip] > > >> For patches #1, #3, #4 and #5: > >> > >> Tested-by: Laszlo Ersek <lersek@redhat.com> > >> > >> I'll soon post the OVMF patches. > >> > >> Thanks! > >> Laszlo > > > > > > How do you feel about Igor's request to change WRITE_POINTER to add > > offset in there, so guest can pass in the address of GUID and > > not start of table? Would that be a lot of work to add? > > I think it's doable in practice: simply add a constant from the command > itself, for passing the value back to QEMU, and also for saving the > fw_cfg write commend for S3 resume time. > > But, I disagree with it from a design POV. > > Igor's point is: > > > Math complicates QEMU code though and not only QMEMU but AML code as > > well. > > As I understand it, the goal is to push the addition to the firmware > (which is "one place"), rather than having to implement it twice in > QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). > > Here's my counter-argument: > > (a) As I mentioned earlier, assume a complex communication structure > between the guest OS and QEMU. Currently our shared structure consists > of a single field (the GUID), but next time it might contain several fields. > > For such a multi-field shared structure, QEMU will have to do manual > offsetting into the guest RAM anyway, for accessing fields F1, F2, and > F3. We will not create three separate WRITE_POINTER commands and let the > firmware calculate and return the absolute GPAs of the fields F1, F2 and > F3. Instead, there will be one WRITE_POINTER command, and QEMU will do > the offsetting manually, minimally for fields F2 and F3. > > "src_offset" looks tempting now only because we have a shared structure > with only one field, the GUID at offset 40 decimal. benefits of having src_offset from QEMU POV I see are: a) (biggest one) firmware and device code are clearly separated where: - VMGENID_GUID_OFFSET would be used only by firmware side, such as: WRITE_POINTER and AML addition to help OVMF detect non ACPI blob - device doesn't have to assume/or have a knowledge about layout of GUID blob except of size of data it's needs to access at location provided by WRITE_POINTER as v7 shows it. b) wrt shared blob I've envisioned slightly different approach, where multiple WRITE_POINTER commands are used instead of one with following workflow to extend shared blob: 1) firmware part of QEMU (acpi-build.c): if (device_foo_present) { fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage) shared_off = device_foo->align(next_free_shared_offset) WRITE_POINTER('/etc/device_foo_addr', 0, '/etc/shared_blob, shared_off) next_free_shared_offset = shared_off + device_foo->data_size; } 2) device_foo accesses data at device_foo->addr_storage directly * there is no need to spread knowledge of shared_blob layout to device code anymore. * no need to care where in shared_blob data will be placed, * shared space is used only when device is present * since there is no shared_writeback_blob, there isn't need in mechanism to propagate written data to device or notify device about write drawback in this approach is that a device would consume a file slot if fw_cfg and space for WRITE_POINTER in linker file when present. > (b) Regarding the runtime addition in the AML code: as you pointed out WRITE_POINTER has nothing to do with addition on AML side which is influenced by ADD_POINTER and OVMF and could be fixed with flags down the road, so there is nothing to argue about on this bullet. > As discussed before, the main reason *now*, for not pointing VGIA (and > other named integer objects) with ADD_POINTER commands directly to > "meaningful" fields, is that OVMF probes the targets of ADD_POINTER > commands for patterns that look like ACPI table headers. And, for the > time being, we want to suppress any mis-recognitions by prepending some > padding. > > Igor was right to dislike this, and we agreed that *down the road* we > should add allocation flags, or further allocation commands, to supplant > this kind of heuristics in OVMF. But: > > - we don't have time to do it now, plus > > - please observe that the runtime addition in AML relates to the > ADD_POINTER and the ALLOCATE commands. It does not relate to > WRITE_POINTER at all. > > Whatever we change on WRITE_POINTER will do nothing for suppressing > OVMF's table header probing -- because that is tied to ADD_POINTER > --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" > in AML. > > > In summary, I think the proposed WRITE_POINTER modification is > implementable, but I think it will not pay off, because: > > (a) for QEMU logic, it will not prove useful as soon as we have a > multi-field shared structure (QEMU will have to add field offsets anyway), > > (b) and for eliminating the AML addition (which is a consequence of the > current ADD_POINTER handling in OVMF), it does nothing. > > Thanks > Laszlo
On 02/16/17 13:08, Igor Mammedov wrote: > On Wed, 15 Feb 2017 21:52:55 +0100 > Laszlo Ersek <lersek@redhat.com> wrote: > >> On 02/15/17 21:09, Michael S. Tsirkin wrote: >>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: >> >> [snip] >> >>>> For patches #1, #3, #4 and #5: >>>> >>>> Tested-by: Laszlo Ersek <lersek@redhat.com> >>>> >>>> I'll soon post the OVMF patches. >>>> >>>> Thanks! >>>> Laszlo >>> >>> >>> How do you feel about Igor's request to change WRITE_POINTER to add >>> offset in there, so guest can pass in the address of GUID and >>> not start of table? Would that be a lot of work to add? >> >> I think it's doable in practice: simply add a constant from the command >> itself, for passing the value back to QEMU, and also for saving the >> fw_cfg write commend for S3 resume time. >> >> But, I disagree with it from a design POV. >> >> Igor's point is: >> >>> Math complicates QEMU code though and not only QMEMU but AML code as >>> well. >> >> As I understand it, the goal is to push the addition to the firmware >> (which is "one place"), rather than having to implement it twice in >> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). >> >> Here's my counter-argument: >> >> (a) As I mentioned earlier, assume a complex communication structure >> between the guest OS and QEMU. Currently our shared structure consists >> of a single field (the GUID), but next time it might contain several fields. >> >> For such a multi-field shared structure, QEMU will have to do manual >> offsetting into the guest RAM anyway, for accessing fields F1, F2, and >> F3. We will not create three separate WRITE_POINTER commands and let the >> firmware calculate and return the absolute GPAs of the fields F1, F2 and >> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do >> the offsetting manually, minimally for fields F2 and F3. >> >> "src_offset" looks tempting now only because we have a shared structure >> with only one field, the GUID at offset 40 decimal. > > benefits of having src_offset from QEMU POV I see are: > a) (biggest one) firmware and device code are clearly separated where: > - VMGENID_GUID_OFFSET would be used only by firmware side, such as: > WRITE_POINTER and AML addition to help OVMF detect non ACPI blob > - device doesn't have to assume/or have a knowledge about > layout of GUID blob except of size of data it's needs > to access at location provided by WRITE_POINTER as v7 shows it. > > b) wrt shared blob I've envisioned slightly different approach, > where multiple WRITE_POINTER commands are used instead of one > with following workflow to extend shared blob: > 1) firmware part of QEMU (acpi-build.c): > if (device_foo_present) { > fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage) > > shared_off = device_foo->align(next_free_shared_offset) > WRITE_POINTER('/etc/device_foo_addr', 0, > '/etc/shared_blob, shared_off) > > next_free_shared_offset = shared_off + device_foo->data_size; > } > 2) device_foo accesses data at device_foo->addr_storage directly > * there is no need to spread knowledge of shared_blob > layout to device code anymore. This is where I disagree, I think. Above you mention device_foo->data_size. If "data_size" covers multiple fields, then the device code itself will have to add relative offsets, for accessing those different fields in guest RAM. With the current command, the only difference is that the device code has to receive a "base offset" from the outside, pointing into the shared blob, and then add the field offsets to that. Thus the addition cannot be avoided anyway. You do have a point about sharing the same area between different devices though. The above pseudo-code looks like a good pattern. This way "acpi-build.c" won't have to hand out the shared blob offsets to existing device instances directly; instead, the blob offsets are handed down to the firmware, and the devices will get their struct base addresses from the firmware. Using one WRITE_POINTER command for that, per device, seems fine. I'll update the OVMF patches. Thanks Laszlo > * no need to care where in shared_blob data will be placed, > * shared space is used only when device is present > * since there is no shared_writeback_blob, there isn't > need in mechanism to propagate written data to device > or notify device about write > > drawback in this approach is that a device would consume > a file slot if fw_cfg and space for WRITE_POINTER in > linker file when present. > > >> (b) Regarding the runtime addition in the AML code: > as you pointed out WRITE_POINTER has nothing to do with addition > on AML side which is influenced by ADD_POINTER and OVMF and could > be fixed with flags down the road, so there is nothing to argue > about on this bullet. > > >> As discussed before, the main reason *now*, for not pointing VGIA (and >> other named integer objects) with ADD_POINTER commands directly to >> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER >> commands for patterns that look like ACPI table headers. And, for the >> time being, we want to suppress any mis-recognitions by prepending some >> padding. >> >> Igor was right to dislike this, and we agreed that *down the road* we >> should add allocation flags, or further allocation commands, to supplant >> this kind of heuristics in OVMF. But: >> >> - we don't have time to do it now, plus >> >> - please observe that the runtime addition in AML relates to the >> ADD_POINTER and the ALLOCATE commands. It does not relate to >> WRITE_POINTER at all. >> >> Whatever we change on WRITE_POINTER will do nothing for suppressing >> OVMF's table header probing -- because that is tied to ADD_POINTER >> --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" >> in AML. >> >> >> In summary, I think the proposed WRITE_POINTER modification is >> implementable, but I think it will not pay off, because: >> >> (a) for QEMU logic, it will not prove useful as soon as we have a >> multi-field shared structure (QEMU will have to add field offsets anyway), >> >> (b) and for eliminating the AML addition (which is a consequence of the >> current ADD_POINTER handling in OVMF), it does nothing. >> >> Thanks >> Laszlo >
On Thu, 16 Feb 2017 14:29:28 +0100 Laszlo Ersek <lersek@redhat.com> wrote: > On 02/16/17 13:08, Igor Mammedov wrote: > > On Wed, 15 Feb 2017 21:52:55 +0100 > > Laszlo Ersek <lersek@redhat.com> wrote: > > > >> On 02/15/17 21:09, Michael S. Tsirkin wrote: > >>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote: > >> > >> [snip] > >> > >>>> For patches #1, #3, #4 and #5: > >>>> > >>>> Tested-by: Laszlo Ersek <lersek@redhat.com> > >>>> > >>>> I'll soon post the OVMF patches. > >>>> > >>>> Thanks! > >>>> Laszlo > >>> > >>> > >>> How do you feel about Igor's request to change WRITE_POINTER to add > >>> offset in there, so guest can pass in the address of GUID and > >>> not start of table? Would that be a lot of work to add? > >> > >> I think it's doable in practice: simply add a constant from the command > >> itself, for passing the value back to QEMU, and also for saving the > >> fw_cfg write commend for S3 resume time. > >> > >> But, I disagree with it from a design POV. > >> > >> Igor's point is: > >> > >>> Math complicates QEMU code though and not only QMEMU but AML code as > >>> well. > >> > >> As I understand it, the goal is to push the addition to the firmware > >> (which is "one place"), rather than having to implement it twice in > >> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator). > >> > >> Here's my counter-argument: > >> > >> (a) As I mentioned earlier, assume a complex communication structure > >> between the guest OS and QEMU. Currently our shared structure consists > >> of a single field (the GUID), but next time it might contain several fields. > >> > >> For such a multi-field shared structure, QEMU will have to do manual > >> offsetting into the guest RAM anyway, for accessing fields F1, F2, and > >> F3. We will not create three separate WRITE_POINTER commands and let the > >> firmware calculate and return the absolute GPAs of the fields F1, F2 and > >> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do > >> the offsetting manually, minimally for fields F2 and F3. > >> > >> "src_offset" looks tempting now only because we have a shared structure > >> with only one field, the GUID at offset 40 decimal. > > > > benefits of having src_offset from QEMU POV I see are: > > a) (biggest one) firmware and device code are clearly separated where: > > - VMGENID_GUID_OFFSET would be used only by firmware side, such as: > > WRITE_POINTER and AML addition to help OVMF detect non ACPI blob > > - device doesn't have to assume/or have a knowledge about > > layout of GUID blob except of size of data it's needs > > to access at location provided by WRITE_POINTER as v7 shows it. > > > > b) wrt shared blob I've envisioned slightly different approach, > > where multiple WRITE_POINTER commands are used instead of one > > with following workflow to extend shared blob: > > 1) firmware part of QEMU (acpi-build.c): > > if (device_foo_present) { > > fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage) > > > > shared_off = device_foo->align(next_free_shared_offset) > > WRITE_POINTER('/etc/device_foo_addr', 0, > > '/etc/shared_blob, shared_off) > > > > next_free_shared_offset = shared_off + device_foo->data_size; > > } > > 2) device_foo accesses data at device_foo->addr_storage directly > > * there is no need to spread knowledge of shared_blob > > layout to device code anymore. > > This is where I disagree, I think. Above you mention > device_foo->data_size. If "data_size" covers multiple fields, then the > device code itself will have to add relative offsets, for accessing > those different fields in guest RAM. > > With the current command, the only difference is that the device code > has to receive a "base offset" from the outside, pointing into the > shared blob, and then add the field offsets to that. Thus the addition > cannot be avoided anyway. instead of explicit offsets for base offset, device will probably use struct cast to access fields. > You do have a point about sharing the same area between different > devices though. The above pseudo-code looks like a good pattern. This > way "acpi-build.c" won't have to hand out the shared blob offsets to > existing device instances directly; instead, the blob offsets are handed > down to the firmware, and the devices will get their struct base > addresses from the firmware. Using one WRITE_POINTER command for that, > per device, seems fine. > > I'll update the OVMF patches. Thanks! > > Thanks > Laszlo > > > * no need to care where in shared_blob data will be placed, > > * shared space is used only when device is present > > * since there is no shared_writeback_blob, there isn't > > need in mechanism to propagate written data to device > > or notify device about write > > > > drawback in this approach is that a device would consume > > a file slot if fw_cfg and space for WRITE_POINTER in > > linker file when present. > > > > > >> (b) Regarding the runtime addition in the AML code: > > as you pointed out WRITE_POINTER has nothing to do with addition > > on AML side which is influenced by ADD_POINTER and OVMF and could > > be fixed with flags down the road, so there is nothing to argue > > about on this bullet. > > > > > >> As discussed before, the main reason *now*, for not pointing VGIA (and > >> other named integer objects) with ADD_POINTER commands directly to > >> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER > >> commands for patterns that look like ACPI table headers. And, for the > >> time being, we want to suppress any mis-recognitions by prepending some > >> padding. > >> > >> Igor was right to dislike this, and we agreed that *down the road* we > >> should add allocation flags, or further allocation commands, to supplant > >> this kind of heuristics in OVMF. But: > >> > >> - we don't have time to do it now, plus > >> > >> - please observe that the runtime addition in AML relates to the > >> ADD_POINTER and the ALLOCATE commands. It does not relate to > >> WRITE_POINTER at all. > >> > >> Whatever we change on WRITE_POINTER will do nothing for suppressing > >> OVMF's table header probing -- because that is tied to ADD_POINTER > >> --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add" > >> in AML. > >> > >> > >> In summary, I think the proposed WRITE_POINTER modification is > >> implementable, but I think it will not pay off, because: > >> > >> (a) for QEMU logic, it will not prove useful as soon as we have a > >> multi-field shared structure (QEMU will have to add field offsets anyway), > >> > >> (b) and for eliminating the AML addition (which is a consequence of the > >> current ADD_POINTER handling in OVMF), it does nothing. > >> > >> Thanks > >> Laszlo > > >
© 2016 - 2024 Red Hat, Inc.