[Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID

ben@skyportsystems.com posted 7 patches 7 years, 1 month ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/cover.1487139038.git.ben@skyportsystems.com
Test checkpatch passed
Test docker passed
Test s390x failed
There is a newer version of this series
default-configs/i386-softmmu.mak     |   1 +
default-configs/x86_64-softmmu.mak   |   1 +
docs/specs/vmgenid.txt               | 245 +++++++++++++++++++++++++++++++++
hmp-commands-info.hx                 |  13 ++
hmp.c                                |   9 ++
hmp.h                                |   1 +
hw/acpi/Makefile.objs                |   1 +
hw/acpi/aml-build.c                  |   2 +
hw/acpi/bios-linker-loader.c         |  58 +++++++-
hw/acpi/vmgenid.c                    | 253 +++++++++++++++++++++++++++++++++++
hw/i386/acpi-build.c                 |  16 +++
include/hw/acpi/acpi_dev_interface.h |   1 +
include/hw/acpi/aml-build.h          |   1 +
include/hw/acpi/bios-linker-loader.h |   6 +
include/hw/acpi/vmgenid.h            |  35 +++++
qapi-schema.json                     |  20 +++
stubs/Makefile.objs                  |   1 +
stubs/vmgenid.c                      |   8 ++
tests/Makefile.include               |   2 +
tests/acpi-utils.h                   |  75 +++++++++++
tests/bios-tables-test.c             |  72 +---------
tests/vmgenid-test.c                 | 195 +++++++++++++++++++++++++++
22 files changed, 942 insertions(+), 74 deletions(-)
create mode 100644 docs/specs/vmgenid.txt
create mode 100644 hw/acpi/vmgenid.c
create mode 100644 include/hw/acpi/vmgenid.h
create mode 100644 stubs/vmgenid.c
create mode 100644 tests/acpi-utils.h
create mode 100644 tests/vmgenid-test.c
[Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by ben@skyportsystems.com 7 years, 1 month ago
From: Ben Warren <ben@skyportsystems.com>

This patch set adds support for passing a GUID to Windows guests.  It is a
re-implementation of previous patch sets written by Igor Mammedov et al, but
this time passing the GUID data as a fw_cfg blob.

This patch set has dependencies on new guest functionality, in particular the
support for a new linker-loader command and the ability to write back data
to QEMU over a DMA link.  Work is in flight in both SeaBIOS and OVMF to support this.

v5->v6:
    - Rebased to top of tree.
    - Changed device from sysbus to a simple device.  This removed the need for
      adding dynamic sysbus support to pc_piix boards.
    - Removed patch that introduced QWORD patching of AML.
    - Removed ability to set GUID via QMP/HMP.
    - Improved comments/documentation in code.

v4->v5:
    - Added significantly more detail to the documentation.
    - Replaced the previously-implemented linker-loader command with a new one:
      "write pointer".  This allows writing the guest address of a fw_cfg blob back
      to an arbitrary offset in a writeable fw_cfg file visible to QEMU.  This will
      require support in SeaBIOS and OVMF (ongoing).
    - Fixed endianness issues throughout.
    - Several styling cleanups.

v3->v4:
    - Rebased to top of tree.
    - Re-added document patch that was accidentally dropped from the last revision.
    - Added VMState functionality so that VGIA is restored properly.
    - Added Unit tests
v2->v3:
    - Added second writeable fw_cfg for storing the VM Generaiton ID
      address.  This uses a new linker-loader command for instructing the
      guest to write back the allocated address.  A patch for SeaBIOS has been
      submitted (https://www.seabios.org/pipermail/seabios/2017-January/011079.html)
      and the resulting binary will need to be pulled into QEMU once accepted.
    - Setting VM Generation ID by command line or qmp/hmp now accepts an "auto"
      value, whereby QEMU generates a random GUID.
    - Incorporated review comments from v2 mainly around code styling and AML syntax
    - Changed to use the E05 ACPI event instead of E00
v1->v2:
    - Removed "changed" boolean parameter as it is unneeded
    - Added ACPI Notify logic
    - Style changes to pass checkpatch.pl
    - Added support for dynamic sysbus to pc_piix boards

This patch set adds support for passing a GUID to Windows guests.  It is a
re-implementation of previous patch sets written by Igor Mammedov et al, but
this time passing the GUID data as a fw_cfg blob.

This patch set has dependencies on new guest functionality, in particular the
support for a new linker-loader command and the ability to write back data
to QEMU over a DMA link.  Work is in flight in both SeaBIOS and OVMF to support this.

v5->v6:
    - Rebased to top of tree.
    - Changed device from sysbus to a simple device.  This removed the need for
      adding dynamic sysbus support to pc_piix boards.
    - Removed patch that introduced QWORD patching of AML.
    - Removed ability to set GUID via QMP/HMP.
    - Improved comments/documentation in code.

v4->v5:
    - Added significantly more detail to the documentation.
    - Replaced the previously-implemented linker-loader command with a new one:
      "write pointer".  This allows writing the guest address of a fw_cfg blob back
      to an arbitrary offset in a writeable fw_cfg file visible to QEMU.  This will
      require support in SeaBIOS and OVMF (ongoing).
    - Fixed endianness issues throughout.
    - Several styling cleanups.

v3->v4:
    - Rebased to top of tree.
    - Re-added document patch that was accidentally dropped from the last revision.
    - Added VMState functionality so that VGIA is restored properly.
    - Added Unit tests
v2->v3:
    - Added second writeable fw_cfg for storing the VM Generaiton ID
      address.  This uses a new linker-loader command for instructing the
      guest to write back the allocated address.  A patch for SeaBIOS has been
      submitted (https://www.seabios.org/pipermail/seabios/2017-January/011079.html)
      and the resulting binary will need to be pulled into QEMU once accepted.
    - Setting VM Generation ID by command line or qmp/hmp now accepts an "auto"
      value, whereby QEMU generates a random GUID.
    - Incorporated review comments from v2 mainly around code styling and AML syntax
    - Changed to use the E05 ACPI event instead of E00
v1->v2:
    - Removed "changed" boolean parameter as it is unneeded
    - Added ACPI Notify logic
    - Style changes to pass checkpatch.pl
    - Added support for dynamic sysbus to pc_piix boards


Ben Warren (6):
  linker-loader: Add new 'write pointer' command
  docs: VM Generation ID device description
  ACPI: Add vmgenid blob storage to the build tables
  ACPI: Add Virtual Machine Generation ID support
  tests: Move reusable ACPI macros into a new header file
  tests: Add unit tests for the VM Generation ID feature

Igor Mammedov (1):
  qmp/hmp: add query-vm-generation-id and 'info vm-generation-id'
    commands

 default-configs/i386-softmmu.mak     |   1 +
 default-configs/x86_64-softmmu.mak   |   1 +
 docs/specs/vmgenid.txt               | 245 +++++++++++++++++++++++++++++++++
 hmp-commands-info.hx                 |  13 ++
 hmp.c                                |   9 ++
 hmp.h                                |   1 +
 hw/acpi/Makefile.objs                |   1 +
 hw/acpi/aml-build.c                  |   2 +
 hw/acpi/bios-linker-loader.c         |  58 +++++++-
 hw/acpi/vmgenid.c                    | 253 +++++++++++++++++++++++++++++++++++
 hw/i386/acpi-build.c                 |  16 +++
 include/hw/acpi/acpi_dev_interface.h |   1 +
 include/hw/acpi/aml-build.h          |   1 +
 include/hw/acpi/bios-linker-loader.h |   6 +
 include/hw/acpi/vmgenid.h            |  35 +++++
 qapi-schema.json                     |  20 +++
 stubs/Makefile.objs                  |   1 +
 stubs/vmgenid.c                      |   8 ++
 tests/Makefile.include               |   2 +
 tests/acpi-utils.h                   |  75 +++++++++++
 tests/bios-tables-test.c             |  72 +---------
 tests/vmgenid-test.c                 | 195 +++++++++++++++++++++++++++
 22 files changed, 942 insertions(+), 74 deletions(-)
 create mode 100644 docs/specs/vmgenid.txt
 create mode 100644 hw/acpi/vmgenid.c
 create mode 100644 include/hw/acpi/vmgenid.h
 create mode 100644 stubs/vmgenid.c
 create mode 100644 tests/acpi-utils.h
 create mode 100644 tests/vmgenid-test.c

-- 
2.7.4


Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Laszlo Ersek 7 years, 1 month ago
On 02/15/17 07:15, ben@skyportsystems.com wrote:
> From: Ben Warren <ben@skyportsystems.com>
>
> This patch set adds support for passing a GUID to Windows guests.  It
> is a re-implementation of previous patch sets written by Igor Mammedov
> et al, but this time passing the GUID data as a fw_cfg blob.
>
> This patch set has dependencies on new guest functionality, in
> particular the support for a new linker-loader command and the ability
> to write back data to QEMU over a DMA link.  Work is in flight in both
> SeaBIOS and OVMF to support this.
>
> v5->v6:
>     - Rebased to top of tree.
>     - Changed device from sysbus to a simple device.  This removed the need for
>       adding dynamic sysbus support to pc_piix boards.
>     - Removed patch that introduced QWORD patching of AML.
>     - Removed ability to set GUID via QMP/HMP.
>     - Improved comments/documentation in code.

So here's my testing with a RHEL-7 guest:

(1) The command line option passed to QEMU is

  -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF

This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134),
section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it
codifies the fact that Microsoft consumes UUID in little-endian order.)
The expected representation, according to the SMBIOS spec, is

  33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF

(2) Here's an excerpt from the OVMF log:

> ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000

This is where "etc/vmgenid_guid" is allocated and downloaded, the
allocation address is 0x7FE5C000.

> Select Item: 0x19
> Select Item: 0x22
> ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000
> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403
> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4
> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4
> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74
> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80
> Select Item: 0x19
> SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0)

This is where OVMF stashes the WRITE_POINTER command in "condensed"
form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to
be rewritten), the pointer is located at offset 0, has size 0, and the
value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed
WRITE_POINTER commands.

> Select Item: 0x2B
> ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8

This is where the WRITE_POINTER command is actually executed, during
normal boot.

> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4

This is where we link "etc/vmgenid_guid" into VGIA.

> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA
> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4
> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4
> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4
> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30
> ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4
> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> InstallQemuFwCfgTables: unknown loader command: 0x0
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque

This is where the OVMF SDT Header Probe Suppressor does its job. (NB,
the "opaque marking" has happened already in ProcessCmdWritePointer()
too, above.)

> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA
> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30
> TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018

This is where the WRITE_POINTER commands, stashed earlier in condensed
form, are translated to S3 Boot Script opcodes.

> InstallQemuFwCfgTables: installed 5 tables

Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores
it (it's handled by edk2 automatically).

> InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
> InstallQemuFwCfgTables: freeing "etc/acpi/tables"

OVMF sees that the above two blobs have not been marked as "opaque" --
they only contained ACPI tables, judged from the ADD_POINTER commands
that pointed into them. So these two blobs are freed.

Note that "etc/vmgenid_guid" is not freed.

So, from the firmware log, everything looks OK.

(3) I dumped the SSDT in the RHEL-7 guest:

> /*
>  * Intel ACPI Component Architecture
>  * AML/ASL+ Disassembler version 20160527-64
>  * Copyright (c) 2000 - 2016 Intel Corporation
>  *
>  * Disassembling to symbolic ASL+ operators
>  *
>  * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017
>  *
>  * Original Table Header:
>  *     Signature        "SSDT"
>  *     Length           0x000000CA (202)
>  *     Revision         0x01
>  *     Checksum         0x1D
>  *     OEM ID           "BOCHS "
>  *     OEM Table ID     "VMGENID"
>  *     OEM Revision     0x00000001 (1)
>  *     Compiler ID      "BXPC"
>  *     Compiler Version 0x00000001 (1)
>  */
> DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
> {
>     Name (VGIA, 0x7FE5C000)

Note that the value matches the value logged by the firmware in (2).

>     Scope (\_SB)
>     {
>         Device (VGEN)
>         {
>             Name (_HID, "QEMUVGID")  // _HID: Hardware ID
>             Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
>             Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
>             Method (_STA, 0, NotSerialized)  // _STA: Status
>             {
>                 Local0 = 0x0F
>                 If (VGIA == Zero)
>                 {
>                     Local0 = Zero
>                 }
>
>                 Return (Local0)
>             }
>
>             Method (ADDR, 0, NotSerialized)
>             {
>                 Local0 = Package (0x02) {}
>                 Local0 [Zero] = (VGIA + 0x28)
>                 Local0 [One] = Zero
>                 Return (Local0)
>             }
>         }
>     }
>
>     Method (\_GPE._E05, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
>     {
>         Notify (\_SB.VGEN, 0x80) // Status Change
>     }
> }

Looks good and matches the documentation.

(4) To be sure, I checked the address against the guest dmesg, which
contains a dump of the UEFI memory map:

> [    0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB)

The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means
EfiACPIMemoryNVS.

(5) At this point I dumped the guest RAM with the dump-guest-memory
monitor command, opened it with "crash", and listed it:

> crash> rd -p -8 0x7FE5C000 0x40
>         7fe5c000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
>         7fe5c010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
>         7fe5c020:  00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66   ........3"..UDwf
>         7fe5c030:  88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00   ................

We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the
byte-level representation matches the little endian one given in (1).

This proves that the initial blob download worked fine.

(6) Here I attached "gdb" to QEMU, set a breakpoint on
vmgenid_handle_reset(), allowed the inferior process to continue
execution.

Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit
during resume:

> Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205
> 205         VmGenIdState *vms = VMGENID(opaque);

First of all, before allowing QEMU to zero out the address blob, I
listed the address and the contents of the address blob (here exploiting
that my host is also little endian):

> (gdb) print (void*)vms->vmgenid_addr_le
> $2 = (void *) 0x7f2bd03c37b0

> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> $4 = 0x7fe5c000

This proves that QEMU has the right address, matching the firmware log
from (2), and the ACPI dump from (3).

(7) At this point I allowed the inferior to proceed a bit:

> (gdb) n
> 207         memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le));
> (gdb) n
> 208     }

I verified that the blob was zeroed:

> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> $5 = 0x0

then allowed the inferior to run free.

> (gdb) cont
> Continuing.

(8) New messages appeared in the firmware log:

> S3ResumeExecuteBootScript()
> PeiS3ResumeState - 7FF92B18
> transfer control to Standalone Boot Script Executor
> S3BootScriptExecute:
> TableHeader - 0x7E7A7000
> TableHeader.Version - 0x0001
> TableHeader.TableLength - 0x000000ED
> ExecuteBootScript - 7E7A700D
> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000

Here the ACPI S3 Boot Script, prepared in
TransferS3ContextToBootScript() -- see (2) -- creates a DMA access
command for fw_cfg. The DMA access command is written to pre-reserved
memory (see "ScratchBuffer" above).

> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B)

The fw_cfg selector is 0x2B. (See under (2).)

> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C)

This is a combined select+skip operation.

> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01F (0x00)

The skip size is 0 bytes.

> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F024 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F025 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F026 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F027 (0x00)

The address is irrelevant for skip, so it's just nuleld.

> ExecuteBootScript - 7E7A7030
> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)

The Boot Script passes the DMA command to QEMU, by writing the address
of the command buffer to IO ports 0x514 and 0x518, in BE byte order.

> ExecuteBootScript - 7E7A704B
> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
> S3BootScriptWidthUint32 - 0x7FE4F018
> ExecuteBootScript - 7E7A7072

This waits until the DMA command succeeds (reading back the Control
field continuously until it reads as zero).

> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000

This is another DMA access command for fw_cfg, prepared in the same
pre-reserved buffer. This time

> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F019 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01B (0x10)

we request a write operation,

> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F01F (0x08)

with a length of 8 bytes (big endian), matching the pointer size,

> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F)
> S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4)
> S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0)
> S3BootScriptWidthUint8 - 0x7FE4F027 (0x28)

the data to transfer is located at 0x7FE4F028 (just below, tacked to the
command buffer itself),

> S3BootScriptWidthUint8 - 0x7FE4F028 (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0)
> S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5)
> S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F)
> S3BootScriptWidthUint8 - 0x7FE4F02C (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F02D (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F02E (0x00)
> S3BootScriptWidthUint8 - 0x7FE4F02F (0x00)

and the data to write is the original allocation address of the blob
(0x7fe5c000).

> ExecuteBootScript - 7E7A709D
> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
> ExecuteBootScript - 7E7A70B8
> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
> S3BootScriptWidthUint32 - 0x7FE4F018
> ExecuteBootScript - 7E7A70DF

Same story as above: fire off the transfer and wait until it completes.

> EFI_BOOT_SCRIPT_INFORMATION_OPCODE
> BootScriptExecuteInformation - 0x7E7A70E6
> BootScriptInformation: DE AD BE EF
> ExecuteBootScript - 7E7A70EA
> S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE
> S3BootScriptDone - Success
> [...]

The DEADBEEF informational (no-op) opcode is something that OVMF appends
to the very end for hysterical raisins.

(9) Okay, so the guest is now resumed and running, let's interrupt it in
gdb again, and check the contents of address blob again (we know the
address of the address blob from step (6)):

> ^C
> Program received signal SIGINT, Interrupt.
> 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6
> (gdb) print /x *(uint64_t*)0x7f2bd03c37b0
> $6 = 0x7fe5c000

Et voila.

(10) I detached gdb from QEMU, and issued the following monitor command:

> $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
> 00112233-4455-6677-8899-aabbccddeeff

(11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI
enabled) with a similar vmgenid device/parameter. According to Device
Manager | System devices, "Microsoft Hyper-V Generation Counter" is
working properly.

I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast
above because for that, OVMF generates an independent S3 Boot Script
fragment.)


I'll let someone else test live migration.

For patches #1, #3, #4 and #5:

Tested-by: Laszlo Ersek <lersek@redhat.com>

I'll soon post the OVMF patches.

Thanks!
Laszlo

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Michael S. Tsirkin 7 years, 1 month ago
On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
> On 02/15/17 07:15, ben@skyportsystems.com wrote:
> > From: Ben Warren <ben@skyportsystems.com>
> >
> > This patch set adds support for passing a GUID to Windows guests.  It
> > is a re-implementation of previous patch sets written by Igor Mammedov
> > et al, but this time passing the GUID data as a fw_cfg blob.
> >
> > This patch set has dependencies on new guest functionality, in
> > particular the support for a new linker-loader command and the ability
> > to write back data to QEMU over a DMA link.  Work is in flight in both
> > SeaBIOS and OVMF to support this.
> >
> > v5->v6:
> >     - Rebased to top of tree.
> >     - Changed device from sysbus to a simple device.  This removed the need for
> >       adding dynamic sysbus support to pc_piix boards.
> >     - Removed patch that introduced QWORD patching of AML.
> >     - Removed ability to set GUID via QMP/HMP.
> >     - Improved comments/documentation in code.
> 
> So here's my testing with a RHEL-7 guest:
> 
> (1) The command line option passed to QEMU is
> 
>   -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF
> 
> This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134),
> section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it
> codifies the fact that Microsoft consumes UUID in little-endian order.)
> The expected representation, according to the SMBIOS spec, is
> 
>   33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
> 
> (2) Here's an excerpt from the OVMF log:
> 
> > ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000
> 
> This is where "etc/vmgenid_guid" is allocated and downloaded, the
> allocation address is 0x7FE5C000.
> 
> > Select Item: 0x19
> > Select Item: 0x22
> > ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80
> > Select Item: 0x19
> > SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0)
> 
> This is where OVMF stashes the WRITE_POINTER command in "condensed"
> form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to
> be rewritten), the pointer is located at offset 0, has size 0, and the
> value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed
> WRITE_POINTER commands.
> 
> > Select Item: 0x2B
> > ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8
> 
> This is where the WRITE_POINTER command is actually executed, during
> normal boot.
> 
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4
> 
> This is where we link "etc/vmgenid_guid" into VGIA.
> 
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4
> > ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30
> > ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4
> > ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > InstallQemuFwCfgTables: unknown loader command: 0x0
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque
> 
> This is where the OVMF SDT Header Probe Suppressor does its job. (NB,
> the "opaque marking" has happened already in ProcessCmdWritePointer()
> too, above.)
> 
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA
> > Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30
> > TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018
> 
> This is where the WRITE_POINTER commands, stashed earlier in condensed
> form, are translated to S3 Boot Script opcodes.
> 
> > InstallQemuFwCfgTables: installed 5 tables
> 
> Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores
> it (it's handled by edk2 automatically).
> 
> > InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
> > InstallQemuFwCfgTables: freeing "etc/acpi/tables"
> 
> OVMF sees that the above two blobs have not been marked as "opaque" --
> they only contained ACPI tables, judged from the ADD_POINTER commands
> that pointed into them. So these two blobs are freed.
> 
> Note that "etc/vmgenid_guid" is not freed.
> 
> So, from the firmware log, everything looks OK.
> 
> (3) I dumped the SSDT in the RHEL-7 guest:
> 
> > /*
> >  * Intel ACPI Component Architecture
> >  * AML/ASL+ Disassembler version 20160527-64
> >  * Copyright (c) 2000 - 2016 Intel Corporation
> >  *
> >  * Disassembling to symbolic ASL+ operators
> >  *
> >  * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017
> >  *
> >  * Original Table Header:
> >  *     Signature        "SSDT"
> >  *     Length           0x000000CA (202)
> >  *     Revision         0x01
> >  *     Checksum         0x1D
> >  *     OEM ID           "BOCHS "
> >  *     OEM Table ID     "VMGENID"
> >  *     OEM Revision     0x00000001 (1)
> >  *     Compiler ID      "BXPC"
> >  *     Compiler Version 0x00000001 (1)
> >  */
> > DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
> > {
> >     Name (VGIA, 0x7FE5C000)
> 
> Note that the value matches the value logged by the firmware in (2).
> 
> >     Scope (\_SB)
> >     {
> >         Device (VGEN)
> >         {
> >             Name (_HID, "QEMUVGID")  // _HID: Hardware ID
> >             Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
> >             Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
> >             Method (_STA, 0, NotSerialized)  // _STA: Status
> >             {
> >                 Local0 = 0x0F
> >                 If (VGIA == Zero)
> >                 {
> >                     Local0 = Zero
> >                 }
> >
> >                 Return (Local0)
> >             }
> >
> >             Method (ADDR, 0, NotSerialized)
> >             {
> >                 Local0 = Package (0x02) {}
> >                 Local0 [Zero] = (VGIA + 0x28)
> >                 Local0 [One] = Zero
> >                 Return (Local0)
> >             }
> >         }
> >     }
> >
> >     Method (\_GPE._E05, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
> >     {
> >         Notify (\_SB.VGEN, 0x80) // Status Change
> >     }
> > }
> 
> Looks good and matches the documentation.
> 
> (4) To be sure, I checked the address against the guest dmesg, which
> contains a dump of the UEFI memory map:
> 
> > [    0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB)
> 
> The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means
> EfiACPIMemoryNVS.
> 
> (5) At this point I dumped the guest RAM with the dump-guest-memory
> monitor command, opened it with "crash", and listed it:
> 
> > crash> rd -p -8 0x7FE5C000 0x40
> >         7fe5c000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
> >         7fe5c010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
> >         7fe5c020:  00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66   ........3"..UDwf
> >         7fe5c030:  88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00   ................
> 
> We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the
> byte-level representation matches the little endian one given in (1).
> 
> This proves that the initial blob download worked fine.
> 
> (6) Here I attached "gdb" to QEMU, set a breakpoint on
> vmgenid_handle_reset(), allowed the inferior process to continue
> execution.
> 
> Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit
> during resume:
> 
> > Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205
> > 205         VmGenIdState *vms = VMGENID(opaque);
> 
> First of all, before allowing QEMU to zero out the address blob, I
> listed the address and the contents of the address blob (here exploiting
> that my host is also little endian):
> 
> > (gdb) print (void*)vms->vmgenid_addr_le
> > $2 = (void *) 0x7f2bd03c37b0
> 
> > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> > $4 = 0x7fe5c000
> 
> This proves that QEMU has the right address, matching the firmware log
> from (2), and the ACPI dump from (3).
> 
> (7) At this point I allowed the inferior to proceed a bit:
> 
> > (gdb) n
> > 207         memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le));
> > (gdb) n
> > 208     }
> 
> I verified that the blob was zeroed:
> 
> > (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
> > $5 = 0x0
> 
> then allowed the inferior to run free.
> 
> > (gdb) cont
> > Continuing.
> 
> (8) New messages appeared in the firmware log:
> 
> > S3ResumeExecuteBootScript()
> > PeiS3ResumeState - 7FF92B18
> > transfer control to Standalone Boot Script Executor
> > S3BootScriptExecute:
> > TableHeader - 0x7E7A7000
> > TableHeader.Version - 0x0001
> > TableHeader.TableLength - 0x000000ED
> > ExecuteBootScript - 7E7A700D
> > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000
> 
> Here the ACPI S3 Boot Script, prepared in
> TransferS3ContextToBootScript() -- see (2) -- creates a DMA access
> command for fw_cfg. The DMA access command is written to pre-reserved
> memory (see "ScratchBuffer" above).
> 
> > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B)
> 
> The fw_cfg selector is 0x2B. (See under (2).)
> 
> > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C)
> 
> This is a combined select+skip operation.
> 
> > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01F (0x00)
> 
> The skip size is 0 bytes.
> 
> > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F024 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F025 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F026 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F027 (0x00)
> 
> The address is irrelevant for skip, so it's just nuleld.
> 
> > ExecuteBootScript - 7E7A7030
> > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> > S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
> 
> The Boot Script passes the DMA command to QEMU, by writing the address
> of the command buffer to IO ports 0x514 and 0x518, in BE byte order.
> 
> > ExecuteBootScript - 7E7A704B
> > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
> > S3BootScriptWidthUint32 - 0x7FE4F018
> > ExecuteBootScript - 7E7A7072
> 
> This waits until the DMA command succeeds (reading back the Control
> field continuously until it reads as zero).
> 
> > EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
> > BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000
> 
> This is another DMA access command for fw_cfg, prepared in the same
> pre-reserved buffer. This time
> 
> > S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F019 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01B (0x10)
> 
> we request a write operation,
> 
> > S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F01F (0x08)
> 
> with a length of 8 bytes (big endian), matching the pointer size,
> 
> > S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F)
> > S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4)
> > S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0)
> > S3BootScriptWidthUint8 - 0x7FE4F027 (0x28)
> 
> the data to transfer is located at 0x7FE4F028 (just below, tacked to the
> command buffer itself),
> 
> > S3BootScriptWidthUint8 - 0x7FE4F028 (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0)
> > S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5)
> > S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F)
> > S3BootScriptWidthUint8 - 0x7FE4F02C (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02D (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02E (0x00)
> > S3BootScriptWidthUint8 - 0x7FE4F02F (0x00)
> 
> and the data to write is the original allocation address of the blob
> (0x7fe5c000).
> 
> > ExecuteBootScript - 7E7A709D
> > EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
> > BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
> > S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
> > S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
> > ExecuteBootScript - 7E7A70B8
> > EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
> > BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
> > S3BootScriptWidthUint32 - 0x7FE4F018
> > ExecuteBootScript - 7E7A70DF
> 
> Same story as above: fire off the transfer and wait until it completes.
> 
> > EFI_BOOT_SCRIPT_INFORMATION_OPCODE
> > BootScriptExecuteInformation - 0x7E7A70E6
> > BootScriptInformation: DE AD BE EF
> > ExecuteBootScript - 7E7A70EA
> > S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE
> > S3BootScriptDone - Success
> > [...]
> 
> The DEADBEEF informational (no-op) opcode is something that OVMF appends
> to the very end for hysterical raisins.
> 
> (9) Okay, so the guest is now resumed and running, let's interrupt it in
> gdb again, and check the contents of address blob again (we know the
> address of the address blob from step (6)):
> 
> > ^C
> > Program received signal SIGINT, Interrupt.
> > 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6
> > (gdb) print /x *(uint64_t*)0x7f2bd03c37b0
> > $6 = 0x7fe5c000
> 
> Et voila.
> 
> (10) I detached gdb from QEMU, and issued the following monitor command:
> 
> > $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
> > 00112233-4455-6677-8899-aabbccddeeff
> 
> (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI
> enabled) with a similar vmgenid device/parameter. According to Device
> Manager | System devices, "Microsoft Hyper-V Generation Counter" is
> working properly.
> 
> I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast
> above because for that, OVMF generates an independent S3 Boot Script
> fragment.)
> 
> 
> I'll let someone else test live migration.
> 
> For patches #1, #3, #4 and #5:
> 
> Tested-by: Laszlo Ersek <lersek@redhat.com>
> 
> I'll soon post the OVMF patches.
> 
> Thanks!
> Laszlo


How do you feel about Igor's request to change WRITE_POINTER to add
offset in there, so guest can pass in the address of GUID and
not start of table? Would that be a lot of work to add?

-- 
MST

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Ben Warren 7 years, 1 month ago
> On Feb 15, 2017, at 12:09 PM, Michael S. Tsirkin <mst@redhat.com> wrote:
> 
> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
>> On 02/15/17 07:15, ben@skyportsystems.com wrote:
>>> From: Ben Warren <ben@skyportsystems.com>
>>> 
>>> This patch set adds support for passing a GUID to Windows guests.  It
>>> is a re-implementation of previous patch sets written by Igor Mammedov
>>> et al, but this time passing the GUID data as a fw_cfg blob.
>>> 
>>> This patch set has dependencies on new guest functionality, in
>>> particular the support for a new linker-loader command and the ability
>>> to write back data to QEMU over a DMA link.  Work is in flight in both
>>> SeaBIOS and OVMF to support this.
>>> 
>>> v5->v6:
>>>    - Rebased to top of tree.
>>>    - Changed device from sysbus to a simple device.  This removed the need for
>>>      adding dynamic sysbus support to pc_piix boards.
>>>    - Removed patch that introduced QWORD patching of AML.
>>>    - Removed ability to set GUID via QMP/HMP.
>>>    - Improved comments/documentation in code.
>> 
>> So here's my testing with a RHEL-7 guest:
>> 
>> (1) The command line option passed to QEMU is
>> 
>>  -device vmgenid,guid=00112233-4455-6677-8899-AABBCCDDEEFF
>> 
>> This is the example GUID provided in the SMBIOS spec v3.0.0 (DSP0134),
>> section 7.2.1 "System -- UUID". (SMBIOS is only relevant here because it
>> codifies the fact that Microsoft consumes UUID in little-endian order.)
>> The expected representation, according to the SMBIOS spec, is
>> 
>>  33 22 11 00 55 44 77 66 88 99 AA BB CC DD EE FF
>> 
>> (2) Here's an excerpt from the OVMF log:
>> 
>>> ProcessCmdAllocate: File="etc/vmgenid_guid" Alignment=0x1000 Zone=1 Size=0x1000 Address=0x7FE5C000
>> 
>> This is where "etc/vmgenid_guid" is allocated and downloaded, the
>> allocation address is 0x7FE5C000.
>> 
>>> Select Item: 0x19
>>> Select Item: 0x22
>>> ProcessCmdAllocate: File="etc/acpi/tables" Alignment=0x40 Zone=1 Size=0x20000 Address=0x7E7AB000
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x49 Start=0x40 Length=0x1403
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1467 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x146B PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x144C Start=0x1443 Length=0x74
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x14C0 Start=0x14B7 Length=0x80
>>> Select Item: 0x19
>>> SaveCondensedWritePointerToS3Context: 0x002B/[0x00000000+8] := 0x7FE5C000 (0)
>> 
>> This is where OVMF stashes the WRITE_POINTER command in "condensed"
>> form, for S3. The fw_cfg selector value is 0x2B (for the fw_cfg file to
>> be rewritten), the pointer is located at offset 0, has size 0, and the
>> value to assign is 0x7FE5C000. And, this is #0 of the saved / condensed
>> WRITE_POINTER commands.
>> 
>>> Select Item: 0x2B
>>> ProcessCmdWritePointer: PointerFile="etc/vmgenid_addr" PointeeFile="etc/vmgenid_guid" PointerOffset=0x0 PointerSize=8
>> 
>> This is where the WRITE_POINTER command is actually executed, during
>> normal boot.
>> 
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/vmgenid_guid" PointerOffset=0x1561 PointerSize=4
>> 
>> This is where we link "etc/vmgenid_guid" into VGIA.
>> 
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x1540 Start=0x1537 Length=0xCA
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1625 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x1629 PointerSize=4
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/tables" PointeeFile="etc/acpi/tables" PointerOffset=0x162D PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/tables" ResultOffset=0x160A Start=0x1601 Length=0x30
>>> ProcessCmdAddPointer: PointerFile="etc/acpi/rsdp" PointeeFile="etc/acpi/tables" PointerOffset=0x10 PointerSize=4
>>> ProcessCmdAddChecksum: File="etc/acpi/rsdp" ResultOffset=0x8 Start=0x0 Length=0x24
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> InstallQemuFwCfgTables: unknown loader command: 0x0
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB000 (remaining: 0x20000): found "FACS" size 0x40
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AB040 (remaining: 0x1FFC0): found "DSDT" size 0x1403
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/vmgenid_guid" at 0x7FE5C000 (remaining: 0x1000): not found; marking fw_cfg blob as opaque
>> 
>> This is where the OVMF SDT Header Probe Suppressor does its job. (NB,
>> the "opaque marking" has happened already in ProcessCmdWritePointer()
>> too, above.)
>> 
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC443 (remaining: 0x1EBBD): found "FACP" size 0x74
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC4B7 (remaining: 0x1EB49): found "APIC" size 0x80
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC537 (remaining: 0x1EAC9): found "SSDT" size 0xCA
>>> Process2ndPassCmdAddPointer: checking for ACPI header in "etc/acpi/tables" at 0x7E7AC601 (remaining: 0x1E9FF): found "RSDT" size 0x30
>>> TransferS3ContextToBootScript: boot script fragment saved, ScratchBuffer=7FE4F018
>> 
>> This is where the WRITE_POINTER commands, stashed earlier in condensed
>> form, are translated to S3 Boot Script opcodes.
>> 
>>> InstallQemuFwCfgTables: installed 5 tables
>> 
>> Such as: FACS, DSDT, FACP, APIC, SSDT. OVMF recognizes RSDT and ignores
>> it (it's handled by edk2 automatically).
>> 
>>> InstallQemuFwCfgTables: freeing "etc/acpi/rsdp"
>>> InstallQemuFwCfgTables: freeing "etc/acpi/tables"
>> 
>> OVMF sees that the above two blobs have not been marked as "opaque" --
>> they only contained ACPI tables, judged from the ADD_POINTER commands
>> that pointed into them. So these two blobs are freed.
>> 
>> Note that "etc/vmgenid_guid" is not freed.
>> 
>> So, from the firmware log, everything looks OK.
>> 
>> (3) I dumped the SSDT in the RHEL-7 guest:
>> 
>>> /*
>>> * Intel ACPI Component Architecture
>>> * AML/ASL+ Disassembler version 20160527-64
>>> * Copyright (c) 2000 - 2016 Intel Corporation
>>> *
>>> * Disassembling to symbolic ASL+ operators
>>> *
>>> * Disassembly of ssdt.dat, Wed Feb 15 19:21:11 2017
>>> *
>>> * Original Table Header:
>>> *     Signature        "SSDT"
>>> *     Length           0x000000CA (202)
>>> *     Revision         0x01
>>> *     Checksum         0x1D
>>> *     OEM ID           "BOCHS "
>>> *     OEM Table ID     "VMGENID"
>>> *     OEM Revision     0x00000001 (1)
>>> *     Compiler ID      "BXPC"
>>> *     Compiler Version 0x00000001 (1)
>>> */
>>> DefinitionBlock ("", "SSDT", 1, "BOCHS ", "VMGENID", 0x00000001)
>>> {
>>>    Name (VGIA, 0x7FE5C000)
>> 
>> Note that the value matches the value logged by the firmware in (2).
>> 
>>>    Scope (\_SB)
>>>    {
>>>        Device (VGEN)
>>>        {
>>>            Name (_HID, "QEMUVGID")  // _HID: Hardware ID
>>>            Name (_CID, "VM_Gen_Counter")  // _CID: Compatible ID
>>>            Name (_DDN, "VM_Gen_Counter")  // _DDN: DOS Device Name
>>>            Method (_STA, 0, NotSerialized)  // _STA: Status
>>>            {
>>>                Local0 = 0x0F
>>>                If (VGIA == Zero)
>>>                {
>>>                    Local0 = Zero
>>>                }
>>> 
>>>                Return (Local0)
>>>            }
>>> 
>>>            Method (ADDR, 0, NotSerialized)
>>>            {
>>>                Local0 = Package (0x02) {}
>>>                Local0 [Zero] = (VGIA + 0x28)
>>>                Local0 [One] = Zero
>>>                Return (Local0)
>>>            }
>>>        }
>>>    }
>>> 
>>>    Method (\_GPE._E05, 0, NotSerialized)  // _Exx: Edge-Triggered GPE
>>>    {
>>>        Notify (\_SB.VGEN, 0x80) // Status Change
>>>    }
>>> }
>> 
>> Looks good and matches the documentation.
>> 
>> (4) To be sure, I checked the address against the guest dmesg, which
>> contains a dump of the UEFI memory map:
>> 
>>> [    0.000000] efi: mem52: type=10, attr=0xf, range=[0x000000007fe5a000-0x000000007fe5e000) (0MB)
>> 
>> The page (4096 bytes) at 0x7FE5C000 falls into this range. Type=10 means
>> EfiACPIMemoryNVS.
>> 
>> (5) At this point I dumped the guest RAM with the dump-guest-memory
>> monitor command, opened it with "crash", and listed it:
>> 
>>> crash> rd -p -8 0x7FE5C000 0x40
>>>        7fe5c000:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
>>>        7fe5c010:  00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00   ................
>>>        7fe5c020:  00 00 00 00 00 00 00 00 33 22 11 00 55 44 77 66   ........3"..UDwf
>>>        7fe5c030:  88 99 aa bb cc dd ee ff 00 00 00 00 00 00 00 00   ................
>> 
>> We can see that the GUID starts at 0x7FE5C000 + 0x28, and also that the
>> byte-level representation matches the little endian one given in (1).
>> 
>> This proves that the initial blob download worked fine.
>> 
>> (6) Here I attached "gdb" to QEMU, set a breakpoint on
>> vmgenid_handle_reset(), allowed the inferior process to continue
>> execution.
>> 
>> Then I suspended and resumed the guest (ACPI S3). The breakpoint was hit
>> during resume:
>> 
>>> Breakpoint 1, vmgenid_handle_reset (opaque=0x7f2bd03c36e0) at .../hw/acpi/vmgenid.c:205
>>> 205         VmGenIdState *vms = VMGENID(opaque);
>> 
>> First of all, before allowing QEMU to zero out the address blob, I
>> listed the address and the contents of the address blob (here exploiting
>> that my host is also little endian):
>> 
>>> (gdb) print (void*)vms->vmgenid_addr_le
>>> $2 = (void *) 0x7f2bd03c37b0
>> 
>>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
>>> $4 = 0x7fe5c000
>> 
>> This proves that QEMU has the right address, matching the firmware log
>> from (2), and the ACPI dump from (3).
>> 
>> (7) At this point I allowed the inferior to proceed a bit:
>> 
>>> (gdb) n
>>> 207         memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le));
>>> (gdb) n
>>> 208     }
>> 
>> I verified that the blob was zeroed:
>> 
>>> (gdb) print /x *(uint64_t*)vms->vmgenid_addr_le
>>> $5 = 0x0
>> 
>> then allowed the inferior to run free.
>> 
>>> (gdb) cont
>>> Continuing.
>> 
>> (8) New messages appeared in the firmware log:
>> 
>>> S3ResumeExecuteBootScript()
>>> PeiS3ResumeState - 7FF92B18
>>> transfer control to Standalone Boot Script Executor
>>> S3BootScriptExecute:
>>> TableHeader - 0x7E7A7000
>>> TableHeader.Version - 0x0001
>>> TableHeader.TableLength - 0x000000ED
>>> ExecuteBootScript - 7E7A700D
>>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
>>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000010, 0x00000000
>> 
>> Here the ACPI S3 Boot Script, prepared in
>> TransferS3ContextToBootScript() -- see (2) -- creates a DMA access
>> command for fw_cfg. The DMA access command is written to pre-reserved
>> memory (see "ScratchBuffer" above).
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x2B)
>> 
>> The fw_cfg selector is 0x2B. (See under (2).)
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x0C)
>> 
>> This is a combined select+skip operation.
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x00)
>> 
>> The skip size is 0 bytes.
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F025 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F026 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x00)
>> 
>> The address is irrelevant for skip, so it's just nuleld.
>> 
>>> ExecuteBootScript - 7E7A7030
>>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
>>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
>>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
>>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
>> 
>> The Boot Script passes the DMA command to QEMU, by writing the address
>> of the command buffer to IO ports 0x514 and 0x518, in BE byte order.
>> 
>>> ExecuteBootScript - 7E7A704B
>>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
>>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
>>> S3BootScriptWidthUint32 - 0x7FE4F018
>>> ExecuteBootScript - 7E7A7072
>> 
>> This waits until the DMA command succeeds (reading back the Control
>> field continuously until it reads as zero).
>> 
>>> EFI_BOOT_SCRIPT_MEM_WRITE_OPCODE
>>> BootScriptExecuteMemoryWrite - 0x7FE4F018, 0x00000018, 0x00000000
>> 
>> This is another DMA access command for fw_cfg, prepared in the same
>> pre-reserved buffer. This time
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F018 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F019 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01A (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01B (0x10)
>> 
>> we request a write operation,
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F01C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F01F (0x08)
>> 
>> with a length of 8 bytes (big endian), matching the pointer size,
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F020 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F021 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F022 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F023 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F024 (0x7F)
>>> S3BootScriptWidthUint8 - 0x7FE4F025 (0xE4)
>>> S3BootScriptWidthUint8 - 0x7FE4F026 (0xF0)
>>> S3BootScriptWidthUint8 - 0x7FE4F027 (0x28)
>> 
>> the data to transfer is located at 0x7FE4F028 (just below, tacked to the
>> command buffer itself),
>> 
>>> S3BootScriptWidthUint8 - 0x7FE4F028 (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F029 (0xC0)
>>> S3BootScriptWidthUint8 - 0x7FE4F02A (0xE5)
>>> S3BootScriptWidthUint8 - 0x7FE4F02B (0x7F)
>>> S3BootScriptWidthUint8 - 0x7FE4F02C (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02D (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02E (0x00)
>>> S3BootScriptWidthUint8 - 0x7FE4F02F (0x00)
>> 
>> and the data to write is the original allocation address of the blob
>> (0x7fe5c000).
>> 
>>> ExecuteBootScript - 7E7A709D
>>> EFI_BOOT_SCRIPT_IO_WRITE_OPCODE
>>> BootScriptExecuteIoWrite - 0x00000514, 0x00000002, 0x00000002
>>> S3BootScriptWidthUint32 - 0x00000514 (0x00000000)
>>> S3BootScriptWidthUint32 - 0x00000518 (0x18F0E47F)
>>> ExecuteBootScript - 7E7A70B8
>>> EFI_BOOT_SCRIPT_MEM_POLL_OPCODE
>>> BootScriptExecuteMemPoll - 0x7FE4F018, 0x00000000FFFFFFFF, 0x0000000000000000
>>> S3BootScriptWidthUint32 - 0x7FE4F018
>>> ExecuteBootScript - 7E7A70DF
>> 
>> Same story as above: fire off the transfer and wait until it completes.
>> 
>>> EFI_BOOT_SCRIPT_INFORMATION_OPCODE
>>> BootScriptExecuteInformation - 0x7E7A70E6
>>> BootScriptInformation: DE AD BE EF
>>> ExecuteBootScript - 7E7A70EA
>>> S3_BOOT_SCRIPT_LIB_TERMINATE_OPCODE
>>> S3BootScriptDone - Success
>>> [...]
>> 
>> The DEADBEEF informational (no-op) opcode is something that OVMF appends
>> to the very end for hysterical raisins.
>> 
>> (9) Okay, so the guest is now resumed and running, let's interrupt it in
>> gdb again, and check the contents of address blob again (we know the
>> address of the address blob from step (6)):
>> 
>>> ^C
>>> Program received signal SIGINT, Interrupt.
>>> 0x00007f2bbf1d1ebf in ppoll () from /lib64/libc.so.6
>>> (gdb) print /x *(uint64_t*)0x7f2bd03c37b0
>>> $6 = 0x7fe5c000
>> 
>> Et voila.
>> 
>> (10) I detached gdb from QEMU, and issued the following monitor command:
>> 
>>> $ virsh qemu-monitor-command ovmf.rhel7 --hmp 'info vm-generation-id'
>>> 00112233-4455-6677-8899-aabbccddeeff
>> 
>> (11) I also booted a Windows Server 2012 R2 guest (Q35, broadcast SMI
>> enabled) with a similar vmgenid device/parameter. According to Device
>> Manager | System devices, "Microsoft Hyper-V Generation Counter" is
>> working properly.
>> 
>> I also tested S3 briefly, it worked okay. (I mentioned the SMI broadcast
>> above because for that, OVMF generates an independent S3 Boot Script
>> fragment.)
>> 
>> 
>> I'll let someone else test live migration.
>> 
>> For patches #1, #3, #4 and #5:
>> 
>> Tested-by: Laszlo Ersek <lersek@redhat.com>
>> 
>> I'll soon post the OVMF patches.
>> 
>> Thanks!
>> Laszlo
> 
> 
> How do you feel about Igor's request to change WRITE_POINTER to add
> offset in there, so guest can pass in the address of GUID and
> not start of table? Would that be a lot of work to add?
> 
I know you’re asking Laszlo, but hopefully the answer is “OK”.  I’m finished making those changes to QEMU and am coding up SeaBIOS right now. 
> -- 
> MST

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Laszlo Ersek 7 years, 1 month ago
On 02/15/17 21:09, Michael S. Tsirkin wrote:
> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:

[snip]

>> For patches #1, #3, #4 and #5:
>>
>> Tested-by: Laszlo Ersek <lersek@redhat.com>
>>
>> I'll soon post the OVMF patches.
>>
>> Thanks!
>> Laszlo
> 
> 
> How do you feel about Igor's request to change WRITE_POINTER to add
> offset in there, so guest can pass in the address of GUID and
> not start of table? Would that be a lot of work to add?

I think it's doable in practice: simply add a constant from the command
itself, for passing the value back to QEMU, and also for saving the
fw_cfg write commend for S3 resume time.

But, I disagree with it from a design POV.

Igor's point is:

> Math complicates QEMU code though and not only QMEMU but AML code as
> well.

As I understand it, the goal is to push the addition to the firmware
(which is "one place"), rather than having to implement it twice in
QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).

Here's my counter-argument:

(a) As I mentioned earlier, assume a complex communication structure
between the guest OS and QEMU. Currently our shared structure consists
of a single field (the GUID), but next time it might contain several fields.

For such a multi-field shared structure, QEMU will have to do manual
offsetting into the guest RAM anyway, for accessing fields F1, F2, and
F3. We will not create three separate WRITE_POINTER commands and let the
firmware calculate and return the absolute GPAs of the fields F1, F2 and
F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
the offsetting manually, minimally for fields F2 and F3.

"src_offset" looks tempting now only because we have a shared structure
with only one field, the GUID at offset 40 decimal.

(b) Regarding the runtime addition in the AML code:

As discussed before, the main reason *now*, for not pointing VGIA (and
other named integer objects) with ADD_POINTER commands directly to
"meaningful" fields, is that OVMF probes the targets of ADD_POINTER
commands for patterns that look like ACPI table headers. And, for the
time being, we want to suppress any mis-recognitions by prepending some
padding.

Igor was right to dislike this, and we agreed that *down the road* we
should add allocation flags, or further allocation commands, to supplant
this kind of heuristics in OVMF. But:

- we don't have time to do it now, plus

- please observe that the runtime addition in AML relates to the
  ADD_POINTER and the ALLOCATE commands. It does not relate to
  WRITE_POINTER at all.

  Whatever we change on WRITE_POINTER will do nothing for suppressing
  OVMF's table header probing -- because that is tied to ADD_POINTER
  --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
  in AML.


In summary, I think the proposed WRITE_POINTER modification is
implementable, but I think it will not pay off, because:

(a) for QEMU logic, it will not prove useful as soon as we have a
multi-field shared structure (QEMU will have to add field offsets anyway),

(b) and for eliminating the AML addition (which is a consequence of the
current ADD_POINTER handling in OVMF), it does nothing.

Thanks
Laszlo

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Ben Warren 7 years, 1 month ago
> On Feb 15, 2017, at 12:52 PM, Laszlo Ersek <lersek@redhat.com> wrote:
> 
> On 02/15/17 21:09, Michael S. Tsirkin wrote:
>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
> 
> [snip]
> 
>>> For patches #1, #3, #4 and #5:
>>> 
>>> Tested-by: Laszlo Ersek <lersek@redhat.com>
>>> 
>>> I'll soon post the OVMF patches.
>>> 
>>> Thanks!
>>> Laszlo
>> 
>> 
>> How do you feel about Igor's request to change WRITE_POINTER to add
>> offset in there, so guest can pass in the address of GUID and
>> not start of table? Would that be a lot of work to add?
> 
> I think it's doable in practice: simply add a constant from the command
> itself, for passing the value back to QEMU, and also for saving the
> fw_cfg write commend for S3 resume time.
> 
> But, I disagree with it from a design POV.
> 
> Igor's point is:
> 
>> Math complicates QEMU code though and not only QMEMU but AML code as
>> well.
> 
> As I understand it, the goal is to push the addition to the firmware
> (which is "one place"), rather than having to implement it twice in
> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
> 
> Here's my counter-argument:
> 
> (a) As I mentioned earlier, assume a complex communication structure
> between the guest OS and QEMU. Currently our shared structure consists
> of a single field (the GUID), but next time it might contain several fields.
> 
> For such a multi-field shared structure, QEMU will have to do manual
> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
> F3. We will not create three separate WRITE_POINTER commands and let the
> firmware calculate and return the absolute GPAs of the fields F1, F2 and
> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
> the offsetting manually, minimally for fields F2 and F3.
> 
> "src_offset" looks tempting now only because we have a shared structure
> with only one field, the GUID at offset 40 decimal.
> 
> (b) Regarding the runtime addition in the AML code:
> 
> As discussed before, the main reason *now*, for not pointing VGIA (and
> other named integer objects) with ADD_POINTER commands directly to
> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
> commands for patterns that look like ACPI table headers. And, for the
> time being, we want to suppress any mis-recognitions by prepending some
> padding.
> 
> Igor was right to dislike this, and we agreed that *down the road* we
> should add allocation flags, or further allocation commands, to supplant
> this kind of heuristics in OVMF. But:
> 
> - we don't have time to do it now, plus
> 
> - please observe that the runtime addition in AML relates to the
>  ADD_POINTER and the ALLOCATE commands. It does not relate to
>  WRITE_POINTER at all.
> 
>  Whatever we change on WRITE_POINTER will do nothing for suppressing
>  OVMF's table header probing -- because that is tied to ADD_POINTER
>  --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
>  in AML.
> 
> 
> In summary, I think the proposed WRITE_POINTER modification is
> implementable, but I think it will not pay off, because:
> 
> (a) for QEMU logic, it will not prove useful as soon as we have a
> multi-field shared structure (QEMU will have to add field offsets anyway),
> 
> (b) and for eliminating the AML addition (which is a consequence of the
> current ADD_POINTER handling in OVMF), it does nothing.
> 
OK Laszlo, in v7 (imminent)  I went ahead and implemented this src_offset.  If you are truly dead-set against it, it’s not very hard to remove.  To me it seems pretty harmless.

> Thanks
> Laszlo

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Laszlo Ersek 7 years, 1 month ago
On 02/16/17 07:10, Ben Warren wrote:
> 
>> On Feb 15, 2017, at 12:52 PM, Laszlo Ersek <lersek@redhat.com> wrote:
>>
>> On 02/15/17 21:09, Michael S. Tsirkin wrote:
>>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:
>>
>> [snip]
>>
>>>> For patches #1, #3, #4 and #5:
>>>>
>>>> Tested-by: Laszlo Ersek <lersek@redhat.com>
>>>>
>>>> I'll soon post the OVMF patches.
>>>>
>>>> Thanks!
>>>> Laszlo
>>>
>>>
>>> How do you feel about Igor's request to change WRITE_POINTER to add
>>> offset in there, so guest can pass in the address of GUID and
>>> not start of table? Would that be a lot of work to add?
>>
>> I think it's doable in practice: simply add a constant from the command
>> itself, for passing the value back to QEMU, and also for saving the
>> fw_cfg write commend for S3 resume time.
>>
>> But, I disagree with it from a design POV.
>>
>> Igor's point is:
>>
>>> Math complicates QEMU code though and not only QMEMU but AML code as
>>> well.
>>
>> As I understand it, the goal is to push the addition to the firmware
>> (which is "one place"), rather than having to implement it twice in
>> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
>>
>> Here's my counter-argument:
>>
>> (a) As I mentioned earlier, assume a complex communication structure
>> between the guest OS and QEMU. Currently our shared structure consists
>> of a single field (the GUID), but next time it might contain several fields.
>>
>> For such a multi-field shared structure, QEMU will have to do manual
>> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
>> F3. We will not create three separate WRITE_POINTER commands and let the
>> firmware calculate and return the absolute GPAs of the fields F1, F2 and
>> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
>> the offsetting manually, minimally for fields F2 and F3.
>>
>> "src_offset" looks tempting now only because we have a shared structure
>> with only one field, the GUID at offset 40 decimal.
>>
>> (b) Regarding the runtime addition in the AML code:
>>
>> As discussed before, the main reason *now*, for not pointing VGIA (and
>> other named integer objects) with ADD_POINTER commands directly to
>> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
>> commands for patterns that look like ACPI table headers. And, for the
>> time being, we want to suppress any mis-recognitions by prepending some
>> padding.
>>
>> Igor was right to dislike this, and we agreed that *down the road* we
>> should add allocation flags, or further allocation commands, to supplant
>> this kind of heuristics in OVMF. But:
>>
>> - we don't have time to do it now, plus
>>
>> - please observe that the runtime addition in AML relates to the
>>  ADD_POINTER and the ALLOCATE commands. It does not relate to
>>  WRITE_POINTER at all.
>>
>>  Whatever we change on WRITE_POINTER will do nothing for suppressing
>>  OVMF's table header probing -- because that is tied to ADD_POINTER
>>  --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
>>  in AML.
>>
>>
>> In summary, I think the proposed WRITE_POINTER modification is
>> implementable, but I think it will not pay off, because:
>>
>> (a) for QEMU logic, it will not prove useful as soon as we have a
>> multi-field shared structure (QEMU will have to add field offsets anyway),
>>
>> (b) and for eliminating the AML addition (which is a consequence of the
>> current ADD_POINTER handling in OVMF), it does nothing.
>>
> OK Laszlo, in v7 (imminent)  I went ahead and implemented this
> src_offset.  If you are truly dead-set against it, it’s not very hard
> to remove.  To me it seems pretty harmless.

I'd like to hear Igor's opinion about my counter-argument above.

It would be a fair bit of work in OVMF:
- it introduces a new addition which has to be range checked (the sum
  cannot overflow either the pointer object or the size of the
  pointed-to blob),
- if the range check fails, that's a new error condition that needs
  handling,
- in edk2 we document error conditions carefully, so it affects
  documentation too, for the function that processes WRITE_POINTER,
- I'd have to redo all the testing.

After I post the OVMF patches, we can count on about one week of review
time (assuming I don't have to post a v2, that is!). That comes pretty
close to the end of February, which is when I want to do a downstream
OVMF rebase. (I can't do it in March due to a conference, and then it's
too late.) The downstream rebase will take a few days too. I would have
preferred to post the OVMF patches near the beginning of this week.

So, as I said, it is implementable, I'm not "dead-set" against it, but
the price for me to pay is definitely not zero. And, as it stands, I
disagree with the design -- there don't seem to be any general benefits.

I'd like to hear back from Igor. If he convincingly refutes my argument,
I'm game.

Thanks,
Laszlo

Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Igor Mammedov 7 years, 1 month ago
On Wed, 15 Feb 2017 21:52:55 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 02/15/17 21:09, Michael S. Tsirkin wrote:
> > On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:  
> 
> [snip]
> 
> >> For patches #1, #3, #4 and #5:
> >>
> >> Tested-by: Laszlo Ersek <lersek@redhat.com>
> >>
> >> I'll soon post the OVMF patches.
> >>
> >> Thanks!
> >> Laszlo  
> > 
> > 
> > How do you feel about Igor's request to change WRITE_POINTER to add
> > offset in there, so guest can pass in the address of GUID and
> > not start of table? Would that be a lot of work to add?  
> 
> I think it's doable in practice: simply add a constant from the command
> itself, for passing the value back to QEMU, and also for saving the
> fw_cfg write commend for S3 resume time.
> 
> But, I disagree with it from a design POV.
> 
> Igor's point is:
> 
> > Math complicates QEMU code though and not only QMEMU but AML code as
> > well.  
> 
> As I understand it, the goal is to push the addition to the firmware
> (which is "one place"), rather than having to implement it twice in
> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
> 
> Here's my counter-argument:
> 
> (a) As I mentioned earlier, assume a complex communication structure
> between the guest OS and QEMU. Currently our shared structure consists
> of a single field (the GUID), but next time it might contain several fields.
> 
> For such a multi-field shared structure, QEMU will have to do manual
> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
> F3. We will not create three separate WRITE_POINTER commands and let the
> firmware calculate and return the absolute GPAs of the fields F1, F2 and
> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
> the offsetting manually, minimally for fields F2 and F3.
> 
> "src_offset" looks tempting now only because we have a shared structure
> with only one field, the GUID at offset 40 decimal.

benefits of having src_offset from QEMU POV I see are:
 a) (biggest one) firmware and device code are clearly separated where:
     - VMGENID_GUID_OFFSET would be used only by firmware side, such as:
         WRITE_POINTER and AML addition to help OVMF detect non ACPI blob
     - device doesn't have to assume/or have a knowledge about
       layout of GUID blob except of size of data it's needs
       to access at location provided by WRITE_POINTER as v7 shows it.

 b) wrt shared blob I've envisioned slightly different approach,
    where multiple WRITE_POINTER commands are used instead of one
    with following workflow to extend shared blob:
     1) firmware part of QEMU (acpi-build.c):
          if (device_foo_present) {
             fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage)

             shared_off = device_foo->align(next_free_shared_offset)
             WRITE_POINTER('/etc/device_foo_addr', 0,
                           '/etc/shared_blob, shared_off)

             next_free_shared_offset = shared_off + device_foo->data_size;
          }
     2) device_foo accesses data at device_foo->addr_storage directly
        * there is no need to spread knowledge of shared_blob
          layout to device code anymore.
        * no need to care where in shared_blob data will be placed,
        * shared space is used only when device is present
        * since there is no shared_writeback_blob, there isn't 
          need in mechanism to propagate written data to device
          or notify device about write
     
   drawback in this approach is that a device would consume
   a file slot if fw_cfg and space for WRITE_POINTER in
   linker file when present.

 
> (b) Regarding the runtime addition in the AML code:
as you pointed out WRITE_POINTER has nothing to do with addition
on AML side which is influenced by ADD_POINTER and OVMF and could
be fixed with flags down the road, so there is nothing to argue
about on this bullet.


> As discussed before, the main reason *now*, for not pointing VGIA (and
> other named integer objects) with ADD_POINTER commands directly to
> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
> commands for patterns that look like ACPI table headers. And, for the
> time being, we want to suppress any mis-recognitions by prepending some
> padding.
> 
> Igor was right to dislike this, and we agreed that *down the road* we
> should add allocation flags, or further allocation commands, to supplant
> this kind of heuristics in OVMF. But:
> 
> - we don't have time to do it now, plus
> 
> - please observe that the runtime addition in AML relates to the
>   ADD_POINTER and the ALLOCATE commands. It does not relate to
>   WRITE_POINTER at all.
> 
>   Whatever we change on WRITE_POINTER will do nothing for suppressing
>   OVMF's table header probing -- because that is tied to ADD_POINTER
>   --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
>   in AML.
> 
> 
> In summary, I think the proposed WRITE_POINTER modification is
> implementable, but I think it will not pay off, because:
> 
> (a) for QEMU logic, it will not prove useful as soon as we have a
> multi-field shared structure (QEMU will have to add field offsets anyway),
> 
> (b) and for eliminating the AML addition (which is a consequence of the
> current ADD_POINTER handling in OVMF), it does nothing.
> 
> Thanks
> Laszlo


Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Laszlo Ersek 7 years, 1 month ago
On 02/16/17 13:08, Igor Mammedov wrote:
> On Wed, 15 Feb 2017 21:52:55 +0100
> Laszlo Ersek <lersek@redhat.com> wrote:
> 
>> On 02/15/17 21:09, Michael S. Tsirkin wrote:
>>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:  
>>
>> [snip]
>>
>>>> For patches #1, #3, #4 and #5:
>>>>
>>>> Tested-by: Laszlo Ersek <lersek@redhat.com>
>>>>
>>>> I'll soon post the OVMF patches.
>>>>
>>>> Thanks!
>>>> Laszlo  
>>>
>>>
>>> How do you feel about Igor's request to change WRITE_POINTER to add
>>> offset in there, so guest can pass in the address of GUID and
>>> not start of table? Would that be a lot of work to add?  
>>
>> I think it's doable in practice: simply add a constant from the command
>> itself, for passing the value back to QEMU, and also for saving the
>> fw_cfg write commend for S3 resume time.
>>
>> But, I disagree with it from a design POV.
>>
>> Igor's point is:
>>
>>> Math complicates QEMU code though and not only QMEMU but AML code as
>>> well.  
>>
>> As I understand it, the goal is to push the addition to the firmware
>> (which is "one place"), rather than having to implement it twice in
>> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
>>
>> Here's my counter-argument:
>>
>> (a) As I mentioned earlier, assume a complex communication structure
>> between the guest OS and QEMU. Currently our shared structure consists
>> of a single field (the GUID), but next time it might contain several fields.
>>
>> For such a multi-field shared structure, QEMU will have to do manual
>> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
>> F3. We will not create three separate WRITE_POINTER commands and let the
>> firmware calculate and return the absolute GPAs of the fields F1, F2 and
>> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
>> the offsetting manually, minimally for fields F2 and F3.
>>
>> "src_offset" looks tempting now only because we have a shared structure
>> with only one field, the GUID at offset 40 decimal.
> 
> benefits of having src_offset from QEMU POV I see are:
>  a) (biggest one) firmware and device code are clearly separated where:
>      - VMGENID_GUID_OFFSET would be used only by firmware side, such as:
>          WRITE_POINTER and AML addition to help OVMF detect non ACPI blob
>      - device doesn't have to assume/or have a knowledge about
>        layout of GUID blob except of size of data it's needs
>        to access at location provided by WRITE_POINTER as v7 shows it.
> 
>  b) wrt shared blob I've envisioned slightly different approach,
>     where multiple WRITE_POINTER commands are used instead of one
>     with following workflow to extend shared blob:
>      1) firmware part of QEMU (acpi-build.c):
>           if (device_foo_present) {
>              fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage)
> 
>              shared_off = device_foo->align(next_free_shared_offset)
>              WRITE_POINTER('/etc/device_foo_addr', 0,
>                            '/etc/shared_blob, shared_off)
> 
>              next_free_shared_offset = shared_off + device_foo->data_size;
>           }
>      2) device_foo accesses data at device_foo->addr_storage directly
>         * there is no need to spread knowledge of shared_blob
>           layout to device code anymore.

This is where I disagree, I think. Above you mention
device_foo->data_size. If "data_size" covers multiple fields, then the
device code itself will have to add relative offsets, for accessing
those different fields in guest RAM.

With the current command, the only difference is that the device code
has to receive a "base offset" from the outside, pointing into the
shared blob, and then add the field offsets to that. Thus the addition
cannot be avoided anyway.

You do have a point about sharing the same area between different
devices though. The above pseudo-code looks like a good pattern. This
way "acpi-build.c" won't have to hand out the shared blob offsets to
existing device instances directly; instead, the blob offsets are handed
down to the firmware, and the devices will get their struct base
addresses from the firmware. Using one WRITE_POINTER command for that,
per device, seems fine.

I'll update the OVMF patches.

Thanks
Laszlo

>         * no need to care where in shared_blob data will be placed,
>         * shared space is used only when device is present
>         * since there is no shared_writeback_blob, there isn't 
>           need in mechanism to propagate written data to device
>           or notify device about write
>      
>    drawback in this approach is that a device would consume
>    a file slot if fw_cfg and space for WRITE_POINTER in
>    linker file when present.
> 
>  
>> (b) Regarding the runtime addition in the AML code:
> as you pointed out WRITE_POINTER has nothing to do with addition
> on AML side which is influenced by ADD_POINTER and OVMF and could
> be fixed with flags down the road, so there is nothing to argue
> about on this bullet.
> 
> 
>> As discussed before, the main reason *now*, for not pointing VGIA (and
>> other named integer objects) with ADD_POINTER commands directly to
>> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
>> commands for patterns that look like ACPI table headers. And, for the
>> time being, we want to suppress any mis-recognitions by prepending some
>> padding.
>>
>> Igor was right to dislike this, and we agreed that *down the road* we
>> should add allocation flags, or further allocation commands, to supplant
>> this kind of heuristics in OVMF. But:
>>
>> - we don't have time to do it now, plus
>>
>> - please observe that the runtime addition in AML relates to the
>>   ADD_POINTER and the ALLOCATE commands. It does not relate to
>>   WRITE_POINTER at all.
>>
>>   Whatever we change on WRITE_POINTER will do nothing for suppressing
>>   OVMF's table header probing -- because that is tied to ADD_POINTER
>>   --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
>>   in AML.
>>
>>
>> In summary, I think the proposed WRITE_POINTER modification is
>> implementable, but I think it will not pay off, because:
>>
>> (a) for QEMU logic, it will not prove useful as soon as we have a
>> multi-field shared structure (QEMU will have to add field offsets anyway),
>>
>> (b) and for eliminating the AML addition (which is a consequence of the
>> current ADD_POINTER handling in OVMF), it does nothing.
>>
>> Thanks
>> Laszlo
> 


Re: [Qemu-devel] [PATCH v6 0/7] Add support for VM Generation ID
Posted by Igor Mammedov 7 years, 1 month ago
On Thu, 16 Feb 2017 14:29:28 +0100
Laszlo Ersek <lersek@redhat.com> wrote:

> On 02/16/17 13:08, Igor Mammedov wrote:
> > On Wed, 15 Feb 2017 21:52:55 +0100
> > Laszlo Ersek <lersek@redhat.com> wrote:
> >   
> >> On 02/15/17 21:09, Michael S. Tsirkin wrote:  
> >>> On Wed, Feb 15, 2017 at 08:47:48PM +0100, Laszlo Ersek wrote:    
> >>
> >> [snip]
> >>  
> >>>> For patches #1, #3, #4 and #5:
> >>>>
> >>>> Tested-by: Laszlo Ersek <lersek@redhat.com>
> >>>>
> >>>> I'll soon post the OVMF patches.
> >>>>
> >>>> Thanks!
> >>>> Laszlo    
> >>>
> >>>
> >>> How do you feel about Igor's request to change WRITE_POINTER to add
> >>> offset in there, so guest can pass in the address of GUID and
> >>> not start of table? Would that be a lot of work to add?    
> >>
> >> I think it's doable in practice: simply add a constant from the command
> >> itself, for passing the value back to QEMU, and also for saving the
> >> fw_cfg write commend for S3 resume time.
> >>
> >> But, I disagree with it from a design POV.
> >>
> >> Igor's point is:
> >>  
> >>> Math complicates QEMU code though and not only QMEMU but AML code as
> >>> well.    
> >>
> >> As I understand it, the goal is to push the addition to the firmware
> >> (which is "one place"), rather than having to implement it twice in
> >> QEMU, i.e., in two places ((a) native QEMU logic, (b) AML generator).
> >>
> >> Here's my counter-argument:
> >>
> >> (a) As I mentioned earlier, assume a complex communication structure
> >> between the guest OS and QEMU. Currently our shared structure consists
> >> of a single field (the GUID), but next time it might contain several fields.
> >>
> >> For such a multi-field shared structure, QEMU will have to do manual
> >> offsetting into the guest RAM anyway, for accessing fields F1, F2, and
> >> F3. We will not create three separate WRITE_POINTER commands and let the
> >> firmware calculate and return the absolute GPAs of the fields F1, F2 and
> >> F3. Instead, there will be one WRITE_POINTER command, and QEMU will do
> >> the offsetting manually, minimally for fields F2 and F3.
> >>
> >> "src_offset" looks tempting now only because we have a shared structure
> >> with only one field, the GUID at offset 40 decimal.  
> > 
> > benefits of having src_offset from QEMU POV I see are:
> >  a) (biggest one) firmware and device code are clearly separated where:
> >      - VMGENID_GUID_OFFSET would be used only by firmware side, such as:
> >          WRITE_POINTER and AML addition to help OVMF detect non ACPI blob
> >      - device doesn't have to assume/or have a knowledge about
> >        layout of GUID blob except of size of data it's needs
> >        to access at location provided by WRITE_POINTER as v7 shows it.
> > 
> >  b) wrt shared blob I've envisioned slightly different approach,
> >     where multiple WRITE_POINTER commands are used instead of one
> >     with following workflow to extend shared blob:
> >      1) firmware part of QEMU (acpi-build.c):
> >           if (device_foo_present) {
> >              fw_cfg_add_file_callback('/etc/device_foo_addr', device_foo->addr_storage)
> > 
> >              shared_off = device_foo->align(next_free_shared_offset)
> >              WRITE_POINTER('/etc/device_foo_addr', 0,
> >                            '/etc/shared_blob, shared_off)
> > 
> >              next_free_shared_offset = shared_off + device_foo->data_size;
> >           }
> >      2) device_foo accesses data at device_foo->addr_storage directly
> >         * there is no need to spread knowledge of shared_blob
> >           layout to device code anymore.  
> 
> This is where I disagree, I think. Above you mention
> device_foo->data_size. If "data_size" covers multiple fields, then the
> device code itself will have to add relative offsets, for accessing
> those different fields in guest RAM.
> 
> With the current command, the only difference is that the device code
> has to receive a "base offset" from the outside, pointing into the
> shared blob, and then add the field offsets to that. Thus the addition
> cannot be avoided anyway.
instead of explicit offsets for base offset, device will probably use
struct cast to access fields.


> You do have a point about sharing the same area between different
> devices though. The above pseudo-code looks like a good pattern. This
> way "acpi-build.c" won't have to hand out the shared blob offsets to
> existing device instances directly; instead, the blob offsets are handed
> down to the firmware, and the devices will get their struct base
> addresses from the firmware. Using one WRITE_POINTER command for that,
> per device, seems fine.
> 
> I'll update the OVMF patches.
Thanks!

> 
> Thanks
> Laszlo
> 
> >         * no need to care where in shared_blob data will be placed,
> >         * shared space is used only when device is present
> >         * since there is no shared_writeback_blob, there isn't 
> >           need in mechanism to propagate written data to device
> >           or notify device about write
> >      
> >    drawback in this approach is that a device would consume
> >    a file slot if fw_cfg and space for WRITE_POINTER in
> >    linker file when present.
> > 
> >    
> >> (b) Regarding the runtime addition in the AML code:  
> > as you pointed out WRITE_POINTER has nothing to do with addition
> > on AML side which is influenced by ADD_POINTER and OVMF and could
> > be fixed with flags down the road, so there is nothing to argue
> > about on this bullet.
> > 
> >   
> >> As discussed before, the main reason *now*, for not pointing VGIA (and
> >> other named integer objects) with ADD_POINTER commands directly to
> >> "meaningful" fields, is that OVMF probes the targets of ADD_POINTER
> >> commands for patterns that look like ACPI table headers. And, for the
> >> time being, we want to suppress any mis-recognitions by prepending some
> >> padding.
> >>
> >> Igor was right to dislike this, and we agreed that *down the road* we
> >> should add allocation flags, or further allocation commands, to supplant
> >> this kind of heuristics in OVMF. But:
> >>
> >> - we don't have time to do it now, plus
> >>
> >> - please observe that the runtime addition in AML relates to the
> >>   ADD_POINTER and the ALLOCATE commands. It does not relate to
> >>   WRITE_POINTER at all.
> >>
> >>   Whatever we change on WRITE_POINTER will do nothing for suppressing
> >>   OVMF's table header probing -- because that is tied to ADD_POINTER
> >>   --, therefore WRITE_POINTER tweaks cannot eliminate the "need to add"
> >>   in AML.
> >>
> >>
> >> In summary, I think the proposed WRITE_POINTER modification is
> >> implementable, but I think it will not pay off, because:
> >>
> >> (a) for QEMU logic, it will not prove useful as soon as we have a
> >> multi-field shared structure (QEMU will have to add field offsets anyway),
> >>
> >> (b) and for eliminating the AML addition (which is a consequence of the
> >> current ADD_POINTER handling in OVMF), it does nothing.
> >>
> >> Thanks
> >> Laszlo  
> >   
>