[PATCH 0/1] vmx: Fix <genid/> mapping

Michal Privoznik posted 1 patch 2 years, 7 months ago
Test syntax-check failed
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/libvirt tags/patchew/cover.1632900578.git.mprivozn@redhat.com
src/vmx/vmx.c                            | 16 ++++++++++++----
tests/vmx2xmldata/esx-in-the-wild-10.xml |  2 +-
2 files changed, 13 insertions(+), 5 deletions(-)
[PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Michal Privoznik 2 years, 7 months ago
Apparently, parsing vmx.genid is not as easy as I thought. Anyway, it
was brought up in a private thread that libvirt doesn't report correct
UUIDs. For instance for the following input:

  vm.genid = "-8536691797830587195"
  vm.genidX = "-1723453263670062919"

Libvirt would report:

  <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>

while virt-v2v would report:

  <genid>09512cc5-940a-8987-b950-f54f631015e8</genid>

Another example:

  vm.genid = "-6284418052148993891"
  vm.genidX = "-649955058627554545"

Libvirt:

  <genid>a8c94313-ed9b-609d-f6fa-e5515a89530f</genid>

virt-v2v:

  <genid>ed9b609d-4313-a8c9-0f53-895a51e5faf6</genid>


To test my patch I modified tests/vmx2xmldata/esx-in-the-wild-10.vmx
(because it already contains vmx.genid, but any file can be modified
really), and then ran:

  libvirt.git/_build/tests $ VIR_TEST_DEBUG=1 VIR_TEST_RANGE=58 ./vmx2xmltest

Mind you, the fix is almost direct rewrite of virt-v2v's algorithm,
except it's optimized for C and not OCaml :-) You can find various
attempts/versions of that on my gitlab:

  https://gitlab.com/MichalPrivoznik/libvirt/-/commits/vmx_genid/

I'm sending only the last one because that looks the best IMO.

Michal Prívozník (1):
  vmx: Fix <genid/> mapping

 src/vmx/vmx.c                            | 16 ++++++++++++----
 tests/vmx2xmldata/esx-in-the-wild-10.xml |  2 +-
 2 files changed, 13 insertions(+), 5 deletions(-)

-- 
2.32.0

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Richard W.M. Jones 2 years, 7 months ago
On Wed, Sep 29, 2021 at 10:01:55AM +0200, Michal Privoznik wrote:
> Apparently, parsing vmx.genid is not as easy as I thought. Anyway, it
> was brought up in a private thread that libvirt doesn't report correct
> UUIDs. For instance for the following input:
> 
>   vm.genid = "-8536691797830587195"
>   vm.genidX = "-1723453263670062919"

(The two lines above come from a VMware .vmx file)

The only thing that really matters is what the guest sees.  I ran
VMGENID.EXE (https://bugzilla.redhat.com/show_bug.cgi?id=1598350#c3
https://docs.microsoft.com/en-gb/windows/win32/hyperv_v2/virtual-machine-generation-identifier)
inside the guest and it showed:

  8987940a09512cc5:e81510634ff550b9

Note these numbers are the hex equivalents of the VMware .vmx numbers:

>>> print("%x" % (2**64-8536691797830587195))
8987940a09512cc5
>>> print("%x" % (2**64-1723453263670062919))
e81510634ff550b9

> Libvirt would report:
> 
>   <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>
> 
> while virt-v2v would report:
> 
>   <genid>09512cc5-940a-8987-b950-f54f631015e8</genid>

After thinking about this a bit more, IMHO the real problem is
with qemu.  If you pass this to qemu:

  -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0

then inside the guest VMGENID shows 2cc509518987940a:b950f54f631015e8 (wrong)

If you pass this to qemu:

  ...guid=09512cc5-940a-8987-b950-f54f631015e8...

then inside the guest it sees 8987940a09512cc5:e81510634ff550b9
(the correct values, matching VMware).

This is the reason why virt-v2v mangles the GUID, in order to trick
libvirt into passing a mangled GUID which gets mangled again by qemu
which is the same as unmangling it.

So another way to fix this could be for us to fix qemu.  We could just
pass the two numbers to qemu instead of using GUIDs, eg:

  -device vmgenid,low=0x8987940a09512cc5,high=0xe81510634ff550b9,id=vmgenid0

and then we'd fix the implementation in qemu so that the low/high
values match what is seen by VMGENID.EXE in the guest.

Or we could have libvirt use the current GUID in <genid> but to do the
mangling when it gets passed through to qemu (instead of virt-v2v
doing the mangling).

Adding qemu-devel because I'm interesting to see if the qemu
developers would prefer to fix this properly in qemu.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Daniel P. Berrangé 2 years, 7 months ago
On Wed, Sep 29, 2021 at 10:20:44AM +0100, Richard W.M. Jones wrote:
> On Wed, Sep 29, 2021 at 10:01:55AM +0200, Michal Privoznik wrote:
> > Apparently, parsing vmx.genid is not as easy as I thought. Anyway, it
> > was brought up in a private thread that libvirt doesn't report correct
> > UUIDs. For instance for the following input:
> > 
> >   vm.genid = "-8536691797830587195"
> >   vm.genidX = "-1723453263670062919"
> 
> (The two lines above come from a VMware .vmx file)
> 
> The only thing that really matters is what the guest sees.  I ran
> VMGENID.EXE (https://bugzilla.redhat.com/show_bug.cgi?id=1598350#c3
> https://docs.microsoft.com/en-gb/windows/win32/hyperv_v2/virtual-machine-generation-identifier)
> inside the guest and it showed:
> 
>   8987940a09512cc5:e81510634ff550b9
> 
> Note these numbers are the hex equivalents of the VMware .vmx numbers:
> 
> >>> print("%x" % (2**64-8536691797830587195))
> 8987940a09512cc5
> >>> print("%x" % (2**64-1723453263670062919))
> e81510634ff550b9
> 
> > Libvirt would report:
> > 
> >   <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>
> > 
> > while virt-v2v would report:
> > 
> >   <genid>09512cc5-940a-8987-b950-f54f631015e8</genid>
> 
> After thinking about this a bit more, IMHO the real problem is
> with qemu.  If you pass this to qemu:
> 
>   -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0
> 
> then inside the guest VMGENID shows 2cc509518987940a:b950f54f631015e8 (wrong)
> 
> If you pass this to qemu:
> 
>   ...guid=09512cc5-940a-8987-b950-f54f631015e8...
> 
> then inside the guest it sees 8987940a09512cc5:e81510634ff550b9
> (the correct values, matching VMware).
> 
> This is the reason why virt-v2v mangles the GUID, in order to trick
> libvirt into passing a mangled GUID which gets mangled again by qemu
> which is the same as unmangling it.
> 
> So another way to fix this could be for us to fix qemu.  We could just
> pass the two numbers to qemu instead of using GUIDs, eg:
> 
>   -device vmgenid,low=0x8987940a09512cc5,high=0xe81510634ff550b9,id=vmgenid0
> 
> and then we'd fix the implementation in qemu so that the low/high
> values match what is seen by VMGENID.EXE in the guest.
> 
> Or we could have libvirt use the current GUID in <genid> but to do the
> mangling when it gets passed through to qemu (instead of virt-v2v
> doing the mangling).

On the libvirt side, the #1 most important thing is that a given
XML string

  <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>

results in the same value being exposed to the guest OS, for both
the QEMU and VMWare drivers.  Whehter the guest sees the bytes in
the same order is not essential, but it would be less of a surprise
if it did match.

Ultimately as long as the mapping from libvirt XML to guest visible
string is consistent across drivers, that's sufficient.

> Adding qemu-devel because I'm interesting to see if the qemu
> developers would prefer to fix this properly in qemu.

No matter what order QEMU accepts the data in, it can be said to be
functionally correct. If the order is "unusual" though, it ought to
at least be documented how the QEMU order corresponds to guest visible
order.

Incidentally I wonder how much to trust VMGENID.EXE and whether that
actally reports what it gets out of guest memory "as is", or whether
it has done any byte re-ordering.

I'm curious what Linux kernel sees for the genid on Vmware vs KVM
hosts, as probably I'd trust that data more ?

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Richard W.M. Jones 2 years, 7 months ago
On Wed, Sep 29, 2021 at 10:33:43AM +0100, Daniel P. Berrangé wrote:
> On Wed, Sep 29, 2021 at 10:20:44AM +0100, Richard W.M. Jones wrote:
> > On Wed, Sep 29, 2021 at 10:01:55AM +0200, Michal Privoznik wrote:
> > > Apparently, parsing vmx.genid is not as easy as I thought. Anyway, it
> > > was brought up in a private thread that libvirt doesn't report correct
> > > UUIDs. For instance for the following input:
> > > 
> > >   vm.genid = "-8536691797830587195"
> > >   vm.genidX = "-1723453263670062919"
> > 
> > (The two lines above come from a VMware .vmx file)
> > 
> > The only thing that really matters is what the guest sees.  I ran
> > VMGENID.EXE (https://bugzilla.redhat.com/show_bug.cgi?id=1598350#c3
> > https://docs.microsoft.com/en-gb/windows/win32/hyperv_v2/virtual-machine-generation-identifier)
> > inside the guest and it showed:
> > 
> >   8987940a09512cc5:e81510634ff550b9
> > 
> > Note these numbers are the hex equivalents of the VMware .vmx numbers:
> > 
> > >>> print("%x" % (2**64-8536691797830587195))
> > 8987940a09512cc5
> > >>> print("%x" % (2**64-1723453263670062919))
> > e81510634ff550b9
> > 
> > > Libvirt would report:
> > > 
> > >   <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>
> > > 
> > > while virt-v2v would report:
> > > 
> > >   <genid>09512cc5-940a-8987-b950-f54f631015e8</genid>
> > 
> > After thinking about this a bit more, IMHO the real problem is
> > with qemu.  If you pass this to qemu:
> > 
> >   -device vmgenid,guid=8987940a-0951-2cc5-e815-10634ff550b9,id=vmgenid0
> > 
> > then inside the guest VMGENID shows 2cc509518987940a:b950f54f631015e8 (wrong)
> > 
> > If you pass this to qemu:
> > 
> >   ...guid=09512cc5-940a-8987-b950-f54f631015e8...
> > 
> > then inside the guest it sees 8987940a09512cc5:e81510634ff550b9
> > (the correct values, matching VMware).
> > 
> > This is the reason why virt-v2v mangles the GUID, in order to trick
> > libvirt into passing a mangled GUID which gets mangled again by qemu
> > which is the same as unmangling it.
> > 
> > So another way to fix this could be for us to fix qemu.  We could just
> > pass the two numbers to qemu instead of using GUIDs, eg:
> > 
> >   -device vmgenid,low=0x8987940a09512cc5,high=0xe81510634ff550b9,id=vmgenid0
> > 
> > and then we'd fix the implementation in qemu so that the low/high
> > values match what is seen by VMGENID.EXE in the guest.
> > 
> > Or we could have libvirt use the current GUID in <genid> but to do the
> > mangling when it gets passed through to qemu (instead of virt-v2v
> > doing the mangling).
> 
> On the libvirt side, the #1 most important thing is that a given
> XML string
> 
>   <genid>8987940a-0951-2cc5-e815-10634ff550b9</genid>
> 
> results in the same value being exposed to the guest OS, for both
> the QEMU and VMWare drivers.  Whehter the guest sees the bytes in
> the same order is not essential, but it would be less of a surprise
> if it did match.

I don't know why we decided to use a GUID for this.  The feature
itself (https://go.microsoft.com/fwlink/?LinkId=260709) defines it as
an 128 bit / 8 byte number.  The only connection to GUIDs is the size.

> Ultimately as long as the mapping from libvirt XML to guest visible
> string is consistent across drivers, that's sufficient.
> 
> > Adding qemu-devel because I'm interesting to see if the qemu
> > developers would prefer to fix this properly in qemu.
> 
> No matter what order QEMU accepts the data in, it can be said to be
> functionally correct. If the order is "unusual" though, it ought to
> at least be documented how the QEMU order corresponds to guest visible
> order.
> 
> Incidentally I wonder how much to trust VMGENID.EXE and whether that
> actally reports what it gets out of guest memory "as is", or whether
> it has done any byte re-ordering.
> 
> I'm curious what Linux kernel sees for the genid on Vmware vs KVM
> hosts, as probably I'd trust that data more ?

As far as I can tell no driver has successfully made it upstream,
although there have been a few attempts:

  https://lkml.org/lkml/2018/3/1/498

Here's a more recent one from 10 months ago:

  https://lore.kernel.org/linux-doc/3E05451B-A9CD-4719-99D0-72750A304044@amazon.com/

If I have time I'll patch a Linux kernel with the second patch and see
how it relates to the VMware and KVM parameters, but it probably won't
be today.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
Fedora Windows cross-compiler. Compile Windows programs, test, and
build Windows installers. Over 100 libraries supported.
http://fedoraproject.org/wiki/MinGW

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Daniel P. Berrangé 2 years, 7 months ago
On Wed, Sep 29, 2021 at 10:46:39AM +0100, Richard W.M. Jones wrote:
> On Wed, Sep 29, 2021 at 10:33:43AM +0100, Daniel P. Berrangé wrote:
> > Ultimately as long as the mapping from libvirt XML to guest visible
> > string is consistent across drivers, that's sufficient.
> > 
> > > Adding qemu-devel because I'm interesting to see if the qemu
> > > developers would prefer to fix this properly in qemu.
> > 
> > No matter what order QEMU accepts the data in, it can be said to be
> > functionally correct. If the order is "unusual" though, it ought to
> > at least be documented how the QEMU order corresponds to guest visible
> > order.
> > 
> > Incidentally I wonder how much to trust VMGENID.EXE and whether that
> > actally reports what it gets out of guest memory "as is", or whether
> > it has done any byte re-ordering.
> > 
> > I'm curious what Linux kernel sees for the genid on Vmware vs KVM
> > hosts, as probably I'd trust that data more ?
> 
> As far as I can tell no driver has successfully made it upstream,
> although there have been a few attempts:
> 
>   https://lkml.org/lkml/2018/3/1/498
> 
> Here's a more recent one from 10 months ago:
> 
>   https://lore.kernel.org/linux-doc/3E05451B-A9CD-4719-99D0-72750A304044@amazon.com/
> 
> If I have time I'll patch a Linux kernel with the second patch and see
> how it relates to the VMware and KVM parameters, but it probably won't
> be today.

I'm not sure if we actually need the full driver or not for testing
purposes. The the GenID is just in memory somewhere, and the somewhere
is reported via ACPI table entry. For QEMU its easy as the data is
exposed via fw_cfg which can be read from sysfs directly without
even needing to look at ACPI entries to find it. Not sure how we
find it with VMWare/HyperV though.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Richard W.M. Jones 2 years, 7 months ago
On Wed, Sep 29, 2021 at 11:07:30AM +0100, Daniel P. Berrangé wrote:
> I'm not sure if we actually need the full driver or not for testing
> purposes. The the GenID is just in memory somewhere, and the somewhere
> is reported via ACPI table entry. For QEMU its easy as the data is
> exposed via fw_cfg which can be read from sysfs directly without
> even needing to look at ACPI entries to find it. Not sure how we
> find it with VMWare/HyperV though.

This still has the problem that qemu is mangling the vmgenid.
Nevertheless, on qemu-6.1.0-5.fc36.x86_64 I added this to a Linux
guest:

  <genid>11223344-5566-7788-99aa-bbccddeeff00</genid>

which turned into:

  -device vmgenid,guid=11223344-5566-7788-99aa-bbccddeeff00,id=vmgenid0

Inside the guest:

# ls /sys/firmware/qemu_fw_cfg/by_name/etc/vmgenid_guid/ -l
total 0
-r--------. 1 root root 4096 Sep 29 11:16 key
-r--------. 1 root root 4096 Sep 29 11:16 name
-r--------. 1 root root    0 Sep 29 11:16 raw
-r--------. 1 root root 4096 Sep 29 11:16 size
# hexdump -C /sys/firmware/qemu_fw_cfg/by_name/etc/vmgenid_guid/raw 
00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000020  00 00 00 00 00 00 00 00  44 33 22 11 66 55 88 77  |........D3".fU.w|
00000030  99 aa bb cc dd ee ff 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00001000


But I think what I really need to do is look at the raw physical
address:

# hexdump -C /sys/firmware/qemu_fw_cfg/by_name/etc/vmgenid_addr/raw 
00000000  28 f0 ff 7f 00 00 00 00                           |(.......|
00000008

# dd if=/dev/mem bs=1 skip=$(( 0x7ffff028 )) count=16 | hexdump -C
16+0 records in
16+0 records out
16 bytes copied, 6.0392e-05 s, 265 kB/s
00000000  44 33 22 11 66 55 88 77  99 aa bb cc dd ee ff 00  |D3".fU.w........|
00000010


I think for VMware I'm really going to need the kernel driver, unless
there's some way that iasl can be used to extract the information?

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
libguestfs lets you edit virtual machines.  Supports shell scripting,
bindings from many languages.  http://libguestfs.org

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Richard W.M. Jones 2 years, 7 months ago
On Wed, Sep 29, 2021 at 10:46:38AM +0100, Richard W.M. Jones wrote:
> I don't know why we decided to use a GUID for this.  The feature
> itself (https://go.microsoft.com/fwlink/?LinkId=260709) defines it as
> an 128 bit / 8 byte number.  The only connection to GUIDs is the size.

*cough* .. 16 bytes :-)

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v

Re: [PATCH 0/1] vmx: Fix <genid/> mapping
Posted by Richard W.M. Jones 2 years, 7 months ago
Looking at the qemu code the problem IMHO is:

https://gitlab.com/qemu-project/qemu/-/blob/6b54a31bf7b403672a798b6443b1930ae6c74dea/docs/specs/vmgenid.txt#L189
https://gitlab.com/qemu-project/qemu/-/blob/6b54a31bf7b403672a798b6443b1930ae6c74dea/hw/acpi/vmgenid.c#L37

This byte swapping makes no sense to me.  How do we know that the
guest is little endian?  What will this code do for BE guests?  I
think qemu would be better off treating the "GUID" as a list of bytes
and writing that exactly into the guest memory.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html