This small series of patches aims to add devicetree bindings support for
the Virtual Machine Generation ID (vmgenid) driver.
Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
ACPI only device.
We would like to extend vmgenid to support devicetree bindings because:
1. A device should not be defined as an ACPI or DT only device.
2. Technically there's no issue with adding devicetree support to vmgenid.
3. This would allow Hypervisors to use vmgenid without the need to
enable ACPI. This is important for hypervisors that want to
keep things minimalistic and enable ACPI only when they have
no other alternative.
While adding devicetree support we chose to reuse the existing
structures and code to avoid duplication and reduce maintenance, so
the same driver can be configured by either ACPI or DT.
This also meant reimplementing the existing vmgenid ACPI bus driver as a
platform driver and making it discoverable using `driver.of_match_table`
and `driver.acpi_match_table`.
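As an illustration of the dual-table idea (a userspace model, not code
from the posted patches), the sketch below shows one driver object that
carries both a DT match table and an ACPI match table and binds if either
one matches. The `model_*` types and the compatible string
"example,vmgenid" are hypothetical names introduced for this example; the
ACPI IDs are likewise assumptions made for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Userspace model only. These types imitate the shape of the kernel's
 * of_device_id / acpi_device_id tables; they are not the kernel structures.
 */
struct model_of_id   { const char *compatible; };
struct model_acpi_id { const char *hid; };

struct model_driver {
	const struct model_of_id   *of_match_table;   /* ends with a NULL entry */
	const struct model_acpi_id *acpi_match_table; /* ends with a NULL entry */
};

/* Hypothetical compatible string, for illustration only. */
static const struct model_of_id model_of_ids[] = {
	{ "example,vmgenid" },
	{ NULL }
};

/* ACPI IDs assumed for illustration. */
static const struct model_acpi_id model_acpi_ids[] = {
	{ "VMGENCTR" },
	{ "VM_GEN_COUNTER" },
	{ NULL }
};

/* One driver, discoverable through either firmware description. */
static const struct model_driver model_vmgenid_driver = {
	.of_match_table   = model_of_ids,
	.acpi_match_table = model_acpi_ids,
};

/*
 * Return 1 if the driver matches the device via either table, else 0.
 * Pass NULL for an identifier the firmware description does not provide.
 */
int model_driver_matches(const struct model_driver *drv,
			 const char *dt_compatible, const char *acpi_hid)
{
	const struct model_of_id *o;
	const struct model_acpi_id *a;

	if (dt_compatible && drv->of_match_table)
		for (o = drv->of_match_table; o->compatible; o++)
			if (strcmp(o->compatible, dt_compatible) == 0)
				return 1;

	if (acpi_hid && drv->acpi_match_table)
		for (a = drv->acpi_match_table; a->hid; a++)
			if (strcmp(a->hid, acpi_hid) == 0)
				return 1;

	return 0;
}
```

In this model the probe logic does not care which table produced the
match, which mirrors the point that ACPI users see no change in behaviour.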
There is no user impact or change in vmgenid functionality when used
with ACPI. We verified ACPI support of these patches on x86 and DT
support on ARM using the Firecracker hypervisor
(https://github.com/firecracker-microvm/firecracker).
To check schema and syntax errors, the bindings file is verified with:
```
make dt_binding_check \
DT_SCHEMA_FILES=Documentation/devicetree/bindings/vmgenid/vmgenid.yaml
```
and the patches were verified with:
`scripts/checkpatch.pl --strict v1-000*`.
Sudan Landge (4):
virt: vmgenid: rearrange code to make review easier
virt: vmgenid: change implementation to use a platform driver
dt-bindings: Add bindings for vmgenid
virt: vmgenid: add support for devicetree bindings
.../devicetree/bindings/vmgenid/vmgenid.yaml | 57 +++++
MAINTAINERS | 1 +
drivers/virt/Kconfig | 2 +-
drivers/virt/vmgenid.c | 197 ++++++++++++++----
4 files changed, 221 insertions(+), 36 deletions(-)
create mode 100644 Documentation/devicetree/bindings/vmgenid/vmgenid.yaml
--
2.40.1
On 19/03/2024 15:32, Sudan Landge wrote:
>
> To check schema and syntax errors, the bindings file is verified with:
> ```
> make dt_binding_check \
> DT_SCHEMA_FILES=Documentation/devicetree/bindings/vmgenid/vmgenid.yaml
> ```
> and the patches were verified with:
> `scripts/checkpatch.pl --strict v1-000*`.
BTW, if you insist on that and claim that this is a real hardware thing,
please upstream your hardware DTS...
Best regards,
Krzysztof
On 19/03/2024 15:32, Sudan Landge wrote:
> This small series of patches aims to add devicetree bindings support for
> the Virtual Machine Generation ID (vmgenid) driver.
>
> Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> ACPI only device.
> We would like to extend vmgenid to support devicetree bindings because:
> 1. A device should not be defined as an ACPI or DT only device.
Virtual stuff is not a device, so your first assumption or rationale is
not correct.
Virtual stuff can be ACPI only, because DT is not for Virtual stuff.
Best regards,
Krzysztof
On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> On 19/03/2024 15:32, Sudan Landge wrote:
> > This small series of patches aims to add devicetree bindings support for
> > the Virtual Machine Generation ID (vmgenid) driver.
> >
> > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > ACPI only device.
> > We would like to extend vmgenid to support devicetree bindings because:
> > 1. A device should not be defined as an ACPI or DT only device.
>
> Virtual stuff is not a device, so your first assumption or rationale is
> not correct.
>
> Virtual stuff can be ACPI only, because DT is not for Virtual stuff.
I strongly disagree with this.
Discovering things is what the device-tree is *for*.
We don't want to add extra complexity and overhead on both host and
guest side to make things discoverable in a *less* efficient way.
The device-tree isn't just a last-resort for when we can't possibly do
things differently in a discoverable way. The device-tree is a first-
class citizen and perfectly valid choice as a way to discover things.
We shouldn't be forcing people to turn things into PCI devices just to
avoid adding DT bindings for them.
And we *certainly* shouldn't be directing people towards all the
awfulness of ACPI, and in-kernel bytecode interpreters, and all that
horridness, just because we don't want to use DT to... describe things.
On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
> On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> > On 19/03/2024 15:32, Sudan Landge wrote:
> > > This small series of patches aims to add devicetree bindings support for
> > > the Virtual Machine Generation ID (vmgenid) driver.
> > >
> > > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > > ACPI only device.
> > > We would like to extend vmgenid to support devicetree bindings because:
> > > 1. A device should not be defined as an ACPI or DT only device.
This (and the binding patch) tells me nothing about what "Virtual
Machine Generation ID driver" is and isn't really justification for
"why".
> >
> > Virtual stuff is not a device, so your first assumption or rationale is
> > not correct.
> >
> > Virtual stuff can be ACPI only, because DT is not for Virtual stuff.
>
> I strongly disagree with this.
>
> Discovering things is what the device-tree is *for*.
DT/ACPI is for discovering what hardware folks failed to make
discoverable. But here, both sides are software. Can't the software
folks do better?
This is just the latest in $hypervisor bindings[1][2][3]. The value add
must be hypervisors because every SoC vendor seems to be creating their
own with their own interfaces.
> We don't want to add extra complexity and overhead on both host and
> guest side to make things discoverable in a *less* efficient way.
>
> The device-tree isn't just a last-resort for when we can't possibly do
> things differently in a discoverable way. The device-tree is a first-
> class citizen and perfectly valid choice as a way to discover things.
>
> We shouldn't be forcing people to turn things into PCI devices just to
> avoid adding DT bindings for them.
>
> And we *certainly* shouldn't be directing people towards all the
> awfulness of ACPI, and in-kernel bytecode interpreters, and all that
> horridness, just because we don't want to use DT to... describe things.
I assume you have other calls into the hypervisor and notifications from
the hypervisor? Are you going to add DT nodes for each one? I'd be more
comfortable with DT describing THE communication channel with the
hypervisor than what sounds like a singular function. Otherwise, what's
the next binding?
Rob
[1] https://lore.kernel.org/all/20240222-gunyah-v17-2-1e9da6763d38@quicinc.com/
[2] https://lore.kernel.org/all/20240129083302.26044-4-yi-de.wu@mediatek.com/
[3] https://lore.kernel.org/all/20240127004321.1902477-2-davidai@google.com/
On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
> On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
> > On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> > > On 19/03/2024 15:32, Sudan Landge wrote:
> > > > This small series of patches aims to add devicetree bindings support for
> > > > the Virtual Machine Generation ID (vmgenid) driver.
> > > >
> > > > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > > > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > > > ACPI only device.
> > > > We would like to extend vmgenid to support devicetree bindings because:
> > > > 1. A device should not be defined as an ACPI or DT only device.
>
> This (and the binding patch) tells me nothing about what "Virtual
> Machine Generation ID driver" is and isn't really justification for
> "why".
It's a reference to a memory area which the OS can use to tell whether
it's been snapshotted and restored (or 'forked'). A future submission
should have a reference to something like
https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
doc which is linked from there.
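To make those semantics concrete, here is a minimal userspace sketch (not
code from the series) of how a guest might notice a snapshot/restore: keep
a copy of the 128-bit generation ID and compare it against the memory
region whenever a change is signalled. The names `genid_state`,
`genid_init` and `genid_check` are hypothetical, and what the guest does
on a change (for example, reseeding its RNG) is left to the caller.

```c
#include <stdint.h>
#include <string.h>

#define GENID_SIZE 16 /* the generation ID is a 128-bit value */

/* Last generation ID this guest has observed. */
struct genid_state {
	uint8_t last_seen[GENID_SIZE];
};

/* Record the ID currently exposed in the shared memory region. */
void genid_init(struct genid_state *st, const uint8_t *region)
{
	memcpy(st->last_seen, region, GENID_SIZE);
}

/*
 * Compare the region against the last observed ID.
 * Returns 1 (and remembers the new ID) if it changed, 0 otherwise.
 */
int genid_check(struct genid_state *st, const uint8_t *region)
{
	if (memcmp(st->last_seen, region, GENID_SIZE) == 0)
		return 0;

	memcpy(st->last_seen, region, GENID_SIZE);
	return 1;
}
```

The only thing the guest needs from the platform is the location of that
region and, optionally, a notification that its contents changed.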
> DT/ACPI is for discovering what hardware folks failed to make
> discoverable. But here, both sides are software. Can't the software
> folks do better?
We are. Using device-tree *is* better. :)
> This is just the latest in $hypervisor bindings[1][2][3]. The value add
> must be hypervisors because every SoC vendor seems to be creating their
> own with their own interfaces.
The VMGenId one is cross-platform; we don't *want* to reinvent the
wheel there. We just want to discover that same memory area with
precisely the same semantics, but through the device-tree instead of
being forced to shoe-horn the whole of the ACPI horridness into a
platform which doesn't need it. (Or make it the BAR of a newly-invented
PCI device and have to add PCI to a microVM platform which doesn't
otherwise need it, etc.)
> I assume you have other calls into the hypervisor and notifications from
> the hypervisor? Are you going to add DT nodes for each one? I'd be more
> comfortable with DT describing THE communication channel with the
> hypervisor than what sounds like a singular function.
This isn't hypervisor-specific. There is a memory region with certain
semantics which may exist on all kinds of platforms, and we're just
allowing the guest to discover where it is. I don't see how it fits
into the model you're describing above.
> Otherwise, what's the next binding?
You meant that last as a rhetorical question, but I'll answer it
anyway. The thing I'm actually working on this week is a mechanism to
expose clock synchronisation (since it's kind of pointless for *all* of
the guests running on a host to run NTP/PTP/PPS/whatever to calibrate
the *same* underlying oscillator).
As far as the *discoverability* is concerned, it's fundamentally the
same thing — just a memory region with certain defined semantics, and
probably an interrupt for when the contents change.
There *isn't* an ACPI specification for that one already; I was
thinking of describing it *only* in DT and if someone wants it on a
platform which is afflicted with ACPI, they can just do it in a PRP0001
device.
As with vmgenid, there's really very little benefit to wrapping a whole
bunch of pointless emulated "hardware discoverability" around it, when
it can just be described by ACPI/DT directly. That's what they're
*for*.
On Wed, Mar 20, 2024 at 04:55:45PM +0000, David Woodhouse wrote:
> On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
> > On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
> > > On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> > > > On 19/03/2024 15:32, Sudan Landge wrote:
> > > > > This small series of patches aims to add devicetree bindings support for
> > > > > the Virtual Machine Generation ID (vmgenid) driver.
> > > > >
> > > > > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > > > > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > > > > ACPI only device.
> > > > > We would like to extend vmgenid to support devicetree bindings because:
> > > > > 1. A device should not be defined as an ACPI or DT only device.
> >
> > This (and the binding patch) tells me nothing about what "Virtual
> > Machine Generation ID driver" is and isn't really justification for
> > "why".
>
> It's a reference to a memory area which the OS can use to tell whether
> it's been snapshotted and restored (or 'forked'). A future submission
> should have a reference to something like
> https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
> doc which is linked from there.
That doc mentions fw_cfg for which we already have a binding. Why can't
it be used/extended here?
Rob
[1] https://www.kernel.org/doc/Documentation/devicetree/bindings/firmware/qemu%2Cfw-cfg-mmio.yaml
On 21/03/2024 13:32, Rob Herring wrote:
> On Wed, Mar 20, 2024 at 04:55:45PM +0000, David Woodhouse wrote:
>> On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
>>> On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
>>>> On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
>>>>> On 19/03/2024 15:32, Sudan Landge wrote:
>>>>>> This small series of patches aims to add devicetree bindings support for
>>>>>> the Virtual Machine Generation ID (vmgenid) driver.
>>>>>>
>>>>>> Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
>>>>>> ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
>>>>>> ACPI only device.
>>>>>> We would like to extend vmgenid to support devicetree bindings because:
>>>>>> 1. A device should not be defined as an ACPI or DT only device.
>>>
>>> This (and the binding patch) tells me nothing about what "Virtual
>>> Machine Generation ID driver" is and isn't really justification for
>>> "why".
>>
>> It's a reference to a memory area which the OS can use to tell whether
>> it's been snapshotted and restored (or 'forked'). A future submission
>> should have a reference to something like
>> https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
>> doc which is linked from there.
>
> That doc mentions fw_cfg for which we already have a binding. Why can't
> it be used/extended here?
QEMU has support for vmgenid but even they do not pass vmgenid directly
to the guest kernel using fw_cfg. QEMU passes the vmgenid/UUID via
fw_cfg to an intermediate UEFI firmware. This UEFI firmware, running as
a guest in QEMU, reads the UUID from fw_cfg and creates ACPI tables for
it. The UEFI firmware then passes the UUID information to the guest
kernel via ACPI.
This approach requires yet another intermediary which is UEFI firmware
and adds more complexity than ACPI for minimalist hypervisors that do
not have an intermediate bootloader today.
>
> Rob
>
> [1] https://www.kernel.org/doc/Documentation/devicetree/bindings/firmware/qemu%2Cfw-cfg-mmio.yaml
On 21/03/2024 18:39, Landge, Sudan wrote:
>
>
> On 21/03/2024 13:32, Rob Herring wrote:
>> On Wed, Mar 20, 2024 at 04:55:45PM +0000, David Woodhouse wrote:
>>> On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
>>>> On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
>>>>> On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
>>>>>> On 19/03/2024 15:32, Sudan Landge wrote:
>>>>>>> This small series of patches aims to add devicetree bindings support for
>>>>>>> the Virtual Machine Generation ID (vmgenid) driver.
>>>>>>>
>>>>>>> Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
>>>>>>> ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
>>>>>>> ACPI only device.
>>>>>>> We would like to extend vmgenid to support devicetree bindings because:
>>>>>>> 1. A device should not be defined as an ACPI or DT only device.
>>>>
>>>> This (and the binding patch) tells me nothing about what "Virtual
>>>> Machine Generation ID driver" is and isn't really justification for
>>>> "why".
>>>
>>> It's a reference to a memory area which the OS can use to tell whether
>>> it's been snapshotted and restored (or 'forked'). A future submission
>>> should have a reference to something like
>>> https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
>>> doc which is linked from there.
>>
>> That doc mentions fw_cfg for which we already have a binding. Why can't
>> it be used/extended here?
> QEMU has support for vmgenid but even they do not pass vmgenid directly
> to the guest kernel using fw_cfg. QEMU passes the vmgenid/UUID via
> fw_cfg to an intermediate UEFI firmware. This UEFI firmware, running as
> a guest in QEMU, reads the UUID from fw_cfg and creates ACPI tables for
> it. The UEFI firmware then passes the UUID information to the guest
> kernel via ACPI.
> This approach requires yet another intermediary which is UEFI firmware
> and adds more complexity than ACPI for minimalist hypervisors that do
> not have an intermediate bootloader today.
What stops you from passing fw_cfg not to UEFI FW? BTW, no actual VM
name was used in your posting, but now suddenly it is a talk about QEMU.
Best regards,
Krzysztof
On Fri, 2024-03-22 at 06:40 +0100, Krzysztof Kozlowski wrote:
> On 21/03/2024 18:39, Landge, Sudan wrote:
> >
> >
> > On 21/03/2024 13:32, Rob Herring wrote:
> > > On Wed, Mar 20, 2024 at 04:55:45PM +0000, David Woodhouse wrote:
> > > > On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
> > > > > On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
> > > > > > On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> > > > > > > On 19/03/2024 15:32, Sudan Landge wrote:
> > > > > > > > This small series of patches aims to add devicetree bindings support for
> > > > > > > > the Virtual Machine Generation ID (vmgenid) driver.
> > > > > > > >
> > > > > > > > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > > > > > > > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > > > > > > > ACPI only device.
> > > > > > > > We would like to extend vmgenid to support devicetree bindings because:
> > > > > > > > 1. A device should not be defined as an ACPI or DT only device.
> > > > >
> > > > > This (and the binding patch) tells me nothing about what "Virtual
> > > > > Machine Generation ID driver" is and isn't really justification for
> > > > > "why".
> > > >
> > > > It's a reference to a memory area which the OS can use to tell whether
> > > > it's been snapshotted and restored (or 'forked'). A future submission
> > > > should have a reference to something like
> > > > https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
> > > > doc which is linked from there.
> > >
> > > That doc mentions fw_cfg for which we already have a binding. Why can't
> > > it be used/extended here?
> > QEMU has support for vmgenid but even they do not pass vmgenid directly
> > to the guest kernel using fw_cfg. QEMU passes the vmgenid/UUID via
> > fw_cfg to an intermediate UEFI firmware. This UEFI firmware, running as
> > a guest in QEMU, reads the UUID from fw_cfg and creates ACPI tables for
> > it. The UEFI firmware then passes the UUID information to the guest
> > kernel via ACPI.
> > This approach requires yet another intermediary which is UEFI firmware
> > and adds more complexity than ACPI for minimalist hypervisors that do
> > not have an intermediate bootloader today.
>
> What stops you from passing fw_cfg not to UEFI FW? BTW, no actual VM
> name was used in your posting, but now suddenly it is a talk about QEMU.
That would be possible. But not ideal. Just as exposing it via PCI
would be possible, but not ideal. Or forcing ACPI onto the guests in
question, and various other less efficient options.
If what we're really looking at here is a hostile takeover of the DT
bindings repository, with a blanket "No, DT is dead. Go use something
else, preferably ACPI", then all those other options are possible. We
*never* have to add a new binding to DT ever again. Let's just set the
existing bindings in stone and move on.
But hopefully that isn't the case. DT is the simplest and most
effective way to provide discovery, especially for embedded and microVM
systems. It isn't just a *workaround* for broken hardware which *can't*
do a slower and more complex form of discovery.
And it's absolutely the right thing to do for exposing the equivalent
of the ACPI vmgenid device in a system which isn't afflicted by ACPI
and doesn't *want* to be.
On Fri, Mar 22, 2024 at 3:21 AM David Woodhouse <dwmw2@infradead.org> wrote:
>
> On Fri, 2024-03-22 at 06:40 +0100, Krzysztof Kozlowski wrote:
> > On 21/03/2024 18:39, Landge, Sudan wrote:
> > >
> > >
> > > On 21/03/2024 13:32, Rob Herring wrote:
> > > > On Wed, Mar 20, 2024 at 04:55:45PM +0000, David Woodhouse wrote:
> > > > > On Wed, 2024-03-20 at 11:15 -0500, Rob Herring wrote:
> > > > > > On Wed, Mar 20, 2024 at 01:50:43PM +0000, David Woodhouse wrote:
> > > > > > > On Tue, 2024-03-19 at 16:24 +0100, Krzysztof Kozlowski wrote:
> > > > > > > > On 19/03/2024 15:32, Sudan Landge wrote:
> > > > > > > > > This small series of patches aims to add devicetree bindings support for
> > > > > > > > > the Virtual Machine Generation ID (vmgenid) driver.
> > > > > > > > >
> > > > > > > > > Virtual Machine Generation ID driver was introduced in commit af6b54e2b5ba
> > > > > > > > > ("virt: vmgenid: notify RNG of VM fork and supply generation ID") as an
> > > > > > > > > ACPI only device.
> > > > > > > > > We would like to extend vmgenid to support devicetree bindings because:
> > > > > > > > > 1. A device should not be defined as an ACPI or DT only device.
> > > > > >
> > > > > > This (and the binding patch) tells me nothing about what "Virtual
> > > > > > Machine Generation ID driver" is and isn't really justification for
> > > > > > "why".
> > > > >
> > > > > It's a reference to a memory area which the OS can use to tell whether
> > > > > it's been snapshotted and restored (or 'forked'). A future submission
> > > > > should have a reference to something like
> > > > > https://www.qemu.org/docs/master/specs/vmgenid.html or the Microsoft
> > > > > doc which is linked from there.
> > > >
> > > > That doc mentions fw_cfg for which we already have a binding. Why can't
> > > > it be used/extended here?
> > > QEMU has support for vmgenid but even they do not pass vmgenid directly
> > > to the guest kernel using fw_cfg. QEMU passes the vmgenid/UUID via
> > > fw_cfg to an intermediate UEFI firmware. This UEFI firmware, running as
> > > a guest in QEMU, reads the UUID from fw_cfg and creates ACPI tables for
> > > it. The UEFI firmware then passes the UUID information to the guest
> > > kernel via ACPI.
> > > This approach requires yet another intermediary which is UEFI firmware
> > > and adds more complexity than ACPI for minimalist hypervisors that do
> > > not have an intermediate bootloader today.
> >
> > What stops you from passing fw_cfg not to UEFI FW? BTW, no actual VM
> > name was used in your posting, but now suddenly it is a talk about QEMU.
>
> That would be possible. But not ideal.
Why not ideal?
To rephrase the question, why is it fine for UEFI to read the vmgenid
from fw_cfg, but the kernel can't use the same mechanism? The response
that you'd have to use UEFI to use fw_cfg makes no sense to me. The
only reason I can think of is just being lazy and wanting to have
minimal changes to some existing driver. It looks to me like you could
implement this entirely in userspace already with zero kernel or
binding changes. From a quick look, we already have a fw_cfg driver
exposing UUID (that's the same thing as vmgenid AIUI) to userspace,
and you can feed that back into the random pool.
I am concerned that we already have a mechanism and you want to add a
second way. When do we ever think that's a good idea? What happens on
the next piece of fw_cfg data? We add yet another binding?
> Just as exposing it via PCI
> would be possible, but not ideal. Or forcing ACPI onto the guests in
> question, and various other less efficient options.
>
> If what we're really looking at here is a hostile takeover of the DT
> bindings repository, with a blanket "No, DT is dead. Go use something
> else, preferably ACPI", then all those other options are possible. We
> *never* have to add a new binding to DT ever again. Let's just set the
> existing bindings in stone and move on.
I'll refrain from all my snarky replies to this that aren't helpful to
the discussion.
Rob
On Fri, 2024-03-22 at 08:22 -0500, Rob Herring wrote:
>
> > > What stops you from passing fw_cfg not to UEFI FW? BTW, no actual VM
> > > name was used in your posting, but now suddenly it is a talk about QEMU.
(Forgot to address the second part of that last time. No specific VMM
was mentioned in the first place because this isn't VMM-specific)
> > That would be possible. But not ideal.
>
> Why not ideal?
>
> To rephrase the question, why is it fine for UEFI to read the vmgenid
> from fw_cfg, but the kernel can't use the same mechanism?
Because fw_cfg is an incestuous way to get data from the VMM into the BIOS
(both SeaBIOS and UEFI). It's the way we pass the ACPI tables and
things like that.
It *isn't* designed as a general-purpose way of doing device discovery
for use by various operating systems.
I'm also not sure Firecracker, which is the VMM Sudan is working on,
even *has* fw_cfg. Especially on ARM. If we're going to be forced to
add some complicated device with MMIO and DMA just to be able to
advertise the existence of a simple memory region, that's just as bad
as being forced to expose it as an emulated PCI device.
This is what DT is *for*.
> The response
> that you'd have to use UEFI to use fw_cfg makes no sense to me. The
> only reason I can think of is just being lazy and wanting to have
> minimal changes to some existing driver. It looks to me like you could
> implement this entirely in userspace already with zero kernel or
> binding changes. From a quick look, we already have a fw_cfg driver
> exposing UUID (that's the same thing as vmgenid AIUI) to userspace,
> and you can feed that back into the random pool.
>
> I am concerned that we already have a mechanism and you want to add a
> second way. When do we ever think that's a good idea? What happens
> on the next piece of fw_cfg data? We add yet another binding?
No, because fw_cfg is a way for the VMM to give configuration
information to the firmware. There's a clue in the name. The firmware
then sets up ACPI tables or DT to pass information in a more coherent
and structured fashion to general-purpose operating systems.
And some VMMs *don't* use fw_cfg at all because for the minimal microvm
case it's overkill.
On 22/03/2024 14:27, David Woodhouse wrote:
> On Fri, 2024-03-22 at 08:22 -0500, Rob Herring wrote:
>>
>>>> What stops you from passing fw_cfg not to UEFI FW? BTW, no actual VM
>>>> name was used in your posting, but now suddenly it is a talk about QEMU.
>
> (Forgot to address the second part of that last time. No specific VMM
> was mentioned in the first place because this isn't VMM-specific)
>
QEMU is referenced to explain `vmgenid`, which it also uses and documents
more thoroughly. We mentioned the hypervisor we tested the changes with
in the cover letter (https://github.com/firecracker-microvm/firecracker),
but this change isn't VMM-specific.
>>> That would be possible. But not ideal.
>>
>> Why not ideal?
>>
>> To rephrase the question, why is it fine for UEFI to read the vmgenid
>> from fw_cfg, but the kernel can't use the same mechanism?
>
> Because fw_cfg is an incestuous way to get data from the VMM into the BIOS
> (both SeaBIOS and UEFI). It's the way we pass the ACPI tables and
> things like that.
>
> It *isn't* designed as a general-purpose way of doing device discovery
> for use by various operating systems.
>
> I'm also not sure Firecracker, which is the VMM Sudan is working on,
> even *has* fw_cfg. Especially on ARM. If we're going to be forced to
> add some complicated device with MMIO and DMA just to be able to
> advertise the existence of a simple memory region, that's just as bad
> as being forced to expose it as an emulated PCI device.
>
> This is what DT is *for*.
>
>
>> The response
>> that you'd have to use UEFI to use fw_cfg makes no sense to me. The
>> only reason I can think of is just being lazy and wanting to have
>> minimal changes to some existing driver. It looks to me like you could
>> implement this entirely in userspace already with zero kernel or
>> binding changes. From a quick look, we already have a fw_cfg driver
>> exposing UUID (that's the same thing as vmgenid AIUI) to userspace,
>> and you can feed that back into the random pool.
>>
>> I am concerned that we already have a mechanism and you want to add a
>> second way. When do we ever think that's a good idea? What happens
>> on the next piece of fw_cfg data? We add yet another binding?
>
> No, because fw_cfg is a way for the VMM to give configuration
> information to the firmware. There's a clue in the name. The firmware
> then sets up ACPI tables or DT to pass information in a more coherent
> and structured fashion to general-purpose operating systems.
>
> And some VMMs *don't* use fw_cfg at all because for the minimal microvm
> case it's overkill.
>
The hypervisor we work on
(https://github.com/firecracker-microvm/firecracker) does not have
fw_cfg; it loads the kernel directly without the need for UEFI or any
intermediate firmware. It is, as said, overkill to enable UEFI and
fw_cfg just to support `vmgenid`, especially when there is an alternative
available which could keep things simple for the VMM and for the Linux
driver.