If the user specified a size for the PCI MMIO window via the option:
-fw_cfg name=opt/ovmf/X-PciMmio64Mb,string=32768
then this patch ensures that the MMIO window is not resized again.
Essentially, this prevents the change introduced in the following
patch from taking effect:
commit ecb778d0ac62560aa172786ba19521f27bc3f650
Author: Gerd Hoffmann <kraxel@redhat.com>
Date: Tue Oct 4 15:47:27 2022 +0200
OvmfPkg/PlatformInitLib: dynamic mmio window size
In case we have a reliable PhysMemAddressWidth use that to dynamically
size the 64bit address window. Allocate 1/8 of the physical address
space and place the window at the upper end of the address space.
The problem this patch is trying to solve is the VFIO mapping failures:
VFIO_MAP_DMA failed: Invalid argument
vfio_dma_map(0x557b2f2736d0, 0x380000000000, 0x1000000, 0x7f98ac400000) = -22 (Invalid argument)
that occur when we try to pass through the graphics device to the guest:
qemu-system-x86_64 -m 4096 -enable-kvm -cpu host -smp cores=4,threads=2,sockets=1
-device vfio-pci,host=0000:00:02.0 -bios OVMF.fd -nographic
The above failures seem to occur because of a mismatch between the
PhysMemAddressWidth and the Host IOMMU address width. More specifically,
if the PhysMemAddressWidth is bigger than the IOMMU address width,
VFIO fails to map the MMIO regions as the IOVAs would be larger
than the IOMMU aperture regions. When tested on modern Intel platforms
such as ADL, MTL, etc, OVMF determines PhysMemAddressWidth = 46 which
matches the Host address width but the IOMMU address width seems to
range anywhere from 38 to 48 depending on the IOMMU hardware
capabilities, version, etc.
One way to address this issue is to ensure that PhysMemAddressWidth
matches the IOMMU address width:
-cpu host,host-phys-bits=on,host-phys-bits-limit=<IOMMU address width>
However, this requires the user to figure out the IOMMU address width,
which can be determined by looking at bits 21:16 of the cap value:
cat /sys/devices/virtual/iommu/dmar0/intel-iommu/cap
or by reading the DMAR_CAP_REG register. But this does not seem like
a reasonable approach to solve this problem.
Therefore, this problem requires an OVMF specific solution to retain
the prior behavior. To this end, this patch reuses the X-PciMmio64Mb
option to opt out of the behavior introduced in the above commit
instead of adding a new option or mechanism.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Ard Biesheuvel <ardb@kernel.org>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Dongwon Kim <dongwon.kim@intel.com>
Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>
---
OvmfPkg/Include/Library/PlatformInitLib.h | 1 +
OvmfPkg/Library/PlatformInitLib/MemDetect.c | 4 +++-
2 files changed, 4 insertions(+), 1 deletion(-)
diff --git a/OvmfPkg/Include/Library/PlatformInitLib.h b/OvmfPkg/Include/Library/PlatformInitLib.h
index 57b18b94d9..e8ea3defa2 100644
--- a/OvmfPkg/Include/Library/PlatformInitLib.h
+++ b/OvmfPkg/Include/Library/PlatformInitLib.h
@@ -55,6 +55,7 @@ typedef struct {
   BOOLEAN               QemuFwCfgChecked;
   BOOLEAN               QemuFwCfgSupported;
   BOOLEAN               QemuFwCfgDmaSupported;
+  BOOLEAN               QemuFwCfgSizeSpecified;
 } EFI_HOB_PLATFORM_INFO;
 #pragma pack()
diff --git a/OvmfPkg/Library/PlatformInitLib/MemDetect.c b/OvmfPkg/Library/PlatformInitLib/MemDetect.c
index 662e7e85bb..a53a1e24e4 100644
--- a/OvmfPkg/Library/PlatformInitLib/MemDetect.c
+++ b/OvmfPkg/Library/PlatformInitLib/MemDetect.c
@@ -485,6 +485,7 @@ PlatformGetFirstNonAddress (
     case EFI_SUCCESS:
       if (FwCfgPciMmio64Mb <= 0x1000000) {
         PlatformInfoHob->PcdPciMmio64Size = LShiftU64 (FwCfgPciMmio64Mb, 20);
+        PlatformInfoHob->QemuFwCfgSizeSpecified = TRUE;
         break;
       }
@@ -861,7 +862,8 @@ PlatformAddressWidthInitialization (
   }
 
   PlatformAddressWidthFromCpuid (PlatformInfoHob, TRUE);
-  if (PlatformInfoHob->PhysMemAddressWidth != 0) {
+  if (PlatformInfoHob->PhysMemAddressWidth != 0 &&
+      !PlatformInfoHob->QemuFwCfgSizeSpecified) {
     // physical address width is known
     PlatformDynamicMmioWindow (PlatformInfoHob);
     return;
--
2.39.2
-=-=-=-=-=-=-=-=-=-=-=-
Groups.io Links: You receive all messages sent to this group.
View/Reply Online (#110609): https://edk2.groups.io/g/devel/message/110609
Mute This Topic: https://groups.io/mt/102359124/1787277
Group Owner: devel+owner@edk2.groups.io
Unsubscribe: https://edk2.groups.io/g/devel/unsub [importer@patchew.org]
-=-=-=-=-=-=-=-=-=-=-=-
On 11/3/23 06:15, Vivek Kasireddy wrote:

[...]

> One way to address this issue is if we ensure that PhysMemAddressWidth
> matches IOMMU address width:
> -cpu host,host-phys-bits=on,host-phys-bits-limit=<IOMMU address width>
> However, this requires the user to figure out the IOMMU address width;
> which can be determined by looking at the 16-21 bits of the cap value:
> cat /sys/devices/virtual/iommu/dmar0/intel-iommu/cap
> or by reading the DMAR_CAP_REG register. But this does not seem like
> a reasonable approach to solve this problem.

Very nice problem description, already outlining the solution as well.

> Therefore, this problem requires an OVMF specific solution to retain
> the prior behavior. To this end, this patch reuses the X-PciMmio64Mb
> option to opt-out of the behavior introduced in the above commit
> instead of adding a new option or mechanism.

No, the right solution is to enhance QEMU to query the host IOMMU
address width. Then the following options arise:

- either pass *both* the host CPU address width *and* the host IOMMU
  address width down to OVMF, and teach OVMF to pick the stricter
  limitation, for dynamically sizing the MMIO window

- or let QEMU calculate the stricter width internally, and pass that
  (sole, scalar) piece of information down to OVMF. Teach OVMF to query
  this new piece of information, and size the MMIO window accordingly.

Basically the QEMU command line-based workaround that you describe is
what we need to automate (except we need a new information channel for
it, because presenting the strict host IOMMU address width as the VCPU
address width (via CPUID) to the guest is smelly).

I agree that the proposed patch can function as a stop-gap, but the QEMU
command line hack is already a stop-gap. And for the long term, this
patch is not good enough. We should enhance the dynamic sizing, now that
Gerd has put it into place.

Thanks
Laszlo
On 11/3/23 14:15, Laszlo Ersek wrote:
> On 11/3/23 06:15, Vivek Kasireddy wrote:

[...]

> I agree that the proposed patch can function as a stop-gap, but the QEMU
> command line hack is already a stop-gap. And for the long term, this
> patch is not good enough. We should enhance the dynamic sizing, now that
> Gerd has put it into place.

... I do agree however that the current behavior is strange -- the user
specifies an explicit fw_cfg knob for OVMF, and OVMF ignores it (for
whatever reason).

I'd like to know what Gerd thinks of this.

Personally I'd ACK the *code* in this patch, just to restore the correct
priority between the dynamic sizing and the explicit fw_cfg knob, if:

(a) the commit message referred to that exactly (i.e., to the proper
priority between these two configuration avenues), and

(b) there were a promise to enhance QEMU and OVMF as I suggest above.

I don't want the fw_cfg knob to stick around as a permanent excuse for
not improving the dynamic sizing -- now that we *do have* dynamic
sizing.

BTW: I'd suggest renaming "QemuFwCfgSizeSpecified" to
"PciMmio64MbOverride" or something like that. The important trait is
that the value is an override (or direct setting) from the user; fw_cfg
is incidental / irrelevant where the value is consumed.

Thanks!
Laszlo
Hi,

>> I agree that the proposed patch can function as a stop-gap, but the
>> QEMU command line hack is already a stop-gap. And for the long term,
>> this patch is not good enough. We should enhance the dynamic sizing,
>> now that Gerd has put it into place.
>
> ... I do agree however that the current behavior is strange -- the user
> specifies an explicit fw_cfg knob for OVMF, and OVMF ignores it (for
> whatever reason).
>
> I'd like to know what Gerd thinks of this.

The current code effectively considers the user-specified PciMmio64Size
as a lower limit: it will never be smaller, but might be larger in case
OVMF figures it has enough space.

Being more strict here and using the user-specified PciMmio64Size as-is,
no matter what, is fine with me.

The independent but related question is where the window should be
placed in case we have a valid vcpu address space size and a
PciMmio64Size specified by the user.

> (b) there were a promise to enhance QEMU and OVMF as I suggest above.

Fully agree. We should explicitly communicate requirements (in this
case: the IOMMU address space size) instead of depending on side effects
of unrelated config options.

Strictly speaking you don't care that much about the size of the MMIO
window, but where it gets placed. Moving it to the end of the vcpu
address space is what breaks your use case in case the IOMMU address
space happens to be too small for that.

take care,
  Gerd
Hi Gerd,

> The current code effectively considers the user-specified PciMmio64Size
> as a lower limit, it will never be smaller but might be larger in case
> OVMF figures it has enough space.
>
> Being more strict here and use the user-specified PciMmio64Size as-is no
> matter what is fine with me.

[...]

> Strictly speaking you don't care that much about the size of the mmio
> window, but where it gets placed. Moving it to the end of the vcpu
> address space is what breaks your use case in case the iommu address
> space happens to be too small for that.

Right, the crux of the problem is indeed the placement of the window and
not the size. Given this, do you see any problem if the MMIO window were
to be placed at the lower end of the address space? Or, do you think
ensuring that PhysMemAddressWidth = <IOMMU address width> automatically
via QEMU/OVMF, as Laszlo suggested, is a better solution for this
problem?

Thanks,
Vivek
Hi,

>> Strictly speaking you don't care that much about the size of the mmio
>> window, but where it gets placed. Moving it to the end of the vcpu
>> address space is what breaks your use case in case the iommu address
>> space happens to be too small for that.
>
> Right, the crux of the problem is indeed the placement of the window
> and not the size. Given this, do you see any problem if the mmio window
> were to be placed at the lower end of the address space?

That might still not work because OVMF scales the window size with the
address space.

> Or, do you think ensuring that PhysMemAddressWidth = <IOMMU address
> width> automatically via Qemu/OVMF like Laszlo suggested is a better
> solution for this problem?

The best temporary stop-gap IMHO is using

  -cpu host,host-phys-bits=on,host-phys-bits-limit=<iommu-address-bits>

Side note: the qemu master branch has a precompiled seabios version
which also does iommu window scaling and will need the same treatment.

How to solve that long-term should be discussed on the qemu-devel list.

take care,
  Gerd
Hi Gerd,

> The best temporary stop-gap IMHO is using
>
>   -cpu host,host-phys-bits=on,host-phys-bits-limit=<iommu-address-bits>
>
> Side note: the qemu master branch has a precompiled seabios version
> which also does iommu window scaling and will need the same treatment.
>
> How to solve that long-term should be discussed on the qemu-devel list.

Ok, I'll look into how this problem can be solved in QEMU and will
discuss this issue further on qemu-devel.

Thanks,
Vivek