From: Mario Limonciello <mario.limonciello@amd.com>
On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the
AMD GPU is not being selected by some desktop environments for any
rendering tasks. This is because neither GPU is being treated as
"boot_vga" but that is what some environments use to select a GPU [1].
The VGA arbiter driver only looks at devices that report as PCI display
VGA class devices. Neither GPU on the system is a PCI display VGA class
device:
c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1)
c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1)
If the GPUs were looked at the vga_is_firmware_default() function actually
does do a good job at recognizing the case from the device used for the
firmware framebuffer.
Modify the VGA arbiter code and matching sysfs file entries to examine all
PCI display class devices. The existing logic stays the same.
This will cause all GPUs to gain a `boot_vga` file, but the correct device
(AMD GPU in this case) will now show `1` and the incorrect device shows `0`.
Userspace then picks the right device as well.
Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1]
Suggested-by: Daniel Dadap <ddadap@nvidia.com>
Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
drivers/pci/pci-sysfs.c | 2 +-
drivers/pci/vgaarb.c | 8 ++++----
2 files changed, 5 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index 268c69daa4d57..c314ee1b3f9ac 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
struct device *dev = kobj_to_dev(kobj);
struct pci_dev *pdev = to_pci_dev(dev);
- if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev))
+ if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev))
return a->mode;
return 0;
diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
index 78748e8d2dbae..63216e5787d73 100644
--- a/drivers/pci/vgaarb.c
+++ b/drivers/pci/vgaarb.c
@@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
vgaarb_dbg(dev, "%s\n", __func__);
- /* Only deal with VGA class devices */
- if (!pci_is_vga(pdev))
+ /* Only deal with PCI display class devices */
+ if (!pci_is_display(pdev))
return 0;
/*
@@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void)
bus_register_notifier(&pci_bus_type, &pci_notifier);
- /* Add all VGA class PCI devices by default */
+ /* Add all PCI display class devices by default */
pdev = NULL;
while ((pdev =
pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
PCI_ANY_ID, pdev)) != NULL) {
- if (pci_is_vga(pdev))
+ if (pci_is_display(pdev))
vga_arbiter_add_pci_device(pdev);
}
--
2.43.0
On Tue, 17 Jun 2025 12:59:10 -0500 Mario Limonciello <superm1@kernel.org> wrote: > From: Mario Limonciello <mario.limonciello@amd.com> > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > AMD GPU is not being selected by some desktop environments for any > rendering tasks. This is because neither GPU is being treated as > "boot_vga" but that is what some environments use to select a GPU [1]. > > The VGA arbiter driver only looks at devices that report as PCI display > VGA class devices. Neither GPU on the system is a PCI display VGA class > device: > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) > > If the GPUs were looked at the vga_is_firmware_default() function actually > does do a good job at recognizing the case from the device used for the > firmware framebuffer. > > Modify the VGA arbiter code and matching sysfs file entries to examine all > PCI display class devices. The existing logic stays the same. > > This will cause all GPUs to gain a `boot_vga` file, but the correct device > (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. > Userspace then picks the right device as well. > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > drivers/pci/pci-sysfs.c | 2 +- > drivers/pci/vgaarb.c | 8 ++++---- > 2 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 268c69daa4d57..c314ee1b3f9ac 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, > struct device *dev = kobj_to_dev(kobj); > struct pci_dev *pdev = to_pci_dev(dev); > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > return a->mode; > > return 0; > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c > index 78748e8d2dbae..63216e5787d73 100644 > --- a/drivers/pci/vgaarb.c > +++ b/drivers/pci/vgaarb.c > @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action, > > vgaarb_dbg(dev, "%s\n", __func__); > > - /* Only deal with VGA class devices */ > - if (!pci_is_vga(pdev)) > + /* Only deal with PCI display class devices */ > + if (!pci_is_display(pdev)) > return 0; > > /* > @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) > > bus_register_notifier(&pci_bus_type, &pci_notifier); > > - /* Add all VGA class PCI devices by default */ > + /* Add all PCI display class devices by default */ > pdev = NULL; > while ((pdev = > pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, > PCI_ANY_ID, pdev)) != NULL) { > - if (pci_is_vga(pdev)) > + if (pci_is_display(pdev)) > vga_arbiter_add_pci_device(pdev); > } > At the very least a non-VGA device should not mark that it decodes legacy resources, marking the boot VGA device is only a part of what the VGA arbiter does. It seems none of the actual VGA arbitration interfaces have been considered here though. I still think this is a bad idea and I'm not sure Thomas didn't withdraw his ack in the previous round[1]. Thanks, Alex [1]https://lore.kernel.org/all/bc0a3ac2-c86c-43b8-b83f-edfdfa5ee184@suse.de/
On 6/17/25 2:22 PM, Alex Williamson wrote: > On Tue, 17 Jun 2025 12:59:10 -0500 > Mario Limonciello <superm1@kernel.org> wrote: > >> From: Mario Limonciello <mario.limonciello@amd.com> >> >> On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the >> AMD GPU is not being selected by some desktop environments for any >> rendering tasks. This is because neither GPU is being treated as >> "boot_vga" but that is what some environments use to select a GPU [1]. >> >> The VGA arbiter driver only looks at devices that report as PCI display >> VGA class devices. Neither GPU on the system is a PCI display VGA class >> device: >> >> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) >> >> If the GPUs were looked at the vga_is_firmware_default() function actually >> does do a good job at recognizing the case from the device used for the >> firmware framebuffer. >> >> Modify the VGA arbiter code and matching sysfs file entries to examine all >> PCI display class devices. The existing logic stays the same. >> >> This will cause all GPUs to gain a `boot_vga` file, but the correct device >> (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. >> Userspace then picks the right device as well. >> >> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] >> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> drivers/pci/pci-sysfs.c | 2 +- >> drivers/pci/vgaarb.c | 8 ++++---- >> 2 files changed, 5 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >> index 268c69daa4d57..c314ee1b3f9ac 100644 >> --- a/drivers/pci/pci-sysfs.c >> +++ b/drivers/pci/pci-sysfs.c >> @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, >> struct device *dev = kobj_to_dev(kobj); >> struct pci_dev *pdev = to_pci_dev(dev); >> >> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >> return a->mode; >> >> return 0; >> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c >> index 78748e8d2dbae..63216e5787d73 100644 >> --- a/drivers/pci/vgaarb.c >> +++ b/drivers/pci/vgaarb.c >> @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action, >> >> vgaarb_dbg(dev, "%s\n", __func__); >> >> - /* Only deal with VGA class devices */ >> - if (!pci_is_vga(pdev)) >> + /* Only deal with PCI display class devices */ >> + if (!pci_is_display(pdev)) >> return 0; >> >> /* >> @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) >> >> bus_register_notifier(&pci_bus_type, &pci_notifier); >> >> - /* Add all VGA class PCI devices by default */ >> + /* Add all PCI display class devices by default */ >> pdev = NULL; >> while ((pdev = >> pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, >> PCI_ANY_ID, pdev)) != NULL) { >> - if (pci_is_vga(pdev)) >> + if (pci_is_display(pdev)) >> vga_arbiter_add_pci_device(pdev); >> } >> > > At the very least a non-VGA device should not mark that it decodes > legacy resources, marking the boot VGA device is only a part of what > the VGA arbiter does. It seems none of the actual VGA arbitration > interfaces have been considered here though. > > I still think this is a bad idea and I'm not sure Thomas didn't > withdraw his ack in the previous round[1]. Thanks, Ah; I didn't realize that was intended to be a withdrawl. If there's another version of this I'll remove it. Dave, What is your current temperature on this approach? Do you still think it's best for something in the kernel or is this better done in libpciaccess? Mutter, Kwin, and Cosmic all handle this case in the compositor. > > Alex > > [1]https://lore.kernel.org/all/bc0a3ac2-c86c-43b8-b83f-edfdfa5ee184@suse.de/ >
Hi Am 17.06.25 um 22:22 schrieb Mario Limonciello: > > > On 6/17/25 2:22 PM, Alex Williamson wrote: >> On Tue, 17 Jun 2025 12:59:10 -0500 >> Mario Limonciello <superm1@kernel.org> wrote: >> >>> From: Mario Limonciello <mario.limonciello@amd.com> >>> >>> On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the >>> AMD GPU is not being selected by some desktop environments for any >>> rendering tasks. This is because neither GPU is being treated as >>> "boot_vga" but that is what some environments use to select a GPU [1]. >>> >>> The VGA arbiter driver only looks at devices that report as PCI display >>> VGA class devices. Neither GPU on the system is a PCI display VGA class >>> device: >>> >>> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >>> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] >>> Device 150e (rev d1) >>> >>> If the GPUs were looked at the vga_is_firmware_default() function >>> actually >>> does do a good job at recognizing the case from the device used for the >>> firmware framebuffer. >>> >>> Modify the VGA arbiter code and matching sysfs file entries to >>> examine all >>> PCI display class devices. The existing logic stays the same. >>> >>> This will cause all GPUs to gain a `boot_vga` file, but the correct >>> device >>> (AMD GPU in this case) will now show `1` and the incorrect device >>> shows `0`. >>> Userspace then picks the right device as well. >>> >>> Link: >>> https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 >>> [1] >>> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>> --- >>> drivers/pci/pci-sysfs.c | 2 +- >>> drivers/pci/vgaarb.c | 8 ++++---- >>> 2 files changed, 5 insertions(+), 5 deletions(-) >>> >>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >>> index 268c69daa4d57..c314ee1b3f9ac 100644 >>> --- a/drivers/pci/pci-sysfs.c >>> +++ b/drivers/pci/pci-sysfs.c >>> @@ -1707,7 +1707,7 @@ static umode_t >>> pci_dev_attrs_are_visible(struct kobject *kobj, >>> struct device *dev = kobj_to_dev(kobj); >>> struct pci_dev *pdev = to_pci_dev(dev); >>> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >>> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >>> return a->mode; >>> return 0; >>> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c >>> index 78748e8d2dbae..63216e5787d73 100644 >>> --- a/drivers/pci/vgaarb.c >>> +++ b/drivers/pci/vgaarb.c >>> @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block >>> *nb, unsigned long action, >>> vgaarb_dbg(dev, "%s\n", __func__); >>> - /* Only deal with VGA class devices */ >>> - if (!pci_is_vga(pdev)) >>> + /* Only deal with PCI display class devices */ >>> + if (!pci_is_display(pdev)) >>> return 0; >>> /* >>> @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) >>> bus_register_notifier(&pci_bus_type, &pci_notifier); >>> - /* Add all VGA class PCI devices by default */ >>> + /* Add all PCI display class devices by default */ >>> pdev = NULL; >>> while ((pdev = >>> pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, >>> PCI_ANY_ID, pdev)) != NULL) { >>> - if (pci_is_vga(pdev)) >>> + if (pci_is_display(pdev)) >>> vga_arbiter_add_pci_device(pdev); >>> } >> >> At the very least a non-VGA device should not mark that it decodes >> legacy resources, marking the boot VGA device is only a part of what >> the VGA arbiter does. It seems none of the actual VGA arbitration >> interfaces have been considered here though. >> >> I still think this is a bad idea and I'm not sure Thomas didn't >> withdraw his ack in the previous round[1]. Thanks, > > Ah; I didn't realize that was intended to be a withdrawl. > If there's another version of this I'll remove it. Then let me formally withdraw the A-b. I think this updated patch doesn't address the concerns raised in the previous reviews. AFAIU vgaarb is really only about VGA devices. Best regards Thomas > > Dave, > > What is your current temperature on this approach? > > Do you still think it's best for something in the kernel or is this > better done in libpciaccess? > > Mutter, Kwin, and Cosmic all handle this case in the compositor. > > >> >> Alex >> >> [1]https://lore.kernel.org/all/bc0a3ac2-c86c-43b8-b83f-edfdfa5ee184@suse.de/ >> >> > -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
On Wed, Jun 18, 2025 at 11:11:26AM +0200, Thomas Zimmermann wrote: > Hi > > Am 17.06.25 um 22:22 schrieb Mario Limonciello: > > > > > > On 6/17/25 2:22 PM, Alex Williamson wrote: > > > On Tue, 17 Jun 2025 12:59:10 -0500 > > > Mario Limonciello <superm1@kernel.org> wrote: > > > > > > > From: Mario Limonciello <mario.limonciello@amd.com> > > > > > > > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > > > > AMD GPU is not being selected by some desktop environments for any > > > > rendering tasks. This is because neither GPU is being treated as > > > > "boot_vga" but that is what some environments use to select a GPU [1]. > > > > > > > > The VGA arbiter driver only looks at devices that report as PCI display > > > > VGA class devices. Neither GPU on the system is a PCI display VGA class > > > > device: > > > > > > > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > > > > c6:00.0 Display controller: Advanced Micro Devices, Inc. > > > > [AMD/ATI] Device 150e (rev d1) > > > > > > > > If the GPUs were looked at the vga_is_firmware_default() > > > > function actually > > > > does do a good job at recognizing the case from the device used for the > > > > firmware framebuffer. > > > > > > > > Modify the VGA arbiter code and matching sysfs file entries to > > > > examine all > > > > PCI display class devices. The existing logic stays the same. > > > > > > > > This will cause all GPUs to gain a `boot_vga` file, but the > > > > correct device > > > > (AMD GPU in this case) will now show `1` and the incorrect > > > > device shows `0`. > > > > Userspace then picks the right device as well. > > > > > > > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 > > > > [1] > > > > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > > > --- > > > > drivers/pci/pci-sysfs.c | 2 +- > > > > drivers/pci/vgaarb.c | 8 ++++---- > > > > 2 files changed, 5 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > > index 268c69daa4d57..c314ee1b3f9ac 100644 > > > > --- a/drivers/pci/pci-sysfs.c > > > > +++ b/drivers/pci/pci-sysfs.c > > > > @@ -1707,7 +1707,7 @@ static umode_t > > > > pci_dev_attrs_are_visible(struct kobject *kobj, > > > > struct device *dev = kobj_to_dev(kobj); > > > > struct pci_dev *pdev = to_pci_dev(dev); > > > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > > > > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > > > > return a->mode; > > > > return 0; > > > > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c > > > > index 78748e8d2dbae..63216e5787d73 100644 > > > > --- a/drivers/pci/vgaarb.c > > > > +++ b/drivers/pci/vgaarb.c > > > > @@ -1499,8 +1499,8 @@ static int pci_notify(struct > > > > notifier_block *nb, unsigned long action, > > > > vgaarb_dbg(dev, "%s\n", __func__); > > > > - /* Only deal with VGA class devices */ > > > > - if (!pci_is_vga(pdev)) > > > > + /* Only deal with PCI display class devices */ > > > > + if (!pci_is_display(pdev)) > > > > return 0; > > > > /* > > > > @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) > > > > bus_register_notifier(&pci_bus_type, &pci_notifier); > > > > - /* Add all VGA class PCI devices by default */ > > > > + /* Add all PCI display class devices by default */ > > > > pdev = NULL; > > > > while ((pdev = > > > > pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, > > > > PCI_ANY_ID, pdev)) != NULL) { > > > > - if (pci_is_vga(pdev)) > > > > + if (pci_is_display(pdev)) > > > > vga_arbiter_add_pci_device(pdev); > > > > } > > > > > > At the very least a non-VGA device should not mark that it decodes > > > legacy resources, marking the boot VGA device is only a part of what > > > the VGA arbiter does. It seems none of the actual VGA arbitration > > > interfaces have been considered here though. > > > > > > I still think this is a bad idea and I'm not sure Thomas didn't > > > withdraw his ack in the previous round[1]. Thanks, > > > > Ah; I didn't realize that was intended to be a withdrawl. > > If there's another version of this I'll remove it. > > Then let me formally withdraw the A-b. > > I think this updated patch doesn't address the concerns raised in the > previous reviews. AFAIU vgaarb is really only about VGA devices. I missed the earlier version, but wanted to chime in that I concur. vgaarb is about vga decoding, and modern gpu drivers are trying pretty hard to disable that since it can cause pain. If we mix in the meaning of "default display device" into this, we have a mess. I guess what does make sense is if the kernel exposes its notion of "default display device", since we do have that in some sense with simpledrm. At least on systems where simpledrm is a thing, but I think you need some really old machines for that to not be the case. Cheers, Sima > Best regards > Thomas > > > > > Dave, > > > > What is your current temperature on this approach? > > > > Do you still think it's best for something in the kernel or is this > > better done in libpciaccess? > > > > Mutter, Kwin, and Cosmic all handle this case in the compositor. > > > > > > > > > > Alex > > > > > > [1]https://lore.kernel.org/all/bc0a3ac2-c86c-43b8-b83f-edfdfa5ee184@suse.de/ > > > > > > > > > > -- > -- > Thomas Zimmermann > Graphics Driver Developer > SUSE Software Solutions Germany GmbH > Frankenstrasse 146, 90461 Nuernberg, Germany > GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman > HRB 36809 (AG Nuernberg) > -- Simona Vetter Software Engineer, Intel Corporation http://blog.ffwll.ch
On 6/18/2025 9:12 AM, Simona Vetter wrote: > On Wed, Jun 18, 2025 at 11:11:26AM +0200, Thomas Zimmermann wrote: >> Hi >> >> Am 17.06.25 um 22:22 schrieb Mario Limonciello: >>> >>> >>> On 6/17/25 2:22 PM, Alex Williamson wrote: >>>> On Tue, 17 Jun 2025 12:59:10 -0500 >>>> Mario Limonciello <superm1@kernel.org> wrote: >>>> >>>>> From: Mario Limonciello <mario.limonciello@amd.com> >>>>> >>>>> On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the >>>>> AMD GPU is not being selected by some desktop environments for any >>>>> rendering tasks. This is because neither GPU is being treated as >>>>> "boot_vga" but that is what some environments use to select a GPU [1]. >>>>> >>>>> The VGA arbiter driver only looks at devices that report as PCI display >>>>> VGA class devices. Neither GPU on the system is a PCI display VGA class >>>>> device: >>>>> >>>>> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >>>>> c6:00.0 Display controller: Advanced Micro Devices, Inc. >>>>> [AMD/ATI] Device 150e (rev d1) >>>>> >>>>> If the GPUs were looked at the vga_is_firmware_default() >>>>> function actually >>>>> does do a good job at recognizing the case from the device used for the >>>>> firmware framebuffer. >>>>> >>>>> Modify the VGA arbiter code and matching sysfs file entries to >>>>> examine all >>>>> PCI display class devices. The existing logic stays the same. >>>>> >>>>> This will cause all GPUs to gain a `boot_vga` file, but the >>>>> correct device >>>>> (AMD GPU in this case) will now show `1` and the incorrect >>>>> device shows `0`. >>>>> Userspace then picks the right device as well. >>>>> >>>>> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 >>>>> [1] >>>>> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >>>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>>>> --- >>>>> drivers/pci/pci-sysfs.c | 2 +- >>>>> drivers/pci/vgaarb.c | 8 ++++---- >>>>> 2 files changed, 5 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >>>>> index 268c69daa4d57..c314ee1b3f9ac 100644 >>>>> --- a/drivers/pci/pci-sysfs.c >>>>> +++ b/drivers/pci/pci-sysfs.c >>>>> @@ -1707,7 +1707,7 @@ static umode_t >>>>> pci_dev_attrs_are_visible(struct kobject *kobj, >>>>> struct device *dev = kobj_to_dev(kobj); >>>>> struct pci_dev *pdev = to_pci_dev(dev); >>>>> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >>>>> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >>>>> return a->mode; >>>>> return 0; >>>>> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c >>>>> index 78748e8d2dbae..63216e5787d73 100644 >>>>> --- a/drivers/pci/vgaarb.c >>>>> +++ b/drivers/pci/vgaarb.c >>>>> @@ -1499,8 +1499,8 @@ static int pci_notify(struct >>>>> notifier_block *nb, unsigned long action, >>>>> vgaarb_dbg(dev, "%s\n", __func__); >>>>> - /* Only deal with VGA class devices */ >>>>> - if (!pci_is_vga(pdev)) >>>>> + /* Only deal with PCI display class devices */ >>>>> + if (!pci_is_display(pdev)) >>>>> return 0; >>>>> /* >>>>> @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) >>>>> bus_register_notifier(&pci_bus_type, &pci_notifier); >>>>> - /* Add all VGA class PCI devices by default */ >>>>> + /* Add all PCI display class devices by default */ >>>>> pdev = NULL; >>>>> while ((pdev = >>>>> pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, >>>>> PCI_ANY_ID, pdev)) != NULL) { >>>>> - if (pci_is_vga(pdev)) >>>>> + if (pci_is_display(pdev)) >>>>> vga_arbiter_add_pci_device(pdev); >>>>> } >>>> >>>> At the very least a non-VGA device should not mark that it decodes >>>> legacy resources, marking the boot VGA device is only a part of what >>>> the VGA arbiter does. It seems none of the actual VGA arbitration >>>> interfaces have been considered here though. >>>> >>>> I still think this is a bad idea and I'm not sure Thomas didn't >>>> withdraw his ack in the previous round[1]. Thanks, >>> >>> Ah; I didn't realize that was intended to be a withdrawl. >>> If there's another version of this I'll remove it. >> >> Then let me formally withdraw the A-b. >> >> I think this updated patch doesn't address the concerns raised in the >> previous reviews. AFAIU vgaarb is really only about VGA devices. > > I missed the earlier version, but wanted to chime in that I concur. vgaarb > is about vga decoding, and modern gpu drivers are trying pretty hard to > disable that since it can cause pain. If we mix in the meaning of "default > display device" into this, we have a mess. > > I guess what does make sense is if the kernel exposes its notion of > "default display device", since we do have that in some sense with > simpledrm. At least on systems where simpledrm is a thing, but I think you > need some really old machines for that to not be the case. > > Cheers, Sima Thanks guys. Let's discard patch 6. Here's a spin of an approach for userspace that does something similar to what the compositors are doing. We can iterate on that. https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/merge_requests/38 I think patches 1-5 still are valuable though. So please add reviews to those and we can take those without patch 6 if there is agreement.
Hi Am 18.06.25 um 20:45 schrieb Mario Limonciello: > On 6/18/2025 9:12 AM, Simona Vetter wrote: >> On Wed, Jun 18, 2025 at 11:11:26AM +0200, Thomas Zimmermann wrote: >>> Hi >>> >>> Am 17.06.25 um 22:22 schrieb Mario Limonciello: >>>> >>>> >>>> On 6/17/25 2:22 PM, Alex Williamson wrote: >>>>> On Tue, 17 Jun 2025 12:59:10 -0500 >>>>> Mario Limonciello <superm1@kernel.org> wrote: >>>>> >>>>>> From: Mario Limonciello <mario.limonciello@amd.com> >>>>>> >>>>>> On a mobile system with an AMD integrated GPU + NVIDIA discrete >>>>>> GPU the >>>>>> AMD GPU is not being selected by some desktop environments for any >>>>>> rendering tasks. This is because neither GPU is being treated as >>>>>> "boot_vga" but that is what some environments use to select a GPU >>>>>> [1]. >>>>>> >>>>>> The VGA arbiter driver only looks at devices that report as PCI >>>>>> display >>>>>> VGA class devices. Neither GPU on the system is a PCI display VGA >>>>>> class >>>>>> device: >>>>>> >>>>>> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >>>>>> c6:00.0 Display controller: Advanced Micro Devices, Inc. >>>>>> [AMD/ATI] Device 150e (rev d1) >>>>>> >>>>>> If the GPUs were looked at the vga_is_firmware_default() >>>>>> function actually >>>>>> does do a good job at recognizing the case from the device used >>>>>> for the >>>>>> firmware framebuffer. >>>>>> >>>>>> Modify the VGA arbiter code and matching sysfs file entries to >>>>>> examine all >>>>>> PCI display class devices. The existing logic stays the same. >>>>>> >>>>>> This will cause all GPUs to gain a `boot_vga` file, but the >>>>>> correct device >>>>>> (AMD GPU in this case) will now show `1` and the incorrect >>>>>> device shows `0`. >>>>>> Userspace then picks the right device as well. >>>>>> >>>>>> Link: >>>>>> https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 >>>>>> [1] >>>>>> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >>>>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>>>>> --- >>>>>> drivers/pci/pci-sysfs.c | 2 +- >>>>>> drivers/pci/vgaarb.c | 8 ++++---- >>>>>> 2 files changed, 5 insertions(+), 5 deletions(-) >>>>>> >>>>>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >>>>>> index 268c69daa4d57..c314ee1b3f9ac 100644 >>>>>> --- a/drivers/pci/pci-sysfs.c >>>>>> +++ b/drivers/pci/pci-sysfs.c >>>>>> @@ -1707,7 +1707,7 @@ static umode_t >>>>>> pci_dev_attrs_are_visible(struct kobject *kobj, >>>>>> struct device *dev = kobj_to_dev(kobj); >>>>>> struct pci_dev *pdev = to_pci_dev(dev); >>>>>> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >>>>>> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >>>>>> return a->mode; >>>>>> return 0; >>>>>> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c >>>>>> index 78748e8d2dbae..63216e5787d73 100644 >>>>>> --- a/drivers/pci/vgaarb.c >>>>>> +++ b/drivers/pci/vgaarb.c >>>>>> @@ -1499,8 +1499,8 @@ static int pci_notify(struct >>>>>> notifier_block *nb, unsigned long action, >>>>>> vgaarb_dbg(dev, "%s\n", __func__); >>>>>> - /* Only deal with VGA class devices */ >>>>>> - if (!pci_is_vga(pdev)) >>>>>> + /* Only deal with PCI display class devices */ >>>>>> + if (!pci_is_display(pdev)) >>>>>> return 0; >>>>>> /* >>>>>> @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) >>>>>> bus_register_notifier(&pci_bus_type, &pci_notifier); >>>>>> - /* Add all VGA class PCI devices by default */ >>>>>> + /* Add all PCI display class devices by default */ >>>>>> pdev = NULL; >>>>>> while ((pdev = >>>>>> pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, >>>>>> PCI_ANY_ID, pdev)) != NULL) { >>>>>> - if (pci_is_vga(pdev)) >>>>>> + if (pci_is_display(pdev)) >>>>>> vga_arbiter_add_pci_device(pdev); >>>>>> } >>>>> >>>>> At the very least a non-VGA device should not mark that it decodes >>>>> legacy resources, marking the boot VGA device is only a part of what >>>>> the VGA arbiter does. It seems none of the actual VGA arbitration >>>>> interfaces have been considered here though. >>>>> >>>>> I still think this is a bad idea and I'm not sure Thomas didn't >>>>> withdraw his ack in the previous round[1]. Thanks, >>>> >>>> Ah; I didn't realize that was intended to be a withdrawl. >>>> If there's another version of this I'll remove it. >>> >>> Then let me formally withdraw the A-b. >>> >>> I think this updated patch doesn't address the concerns raised in the >>> previous reviews. AFAIU vgaarb is really only about VGA devices. >> >> I missed the earlier version, but wanted to chime in that I concur. >> vgaarb >> is about vga decoding, and modern gpu drivers are trying pretty hard to >> disable that since it can cause pain. If we mix in the meaning of >> "default >> display device" into this, we have a mess. >> >> I guess what does make sense is if the kernel exposes its notion of >> "default display device", since we do have that in some sense with >> simpledrm. At least on systems where simpledrm is a thing, but I >> think you >> need some really old machines for that to not be the case. >> >> Cheers, Sima > > Thanks guys. Let's discard patch 6. Here's a spin of an approach for > userspace that does something similar to what the compositors are doing. > We can iterate on that. > > https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/merge_requests/38 Did you look at my suggestion to extend the kernel's video_is_primary_device()? This would also benefit fbcon. It is architecture specific and currently uses vgaarb on x86. I think it could be extended with the current patch's logic. Best regards Thomas > > I think patches 1-5 still are valuable though. So please add reviews > to those and we can take those without patch 6 if there is agreement. -- -- Thomas Zimmermann Graphics Driver Developer SUSE Software Solutions Germany GmbH Frankenstrasse 146, 90461 Nuernberg, Germany GF: Ivo Totev, Andrew Myers, Andrew McDonald, Boudien Moerman HRB 36809 (AG Nuernberg)
On Wed, Jun 18, 2025 at 01:45:40PM -0500, Mario Limonciello wrote: > On 6/18/2025 9:12 AM, Simona Vetter wrote: > > On Wed, Jun 18, 2025 at 11:11:26AM +0200, Thomas Zimmermann wrote: > > > Hi > > > > > > Am 17.06.25 um 22:22 schrieb Mario Limonciello: > > > > > > > > > > > > On 6/17/25 2:22 PM, Alex Williamson wrote: > > > > > On Tue, 17 Jun 2025 12:59:10 -0500 > > > > > Mario Limonciello <superm1@kernel.org> wrote: > > > > > > > > > > > From: Mario Limonciello <mario.limonciello@amd.com> > > > > > > > > > > > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > > > > > > AMD GPU is not being selected by some desktop environments for any > > > > > > rendering tasks. This is because neither GPU is being treated as > > > > > > "boot_vga" but that is what some environments use to select a GPU [1]. > > > > > > > > > > > > The VGA arbiter driver only looks at devices that report as PCI display > > > > > > VGA class devices. Neither GPU on the system is a PCI display VGA class > > > > > > device: > > > > > > > > > > > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > > > > > > c6:00.0 Display controller: Advanced Micro Devices, Inc. > > > > > > [AMD/ATI] Device 150e (rev d1) > > > > > > > > > > > > If the GPUs were looked at the vga_is_firmware_default() > > > > > > function actually > > > > > > does do a good job at recognizing the case from the device used for the > > > > > > firmware framebuffer. > > > > > > > > > > > > Modify the VGA arbiter code and matching sysfs file entries to > > > > > > examine all > > > > > > PCI display class devices. The existing logic stays the same. > > > > > > > > > > > > This will cause all GPUs to gain a `boot_vga` file, but the > > > > > > correct device > > > > > > (AMD GPU in this case) will now show `1` and the incorrect > > > > > > device shows `0`. > > > > > > Userspace then picks the right device as well. > > > > > > > > > > > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 > > > > > > [1] > > > > > > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > > > > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > > > > > --- > > > > > > drivers/pci/pci-sysfs.c | 2 +- > > > > > > drivers/pci/vgaarb.c | 8 ++++---- > > > > > > 2 files changed, 5 insertions(+), 5 deletions(-) > > > > > > > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > > > > index 268c69daa4d57..c314ee1b3f9ac 100644 > > > > > > --- a/drivers/pci/pci-sysfs.c > > > > > > +++ b/drivers/pci/pci-sysfs.c > > > > > > @@ -1707,7 +1707,7 @@ static umode_t > > > > > > pci_dev_attrs_are_visible(struct kobject *kobj, > > > > > > struct device *dev = kobj_to_dev(kobj); > > > > > > struct pci_dev *pdev = to_pci_dev(dev); > > > > > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > > > > > > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > > > > > > return a->mode; > > > > > > return 0; > > > > > > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c > > > > > > index 78748e8d2dbae..63216e5787d73 100644 > > > > > > --- a/drivers/pci/vgaarb.c > > > > > > +++ b/drivers/pci/vgaarb.c > > > > > > @@ -1499,8 +1499,8 @@ static int pci_notify(struct > > > > > > notifier_block *nb, unsigned long action, > > > > > > vgaarb_dbg(dev, "%s\n", __func__); > > > > > > - /* Only deal with VGA class devices */ > > > > > > - if (!pci_is_vga(pdev)) > > > > > > + /* Only deal with PCI display class devices */ > > > > > > + if (!pci_is_display(pdev)) > > > > > > return 0; > > > > > > /* > > > > > > @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) > > > > > > bus_register_notifier(&pci_bus_type, &pci_notifier); > > > > > > - /* Add all VGA class PCI devices by default */ > > > > > > + /* Add all PCI display class devices by default */ > > > > > > pdev = NULL; > > > > > > while ((pdev = > > > > > > pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, > > > > > > PCI_ANY_ID, pdev)) != NULL) { > > > > > > - if (pci_is_vga(pdev)) > > > > > > + if (pci_is_display(pdev)) > > > > > > vga_arbiter_add_pci_device(pdev); > > > > > > } > > > > > > > > > > At the very least a non-VGA device should not mark that it decodes > > > > > legacy resources, marking the boot VGA device is only a part of what > > > > > the VGA arbiter does. It seems none of the actual VGA arbitration > > > > > interfaces have been considered here though. > > > > > > > > > > I still think this is a bad idea and I'm not sure Thomas didn't > > > > > withdraw his ack in the previous round[1]. Thanks, > > > > > > > > Ah; I didn't realize that was intended to be a withdrawl. > > > > If there's another version of this I'll remove it. > > > > > > Then let me formally withdraw the A-b. > > > > > > I think this updated patch doesn't address the concerns raised in the > > > previous reviews. AFAIU vgaarb is really only about VGA devices. > > > > I missed the earlier version, but wanted to chime in that I concur. vgaarb > > is about vga decoding, and modern gpu drivers are trying pretty hard to > > disable that since it can cause pain. If we mix in the meaning of "default > > display device" into this, we have a mess. > > > > I guess what does make sense is if the kernel exposes its notion of > > "default display device", since we do have that in some sense with > > simpledrm. At least on systems where simpledrm is a thing, but I think you > > need some really old machines for that to not be the case. > > > > Cheers, Sima > > Thanks guys. Let's discard patch 6. Here's a spin of an approach for > userspace that does something similar to what the compositors are doing. > We can iterate on that. > > https://gitlab.freedesktop.org/xorg/lib/libpciaccess/-/merge_requests/38 > > I think patches 1-5 still are valuable though. So please add reviews to > those and we can take those without patch 6 if there is agreement. Sima already gave her Reviewed-by: for patches 1-5 in the thread for 5. I don't expect my Reviewed-by: is worth much for any of those files, but if you want it, you have mine as well. (In patch 4, the iommu/vt-d one, it is a little weird that it's no longer symmetrical with the other class test macros right next to it, but I still think I prefer using the helper.)
On Tue, Jun 17, 2025 at 12:59:10PM -0500, Mario Limonciello wrote: > From: Mario Limonciello <mario.limonciello@amd.com> > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > AMD GPU is not being selected by some desktop environments for any > rendering tasks. This is because neither GPU is being treated as > "boot_vga" but that is what some environments use to select a GPU [1]. > > The VGA arbiter driver only looks at devices that report as PCI display > VGA class devices. Neither GPU on the system is a PCI display VGA class > device: > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) > > If the GPUs were looked at the vga_is_firmware_default() function actually > does do a good job at recognizing the case from the device used for the > firmware framebuffer. > > Modify the VGA arbiter code and matching sysfs file entries to examine all > PCI display class devices. The existing logic stays the same. > > This will cause all GPUs to gain a `boot_vga` file, but the correct device > (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. > Userspace then picks the right device as well. > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > --- > drivers/pci/pci-sysfs.c | 2 +- > drivers/pci/vgaarb.c | 8 ++++---- > 2 files changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > index 268c69daa4d57..c314ee1b3f9ac 100644 > --- a/drivers/pci/pci-sysfs.c > +++ b/drivers/pci/pci-sysfs.c > @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, > struct device *dev = kobj_to_dev(kobj); > struct pci_dev *pdev = to_pci_dev(dev); > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > return a->mode; I can't help but worry about userspace clients that might be checking for the presence of the boot_vga sysfs file but don't check its contents. I understand that it's the intention to expose the file for non-VGA display controllers in the case where none of the display controllers are of the VGA subclass, but one of them is the boot console device and should be considered "VGA" for the purposes of the overloaded meaning of "VGA", but if it isn't too much trouble to minimize the change to UAPI here, I'd be more comfortable with only exposing this file for devices that really are VGA and/or the firmware default. Maybe something like making the condition: if (a == &dev_attr_boot_vga.attr) { if (pci_is_vga(pdev) || (pci_is_display(pdev) && vga_default_device() == pdev)) return a->mode; } (maybe we don't even need the pci_is_display() check at that point? I feel more comfortable leaving it in, though) I'd expect that to do something like (assuming two-GPU hybrid system): * Systems with one VGA controller and one 3D controller: * VGA controller gets boot_vga file, contents are "1" * 3D controller does not get boot_vga file * Systems with no VGA controllers and two 3D controllers: * 3D controller driving the console gets boot_vga file: "1" * 3D controller not driving the console does not get boot_vga file * Systems with two VGA controllers and no 3D controllers: * VGA controller driving the console gets boot_vga file: "1" * VGA controller not driving the console gets boot_vga file: "0" i.e., the behavior would only be visibly different in the case with two 3D controllers, like the one targeted by this patch. You and I have seen the two VGA controller case in the wild, so we know it exists. The one 3D and one VGA controller case is what I'd expect to be the common one, and hopefully this will have the same behavior before and after this change regardless of whether a muxed system defaults to dGPU (like hybrid Mac notebooks) or iGPU (like other hybrid systems I'm accustomed to). > > return 0; > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c > index 78748e8d2dbae..63216e5787d73 100644 > --- a/drivers/pci/vgaarb.c > +++ b/drivers/pci/vgaarb.c > @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action, > > vgaarb_dbg(dev, "%s\n", __func__); > > - /* Only deal with VGA class devices */ > - if (!pci_is_vga(pdev)) > + /* Only deal with PCI display class devices */ > + if (!pci_is_display(pdev)) > return 0; > > /* > @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) > > bus_register_notifier(&pci_bus_type, &pci_notifier); > > - /* Add all VGA class PCI devices by default */ > + /* Add all PCI display class devices by default */ > pdev = NULL; > while ((pdev = > pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, > PCI_ANY_ID, pdev)) != NULL) { > - if (pci_is_vga(pdev)) > + if (pci_is_display(pdev)) > vga_arbiter_add_pci_device(pdev); > } > > -- > 2.43.0 >
On 6/17/25 2:20 PM, Daniel Dadap wrote: > On Tue, Jun 17, 2025 at 12:59:10PM -0500, Mario Limonciello wrote: >> From: Mario Limonciello <mario.limonciello@amd.com> >> >> On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the >> AMD GPU is not being selected by some desktop environments for any >> rendering tasks. This is because neither GPU is being treated as >> "boot_vga" but that is what some environments use to select a GPU [1]. >> >> The VGA arbiter driver only looks at devices that report as PCI display >> VGA class devices. Neither GPU on the system is a PCI display VGA class >> device: >> >> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) >> >> If the GPUs were looked at the vga_is_firmware_default() function actually >> does do a good job at recognizing the case from the device used for the >> firmware framebuffer. >> >> Modify the VGA arbiter code and matching sysfs file entries to examine all >> PCI display class devices. The existing logic stays the same. >> >> This will cause all GPUs to gain a `boot_vga` file, but the correct device >> (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. >> Userspace then picks the right device as well. >> >> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] >> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >> --- >> drivers/pci/pci-sysfs.c | 2 +- >> drivers/pci/vgaarb.c | 8 ++++---- >> 2 files changed, 5 insertions(+), 5 deletions(-) >> >> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >> index 268c69daa4d57..c314ee1b3f9ac 100644 >> --- a/drivers/pci/pci-sysfs.c >> +++ b/drivers/pci/pci-sysfs.c >> @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, >> struct device *dev = kobj_to_dev(kobj); >> struct pci_dev *pdev = to_pci_dev(dev); >> >> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >> return a->mode; > > I can't help but worry about userspace clients that might be checking for > the presence of the boot_vga sysfs file but don't check its contents. Wouldn't those clients "already" be broken by such an assumption? We know today that there are systems with two VGA devices in them too. I'd think those should have both GPUs exporting a file and one having a 0 the other 1. > I > understand that it's the intention to expose the file for non-VGA display > controllers in the case where none of the display controllers are of the > VGA subclass, but one of them is the boot console device and should be > considered "VGA" for the purposes of the overloaded meaning of "VGA", but > if it isn't too much trouble to minimize the change to UAPI here, I'd be > more comfortable with only exposing this file for devices that really are > VGA and/or the firmware default. > > Maybe something like making the condition: > > if (a == &dev_attr_boot_vga.attr) { > if (pci_is_vga(pdev) || > (pci_is_display(pdev) && vga_default_device() == pdev)) > return a->mode; > } > > (maybe we don't even need the pci_is_display() check at that point? I > feel more comfortable leaving it in, though) I suppose it depends upon call order whether the above works or not. I'm not sure 'off hand' right now. > > I'd expect that to do something like (assuming two-GPU hybrid system): > > * Systems with one VGA controller and one 3D controller: > * VGA controller gets boot_vga file, contents are "1" > * 3D controller does not get boot_vga file > * Systems with no VGA controllers and two 3D controllers: > * 3D controller driving the console gets boot_vga file: "1" > * 3D controller not driving the console does not get boot_vga file > * Systems with two VGA controllers and no 3D controllers: > * VGA controller driving the console gets boot_vga file: "1" > * VGA controller not driving the console gets boot_vga file: "0" > > i.e., the behavior would only be visibly different in the case with two > 3D controllers, like the one targeted by this patch. You and I have seen > the two VGA controller case in the wild, so we know it exists. Yeah I wish we had some more data from that reporter right now to potentially support a proposal that would help their system too. This patch as it is today will only help case 1 and 2. > The one 3D > and one VGA controller case is what I'd expect to be the common one, and > hopefully this will have the same behavior before and after this change > regardless of whether a muxed system defaults to dGPU (like hybrid Mac > notebooks) or iGPU (like other hybrid systems I'm accustomed to). > >> >> return 0; >> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c >> index 78748e8d2dbae..63216e5787d73 100644 >> --- a/drivers/pci/vgaarb.c >> +++ b/drivers/pci/vgaarb.c >> @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action, >> >> vgaarb_dbg(dev, "%s\n", __func__); >> >> - /* Only deal with VGA class devices */ >> - if (!pci_is_vga(pdev)) >> + /* Only deal with PCI display class devices */ >> + if (!pci_is_display(pdev)) >> return 0; >> >> /* >> @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) >> >> bus_register_notifier(&pci_bus_type, &pci_notifier); >> >> - /* Add all VGA class PCI devices by default */ >> + /* Add all PCI display class devices by default */ >> pdev = NULL; >> while ((pdev = >> pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, >> PCI_ANY_ID, pdev)) != NULL) { >> - if (pci_is_vga(pdev)) >> + if (pci_is_display(pdev)) >> vga_arbiter_add_pci_device(pdev); >> } >> >> -- >> 2.43.0 >>
On Tue, Jun 17, 2025 at 03:15:35PM -0500, Mario Limonciello wrote: > > > On 6/17/25 2:20 PM, Daniel Dadap wrote: > > On Tue, Jun 17, 2025 at 12:59:10PM -0500, Mario Limonciello wrote: > > > From: Mario Limonciello <mario.limonciello@amd.com> > > > > > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > > > AMD GPU is not being selected by some desktop environments for any > > > rendering tasks. This is because neither GPU is being treated as > > > "boot_vga" but that is what some environments use to select a GPU [1]. > > > > > > The VGA arbiter driver only looks at devices that report as PCI display > > > VGA class devices. Neither GPU on the system is a PCI display VGA class > > > device: > > > > > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > > > c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) > > > > > > If the GPUs were looked at the vga_is_firmware_default() function actually > > > does do a good job at recognizing the case from the device used for the > > > firmware framebuffer. > > > > > > Modify the VGA arbiter code and matching sysfs file entries to examine all > > > PCI display class devices. The existing logic stays the same. > > > > > > This will cause all GPUs to gain a `boot_vga` file, but the correct device > > > (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. > > > Userspace then picks the right device as well. > > > > > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] > > > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > > --- > > > drivers/pci/pci-sysfs.c | 2 +- > > > drivers/pci/vgaarb.c | 8 ++++---- > > > 2 files changed, 5 insertions(+), 5 deletions(-) > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > index 268c69daa4d57..c314ee1b3f9ac 100644 > > > --- a/drivers/pci/pci-sysfs.c > > > +++ b/drivers/pci/pci-sysfs.c > > > @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, > > > struct device *dev = kobj_to_dev(kobj); > > > struct pci_dev *pdev = to_pci_dev(dev); > > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > > > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > > > return a->mode; > > > > I can't help but worry about userspace clients that might be checking for > > the presence of the boot_vga sysfs file but don't check its contents. > > Wouldn't those clients "already" be broken by such an assumption? > We know today that there are systems with two VGA devices in them too. > Yes, for systems with multiple VGA devices, which is an uncommon case. I think that on systems with one VGA device and one 3D device, which is probably the most common case, this change might break such clients. > I'd think those should have both GPUs exporting a file and one having a 0 > the other 1. Yeah, agreed. I'd consider it a userspace bug if the client only tests for the presence of the file but doesn't look at its contents, but it's still preferable not to break (hypothetical, buggy) clients unnecessarily. One could make a philosophical argument that "boot_vga" should really mean VGA subclass, as the name implies, but even so I think that, in lieu of a new interface to report what the desktop environments are actually trying to test for (which nobody uses yet because it doesn't exist), exposing the boot_vga file for a non-VGA GPU in the special case of there being zero VGA GPUs on the system is a reasonable and practical compromise to allow existing code to work on the zero-VGA systems. I think it ultimately comes down to a semantic argument about what "VGA" is really supposed to mean here. There's the real, honest-to-goodness VGA interface with INT 10h and VBE, and then there's the common de facto sort of shorthand convention (commonly but not universally followed) where VGA means it can drive displays and 3D means it can't. It used to be the case (at least on x86) that display controllers which could drive real display hardware were always VGA-compatible, and display controllers were not VGA compatible could never drive real display hardware, which I think is how that convention originated, but on UEFI systems with no CSM support, it's not necessarily true any more. However, there's so much existing software out there that conflates VGA-ness with display-ness that some controllers with no actual VGA support get listed with the VGA controller subclass to avoid breaking such software. If you go by the language of the definitions for the subclasses of PCI base class 03h, it seems pretty clear that the VGA subclass is supposed to mean actually compatible with real honest-to-goodness VGA. So those non-VGA devices that pretend to be VGA for software compatibility aren't following the spec. I'd be willing to wager that the system in question is being accurate when it says that it has no VGA controllers. It is arguably a userspace bug that these desktop environments are testing for "VGA" when they really probably mean something else, but it will probably take some time to hunt down everything that's relying on boot_vga for possibly wrong reasons, and I think the pragmatic option is to lie about it until we have a better way to test for whatever the desktops really want to know, and that better way is widely used. But it would be nice to limit the lying to cases where it unbreaks things if we can. > > > I > > understand that it's the intention to expose the file for non-VGA display > > controllers in the case where none of the display controllers are of the > > VGA subclass, but one of them is the boot console device and should be > > considered "VGA" for the purposes of the overloaded meaning of "VGA", but > > if it isn't too much trouble to minimize the change to UAPI here, I'd be > > more comfortable with only exposing this file for devices that really are > > VGA and/or the firmware default. > > > > Maybe something like making the condition: > > > > if (a == &dev_attr_boot_vga.attr) { > > if (pci_is_vga(pdev) || > > (pci_is_display(pdev) && vga_default_device() == pdev)) > > return a->mode; > > } > > > > (maybe we don't even need the pci_is_display() check at that point? I > > feel more comfortable leaving it in, though) > > I suppose it depends upon call order whether the above works or not. > > I'm not sure 'off hand' right now. > > > > I'd expect that to do something like (assuming two-GPU hybrid system): > > > > * Systems with one VGA controller and one 3D controller: > > * VGA controller gets boot_vga file, contents are "1" > > * 3D controller does not get boot_vga file > > * Systems with no VGA controllers and two 3D controllers: > > * 3D controller driving the console gets boot_vga file: "1" > > * 3D controller not driving the console does not get boot_vga file > > * Systems with two VGA controllers and no 3D controllers: > > * VGA controller driving the console gets boot_vga file: "1" > > * VGA controller not driving the console gets boot_vga file: "0" > > > > i.e., the behavior would only be visibly different in the case with two > > 3D controllers, like the one targeted by this patch. You and I have seen > > the two VGA controller case in the wild, so we know it exists. > > Yeah I wish we had some more data from that reporter right now to > potentially support a proposal that would help their system too. > > This patch as it is today will only help case 1 and 2. I don't think case 1 needs any help: the behavior I describe above is what I expect the existing behavior to be. However if you expose boot_vga files for all display controllers, the behavior for case 1 (which I expect to be the common case) will be different after that change: both controllers get a boot_vga file with different contents, versus only the VGA controller having a boot_vga file previously. > > > The one 3D > > and one VGA controller case is what I'd expect to be the common one, and > > hopefully this will have the same behavior before and after this change > > regardless of whether a muxed system defaults to dGPU (like hybrid Mac > > notebooks) or iGPU (like other hybrid systems I'm accustomed to). > > > > > return 0; > > > diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c > > > index 78748e8d2dbae..63216e5787d73 100644 > > > --- a/drivers/pci/vgaarb.c > > > +++ b/drivers/pci/vgaarb.c > > > @@ -1499,8 +1499,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action, > > > vgaarb_dbg(dev, "%s\n", __func__); > > > - /* Only deal with VGA class devices */ > > > - if (!pci_is_vga(pdev)) > > > + /* Only deal with PCI display class devices */ > > > + if (!pci_is_display(pdev)) > > > return 0; > > > /* > > > @@ -1546,12 +1546,12 @@ static int __init vga_arb_device_init(void) > > > bus_register_notifier(&pci_bus_type, &pci_notifier); > > > - /* Add all VGA class PCI devices by default */ > > > + /* Add all PCI display class devices by default */ > > > pdev = NULL; > > > while ((pdev = > > > pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID, > > > PCI_ANY_ID, pdev)) != NULL) { > > > - if (pci_is_vga(pdev)) > > > + if (pci_is_display(pdev)) > > > vga_arbiter_add_pci_device(pdev); > > > } > > > -- > > > 2.43.0 > > > >
On Tue, 17 Jun 2025 15:56:26 -0500 Daniel Dadap <ddadap@nvidia.com> wrote: > On Tue, Jun 17, 2025 at 03:15:35PM -0500, Mario Limonciello wrote: > > > > > > On 6/17/25 2:20 PM, Daniel Dadap wrote: > > > On Tue, Jun 17, 2025 at 12:59:10PM -0500, Mario Limonciello wrote: > > > > From: Mario Limonciello <mario.limonciello@amd.com> > > > > > > > > On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the > > > > AMD GPU is not being selected by some desktop environments for any > > > > rendering tasks. This is because neither GPU is being treated as > > > > "boot_vga" but that is what some environments use to select a GPU [1]. > > > > > > > > The VGA arbiter driver only looks at devices that report as PCI display > > > > VGA class devices. Neither GPU on the system is a PCI display VGA class > > > > device: > > > > > > > > c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) > > > > c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) > > > > > > > > If the GPUs were looked at the vga_is_firmware_default() function actually > > > > does do a good job at recognizing the case from the device used for the > > > > firmware framebuffer. > > > > > > > > Modify the VGA arbiter code and matching sysfs file entries to examine all > > > > PCI display class devices. The existing logic stays the same. > > > > > > > > This will cause all GPUs to gain a `boot_vga` file, but the correct device > > > > (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. > > > > Userspace then picks the right device as well. > > > > > > > > Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] > > > > Suggested-by: Daniel Dadap <ddadap@nvidia.com> > > > > Acked-by: Thomas Zimmermann <tzimmermann@suse.de> > > > > Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> > > > > --- > > > > drivers/pci/pci-sysfs.c | 2 +- > > > > drivers/pci/vgaarb.c | 8 ++++---- > > > > 2 files changed, 5 insertions(+), 5 deletions(-) > > > > > > > > diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c > > > > index 268c69daa4d57..c314ee1b3f9ac 100644 > > > > --- a/drivers/pci/pci-sysfs.c > > > > +++ b/drivers/pci/pci-sysfs.c > > > > @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, > > > > struct device *dev = kobj_to_dev(kobj); > > > > struct pci_dev *pdev = to_pci_dev(dev); > > > > - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) > > > > + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) > > > > return a->mode; > > > > > > I can't help but worry about userspace clients that might be checking for > > > the presence of the boot_vga sysfs file but don't check its contents. > > > > Wouldn't those clients "already" be broken by such an assumption? > > We know today that there are systems with two VGA devices in them too. > > > > Yes, for systems with multiple VGA devices, which is an uncommon case. I > think that on systems with one VGA device and one 3D device, which is > probably the most common case, this change might break such clients. FWIW, this is exactly the opposite of what I'd expect is the common case. IME, an integrated GPU and discrete GPU, or multiple discrete GPUs are all VGA devices. > > I'd think those should have both GPUs exporting a file and one having a 0 > > the other 1. > > Yeah, agreed. I'd consider it a userspace bug if the client only tests for > the presence of the file but doesn't look at its contents, but it's still > preferable not to break (hypothetical, buggy) clients unnecessarily. One > could make a philosophical argument that "boot_vga" should really mean VGA > subclass, as the name implies, but even so I think that, in lieu of a new > interface to report what the desktop environments are actually trying to > test for (which nobody uses yet because it doesn't exist), exposing the > boot_vga file for a non-VGA GPU in the special case of there being zero > VGA GPUs on the system is a reasonable and practical compromise to allow > existing code to work on the zero-VGA systems. > > I think it ultimately comes down to a semantic argument about what "VGA" > is really supposed to mean here. There's the real, honest-to-goodness VGA > interface with INT 10h and VBE, and then there's the common de facto sort > of shorthand convention (commonly but not universally followed) where VGA > means it can drive displays and 3D means it can't. It used to be the case > (at least on x86) that display controllers which could drive real display > hardware were always VGA-compatible, and display controllers were not VGA > compatible could never drive real display hardware, which I think is how > that convention originated, but on UEFI systems with no CSM support, it's > not necessarily true any more. However, there's so much existing software > out there that conflates VGA-ness with display-ness that some controllers > with no actual VGA support get listed with the VGA controller subclass to > avoid breaking such software. > > If you go by the language of the definitions for the subclasses of PCI > base class 03h, it seems pretty clear that the VGA subclass is supposed > to mean actually compatible with real honest-to-goodness VGA. So those > non-VGA devices that pretend to be VGA for software compatibility aren't > following the spec. I'd be willing to wager that the system in question > is being accurate when it says that it has no VGA controllers. It is > arguably a userspace bug that these desktop environments are testing for > "VGA" when they really probably mean something else, but it will probably > take some time to hunt down everything that's relying on boot_vga for > possibly wrong reasons, and I think the pragmatic option is to lie about > it until we have a better way to test for whatever the desktops really > want to know, and that better way is widely used. But it would be nice to > limit the lying to cases where it unbreaks things if we can. I don't know if you have wiggle room with boot_vga specifically, I generally take it at face value that it's a VGA device and imo seems inconsistent to suggest otherwise. I do note however that there's really no philosophical discussion related to the VGA arbiter, it is managing devices and routing among them according to the strict PCI definition of VGA. Elsewhere in the kernel we can see that vga_default_device() is being used for strictly VGA related things, the VGA shadow ROM and legacy VGA resource aperture resolution for instance. It's unfortunate that the x86 video_is_primary_device() relies on it, but that seems like a growing pain of introducing non-VGA displays on a largely legacy encumbered architecture and should be addressed. Note that it should probably be considered whether VGA_ARB_MAX_GPUS needs a new default value if all display adapters were to be included. Thanks, Alex
> On Jun 17, 2025, at 16:50, Alex Williamson <alex.williamson@redhat.com> wrote: > > On Tue, 17 Jun 2025 15:56:26 -0500 > Daniel Dadap <ddadap@nvidia.com> wrote: > >>> On Tue, Jun 17, 2025 at 03:15:35PM -0500, Mario Limonciello wrote: >>> >>> >>> On 6/17/25 2:20 PM, Daniel Dadap wrote: >>>> On Tue, Jun 17, 2025 at 12:59:10PM -0500, Mario Limonciello wrote: >>>>> From: Mario Limonciello <mario.limonciello@amd.com> >>>>> >>>>> On a mobile system with an AMD integrated GPU + NVIDIA discrete GPU the >>>>> AMD GPU is not being selected by some desktop environments for any >>>>> rendering tasks. This is because neither GPU is being treated as >>>>> "boot_vga" but that is what some environments use to select a GPU [1]. >>>>> >>>>> The VGA arbiter driver only looks at devices that report as PCI display >>>>> VGA class devices. Neither GPU on the system is a PCI display VGA class >>>>> device: >>>>> >>>>> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1) >>>>> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1) >>>>> >>>>> If the GPUs were looked at the vga_is_firmware_default() function actually >>>>> does do a good job at recognizing the case from the device used for the >>>>> firmware framebuffer. >>>>> >>>>> Modify the VGA arbiter code and matching sysfs file entries to examine all >>>>> PCI display class devices. The existing logic stays the same. >>>>> >>>>> This will cause all GPUs to gain a `boot_vga` file, but the correct device >>>>> (AMD GPU in this case) will now show `1` and the incorrect device shows `0`. >>>>> Userspace then picks the right device as well. >>>>> >>>>> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1] >>>>> Suggested-by: Daniel Dadap <ddadap@nvidia.com> >>>>> Acked-by: Thomas Zimmermann <tzimmermann@suse.de> >>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com> >>>>> --- >>>>> drivers/pci/pci-sysfs.c | 2 +- >>>>> drivers/pci/vgaarb.c | 8 ++++---- >>>>> 2 files changed, 5 insertions(+), 5 deletions(-) >>>>> >>>>> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c >>>>> index 268c69daa4d57..c314ee1b3f9ac 100644 >>>>> --- a/drivers/pci/pci-sysfs.c >>>>> +++ b/drivers/pci/pci-sysfs.c >>>>> @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj, >>>>> struct device *dev = kobj_to_dev(kobj); >>>>> struct pci_dev *pdev = to_pci_dev(dev); >>>>> - if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev)) >>>>> + if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev)) >>>>> return a->mode; >>>> >>>> I can't help but worry about userspace clients that might be checking for >>>> the presence of the boot_vga sysfs file but don't check its contents. >>> >>> Wouldn't those clients "already" be broken by such an assumption? >>> We know today that there are systems with two VGA devices in them too. >>> >> >> Yes, for systems with multiple VGA devices, which is an uncommon case. I >> think that on systems with one VGA device and one 3D device, which is >> probably the most common case, this change might break such clients. > > FWIW, this is exactly the opposite of what I'd expect is the common > case. IME, an integrated GPU and discrete GPU, or multiple discrete > GPUs are all VGA devices. > Yeah, I guess I should clarify the context. On a desktop system with an integrated GPU and one or more add-in-board discrete GPUs, I’d expect everything to be a VGA controller. On a hybrid notebook system with one iGPU and one dGPU, I’d expect either both to be VGA controllers, or the boot device to be VGA and the he secondary to be 3D, with the latter being much more common in my observation on newer systems. On a UEFI system with CSM, only the boot device needs to support VGA, so there’s no reason for additional GPUs beyond the boot device to also support it. The system Mario is talking about here, where both iGPU and dGPU are 3D controllers, is most likely a UEFI-only system with no CSM, where it’s not necessary for any of the GPUs to support VGA. I have seen a few notebook systems that ditch the CSM in recent days. Such systems will probably only become more common. I have not yet seen any CSM-less desktop systems, probably because manufacturers want to support people plugging in their old MBR hard drives, but in theory they should be possible and there is very likely already such a system out there that I just haven’t seen. >>> I'd think those should have both GPUs exporting a file and one having a 0 >>> the other 1. >> >> Yeah, agreed. I'd consider it a userspace bug if the client only tests for >> the presence of the file but doesn't look at its contents, but it's still >> preferable not to break (hypothetical, buggy) clients unnecessarily. One >> could make a philosophical argument that "boot_vga" should really mean VGA >> subclass, as the name implies, but even so I think that, in lieu of a new >> interface to report what the desktop environments are actually trying to >> test for (which nobody uses yet because it doesn't exist), exposing the >> boot_vga file for a non-VGA GPU in the special case of there being zero >> VGA GPUs on the system is a reasonable and practical compromise to allow >> existing code to work on the zero-VGA systems. >> >> I think it ultimately comes down to a semantic argument about what "VGA" >> is really supposed to mean here. There's the real, honest-to-goodness VGA >> interface with INT 10h and VBE, and then there's the common de facto sort >> of shorthand convention (commonly but not universally followed) where VGA >> means it can drive displays and 3D means it can't. It used to be the case >> (at least on x86) that display controllers which could drive real display >> hardware were always VGA-compatible, and display controllers were not VGA >> compatible could never drive real display hardware, which I think is how >> that convention originated, but on UEFI systems with no CSM support, it's >> not necessarily true any more. However, there's so much existing software >> out there that conflates VGA-ness with display-ness that some controllers >> with no actual VGA support get listed with the VGA controller subclass to >> avoid breaking such software. >> >> If you go by the language of the definitions for the subclasses of PCI >> base class 03h, it seems pretty clear that the VGA subclass is supposed >> to mean actually compatible with real honest-to-goodness VGA. So those >> non-VGA devices that pretend to be VGA for software compatibility aren't >> following the spec. I'd be willing to wager that the system in question >> is being accurate when it says that it has no VGA controllers. It is >> arguably a userspace bug that these desktop environments are testing for >> "VGA" when they really probably mean something else, but it will probably >> take some time to hunt down everything that's relying on boot_vga for >> possibly wrong reasons, and I think the pragmatic option is to lie about >> it until we have a better way to test for whatever the desktops really >> want to know, and that better way is widely used. But it would be nice to >> limit the lying to cases where it unbreaks things if we can. > > I don't know if you have wiggle room with boot_vga specifically, I > generally take it at face value that it's a VGA device and imo seems > inconsistent to suggest otherwise. I do note however that there's > really no philosophical discussion related to the VGA arbiter, it is > managing devices and routing among them according to the strict PCI > definition of VGA. > > Elsewhere in the kernel we can see that vga_default_device() is being > used for strictly VGA related things, the VGA shadow ROM and legacy VGA > resource aperture resolution for instance. It's unfortunate that the > x86 video_is_primary_device() relies on it, but that seems like a > growing pain of introducing non-VGA displays on a largely legacy > encumbered architecture and should be addressed. > > Note that it should probably be considered whether VGA_ARB_MAX_GPUS > needs a new default value if all display adapters were to be included. > Thanks, > > Alex >
© 2016 - 2025 Red Hat, Inc.