[RFC PATCH] amd64-agp: do not bind to pci driver if probing fails

Ahmed Salem posted 1 patch 3 months, 2 weeks ago
[RFC PATCH] amd64-agp: do not bind to pci driver if probing fails
Posted by Ahmed Salem 3 months, 2 weeks ago
Commit 3be5fa236649 ("Revert "iommu/amd: Prevent binding other PCI
drivers to IOMMU PCI devices"") had an unintended side effect in that
when looking for any AGP bridge, driver_attach() will try to bind to
IOMMU devices without preemptively checking for PCI_CAP_ID_AGP
capability. This happens during agp_amd64_init().

As a result, driver_attach() calls driver_probe_device(), which then
calls really_probe(), raising this critical condition:

 pci 0000:00:00.2: Resources present before probing

With the device being:

 00:00.2 IOMMU: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
	Subsystem: Advanced Micro Devices, Inc. [AMD] Raven/Raven2 IOMMU
	Flags: bus master, fast devsel, latency 0, IRQ 25
	Capabilities: [40] Secure device <?>
	Capabilities: [64] MSI: Enable+ Count=1/4 Maskable- 64bit+
	Capabilities: [74] HyperTransport: MSI Mapping Enable+ Fixed+

As pci_register_driver() calls the device's probing function, the latter
(agp_amd64_probe) tries to find the device's PCI_CAP_ID_AGP capability,
and returns -ENODEV if said capability is not found.

Do not attempt driver_attach() if agp_amd64_pci_driver.probe is non-zero
to avoid probing already present resources.

Link: https://lore.kernel.org/all/aFJOLZ88KIH5WBy2@wunner.de

Signed-off-by: Ahmed Salem <x0rw3ll@gmail.com>
---
I'm not quite sure whether I should have maintained the linked
conversation's Ccs, so please let me know if I should Cc anyone else.

Lukas, kindly let me know whether you want me to add a Suggested-by
trailer as well.


diff --git a/drivers/char/agp/amd64-agp.c b/drivers/char/agp/amd64-agp.c
index bf490967241a..e6a0d09e115a 100644
--- a/drivers/char/agp/amd64-agp.c
+++ b/drivers/char/agp/amd64-agp.c
@@ -768,10 +768,15 @@ int __init agp_amd64_init(void)
 
 		/* Look for any AGP bridge */
 		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
-		err = driver_attach(&agp_amd64_pci_driver.driver);
-		if (err == 0 && agp_bridges_found == 0) {
+		if ((int *)agp_amd64_pci_driver.probe != 0) {
 			pci_unregister_driver(&agp_amd64_pci_driver);
 			err = -ENODEV;
+		} else {
+			err = driver_attach(&agp_amd64_pci_driver.driver);
+			if (err == 0 && agp_bridges_found == 0) {
+				pci_unregister_driver(&agp_amd64_pci_driver);
+				err = -ENODEV;
+			}
 		}
 	}
 	return err;

base-commit: 11313e2f78128c948e9b4eb58b3dacfc30964700
-- 
2.47.2
Re: [RFC PATCH] amd64-agp: do not bind to pci driver if probing fails
Posted by Lukas Wunner 3 months, 2 weeks ago
On Sat, Jun 21, 2025 at 04:55:52AM +0300, Ahmed Salem wrote:
> --- a/drivers/char/agp/amd64-agp.c
> +++ b/drivers/char/agp/amd64-agp.c
> @@ -768,10 +768,15 @@ int __init agp_amd64_init(void)
>  
>  		/* Look for any AGP bridge */
>  		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
> -		err = driver_attach(&agp_amd64_pci_driver.driver);
> -		if (err == 0 && agp_bridges_found == 0) {
> +		if ((int *)agp_amd64_pci_driver.probe != 0) {
>  			pci_unregister_driver(&agp_amd64_pci_driver);
>  			err = -ENODEV;
> +		} else {
> +			err = driver_attach(&agp_amd64_pci_driver.driver);
> +			if (err == 0 && agp_bridges_found == 0) {
> +				pci_unregister_driver(&agp_amd64_pci_driver);
> +				err = -ENODEV;
> +			}

Is the "probe" member in agp_amd64_pci_driver overwritten with a
zero pointer anywhere?  I don't see that it is, so it seems the
else-branch is never entered.

I had already prepared a fix for this, but waited for 0-day to
crunch through it.  I've just submitted it, so that's what I had
in mind:

https://lore.kernel.org/r/f8ff40f35a9a5836d1371f60e85c09c5735e3c5e.1750497201.git.lukas@wunner.de/

Thanks,

Lukas
Re: [RFC PATCH] amd64-agp: do not bind to pci driver if probing fails
Posted by Ahmed Salem 3 months, 2 weeks ago
Hi Lukas,

On 25/06/21 11:46AM, Lukas Wunner wrote:
> On Sat, Jun 21, 2025 at 04:55:52AM +0300, Ahmed Salem wrote:
> > --- a/drivers/char/agp/amd64-agp.c
> > +++ b/drivers/char/agp/amd64-agp.c
> > @@ -768,10 +768,15 @@ int __init agp_amd64_init(void)
> >  
> >  		/* Look for any AGP bridge */
> >  		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
> > -		err = driver_attach(&agp_amd64_pci_driver.driver);
> > -		if (err == 0 && agp_bridges_found == 0) {
> > +		if ((int *)agp_amd64_pci_driver.probe != 0) {
> >  			pci_unregister_driver(&agp_amd64_pci_driver);
> >  			err = -ENODEV;
> > +		} else {
> > +			err = driver_attach(&agp_amd64_pci_driver.driver);
> > +			if (err == 0 && agp_bridges_found == 0) {
> > +				pci_unregister_driver(&agp_amd64_pci_driver);
> > +				err = -ENODEV;
> > +			}
> 
> Is the "probe" member in agp_amd64_pci_driver overwritten with a
> zero pointer anywhere?  I don't see that it is, so it seems the
> else-branch is never entered.
> 

That is a great question. I thought since pci_register_driver calls the
probe function, it would return with or without a zero, saving that
value in the .probe member. (I find my lack of C
experience...disturbing)

> I had already prepared a fix for this, but waited for 0-day to
> crunch through it.  I've just submitted it, so that's what I had
> in mind:
> 
> https://lore.kernel.org/r/f8ff40f35a9a5836d1371f60e85c09c5735e3c5e.1750497201.git.lukas@wunner.de/
> 

That one I've seen even prior to catching this one, and this is
originally what I had in mind based on what commit 6fd024893911
("amd64-agp: Probe unknown AGP devices the right way") removed (i.e.
!pci_find_capability) when you suggested checking for caps beforehand,
but I figured "why make other calls when .probe already does it right
off the bat?" Clearly, I was in the wrong there.

Nonetheless, thank you so much for the review, and Ccing me in the above
patch. Ultimately, what matters to me is getting the right fix in order,
and learning not only what I'd missed here (i.e. Reported-by trailer;
just wasn't quite certain how to approach the situation with SDN in
mind, as well as the original Cced developers), but also fundamental C
knowledge I've been trying to acquire. I appreciate you!

--
Best regards,
Ahmed Salem <x0rw3ll@gmail.com>

> Thanks,
> 
> Lukas
Re: [RFC PATCH] amd64-agp: do not bind to pci driver if probing fails
Posted by Lukas Wunner 3 months, 2 weeks ago
On Sat, Jun 21, 2025 at 07:15:31PM +0300, Ahmed Salem wrote:
> On 25/06/21 11:46AM, Lukas Wunner wrote:
> > On Sat, Jun 21, 2025 at 04:55:52AM +0300, Ahmed Salem wrote:
> > > --- a/drivers/char/agp/amd64-agp.c
> > > +++ b/drivers/char/agp/amd64-agp.c
> > > @@ -768,10 +768,15 @@ int __init agp_amd64_init(void)
> > >  
> > >  		/* Look for any AGP bridge */
> > >  		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
> > > -		err = driver_attach(&agp_amd64_pci_driver.driver);
> > > -		if (err == 0 && agp_bridges_found == 0) {
> > > +		if ((int *)agp_amd64_pci_driver.probe != 0) {
> > >  			pci_unregister_driver(&agp_amd64_pci_driver);
> > >  			err = -ENODEV;
> > > +		} else {
> > > +			err = driver_attach(&agp_amd64_pci_driver.driver);
> > > +			if (err == 0 && agp_bridges_found == 0) {
> > > +				pci_unregister_driver(&agp_amd64_pci_driver);
> > > +				err = -ENODEV;
> > > +			}
> > 
> > Is the "probe" member in agp_amd64_pci_driver overwritten with a
> > zero pointer anywhere?  I don't see that it is, so it seems the
> > else-branch is never entered.
> 
> That is a great question. I thought since pci_register_driver calls the
> probe function, it would return with or without a zero, saving that
> value in the .probe member.

You'd have to add parentheses and parameters, i.e.

  agp_amd64_pci_driver.probe(...)

to invoke the probe hook directly.  However, that wouldn't be an
acceptable approach, one needs to go through the API of the
driver core and not do things behind the driver core's back.


> > I had already prepared a fix for this, but waited for 0-day to
> > crunch through it.  I've just submitted it, so that's what I had
> > in mind:
> > 
> > https://lore.kernel.org/r/f8ff40f35a9a5836d1371f60e85c09c5735e3c5e.1750497201.git.lukas@wunner.de/
> 
> That one I've seen even prior to catching this one, and this is
> originally what I had in mind based on what commit 6fd024893911
> ("amd64-agp: Probe unknown AGP devices the right way") removed (i.e.
> !pci_find_capability) when you suggested checking for caps beforehand,
> but I figured "why make other calls when .probe already does it right
> off the bat?"

Right, it is somewhat silly, but this driver is for 20+ year old hardware
which is likely no longer in heavy use, the driver itself isn't actively
maintained anymore and might be dropped in a few years, so this approach
seemed like the least ugly and most acceptable option.

The real crime was to probe *any* PCI device and even make that the
default.  I think vfio_pci is probably the only other driver that
probes *any* PCI device and it does that only if requested by user
space I believe.  We'd risk regressing users if we changed the
"probe everything by default" behavior, so that's not a good option.

Thanks,

Lukas
Re: [RFC PATCH] amd64-agp: do not bind to pci driver if probing fails
Posted by Ahmed Salem 3 months, 2 weeks ago
On 25/06/21 09:21PM, Lukas Wunner wrote:
> On Sat, Jun 21, 2025 at 07:15:31PM +0300, Ahmed Salem wrote:
> > On 25/06/21 11:46AM, Lukas Wunner wrote:
> > > On Sat, Jun 21, 2025 at 04:55:52AM +0300, Ahmed Salem wrote:
> > > > --- a/drivers/char/agp/amd64-agp.c
> > > > +++ b/drivers/char/agp/amd64-agp.c
> > > > @@ -768,10 +768,15 @@ int __init agp_amd64_init(void)
> > > >  
> > > >  		/* Look for any AGP bridge */
> > > >  		agp_amd64_pci_driver.id_table = agp_amd64_pci_promisc_table;
> > > > -		err = driver_attach(&agp_amd64_pci_driver.driver);
> > > > -		if (err == 0 && agp_bridges_found == 0) {
> > > > +		if ((int *)agp_amd64_pci_driver.probe != 0) {
> > > >  			pci_unregister_driver(&agp_amd64_pci_driver);
> > > >  			err = -ENODEV;
> > > > +		} else {
> > > > +			err = driver_attach(&agp_amd64_pci_driver.driver);
> > > > +			if (err == 0 && agp_bridges_found == 0) {
> > > > +				pci_unregister_driver(&agp_amd64_pci_driver);
> > > > +				err = -ENODEV;
> > > > +			}
> > > 
> > > Is the "probe" member in agp_amd64_pci_driver overwritten with a
> > > zero pointer anywhere?  I don't see that it is, so it seems the
> > > else-branch is never entered.
> > 
> > That is a great question. I thought since pci_register_driver calls the
> > probe function, it would return with or without a zero, saving that
> > value in the .probe member.
> 
> You'd have to add parentheses and parameters, i.e.
> 
>   agp_amd64_pci_driver.probe(...)
> 
> to invoke the probe hook directly.  However, that wouldn't be an
> acceptable approach, one needs to go through the API of the
> driver core and not do things behind the driver core's back.
> 

Noted!

> 
> > > I had already prepared a fix for this, but waited for 0-day to
> > > crunch through it.  I've just submitted it, so that's what I had
> > > in mind:
> > > 
> > > https://lore.kernel.org/r/f8ff40f35a9a5836d1371f60e85c09c5735e3c5e.1750497201.git.lukas@wunner.de/
> > 
> > That one I've seen even prior to catching this one, and this is
> > originally what I had in mind based on what commit 6fd024893911
> > ("amd64-agp: Probe unknown AGP devices the right way") removed (i.e.
> > !pci_find_capability) when you suggested checking for caps beforehand,
> > but I figured "why make other calls when .probe already does it right
> > off the bat?"
> 
> Right, it is somewhat silly, but this driver is for 20+ year old hardware
> which is likely no longer in heavy use, the driver itself isn't actively
> maintained anymore and might be dropped in a few years, so this approach
> seemed like the least ugly and most acceptable option.
> 
> The real crime was to probe *any* PCI device and even make that the
> default.  I think vfio_pci is probably the only other driver that
> probes *any* PCI device and it does that only if requested by user
> space I believe.  We'd risk regressing users if we changed the
> "probe everything by default" behavior, so that's not a good option.
> 

Gotcha..That clarifies a whole lot, thank you so much!

--
Best regards,
Ahmed Salem <x0rw3ll@gmail.com>

> Thanks,
> 
> Lukas