[PATCH v5 3/4] gpu: nova-core: avoid probing non-display/compute PCI functions

John Hubbard posted 4 patches 1 month, 1 week ago
There is a newer version of this series
Posted by John Hubbard 1 month, 1 week ago
NovaCore has so far been too imprecise about deciding whether .probe()
has found a supported PCI PF (Physical Function). Specifically, .probe()
sets up BAR0 (which involves a lot of very careful devres and
Device<Bound> details behind the scenes), and only then, if it is dealing
with an unsupported function such as the .1 audio PF on many GPUs, does
it fail due to an unexpected BAR0 size. We have been fortunate that the
BAR0 sizes are different.

Really, we should be filtering on PCI class ID instead. These days I
think we can confidently pick out Nova's supported PFs via PCI class
ID. And if not, then we'll revisit.

The approach here is to filter on "Display VGA" or "Display 3D", which
is how PCI class IDs express "this is a modern GPU's PF".
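
For illustration only, here is a minimal userspace sketch of how a
vendor + class/mask ID table selects functions. It is not code from this
patch: IdEntry, Function, entry_matches(), table_matches() and
nova_table() are hypothetical names, and the comparison is modeled on the
class test the PCI core applies to an ID entry, (id.class ^ dev.class) &
id.class_mask. The numeric values are the standard PCI class codes and
NVIDIA's vendor ID.

```rust
/// Wildcard value, mirroring the kernel's PCI_ANY_ID (all bits set).
const PCI_ANY_ID: u32 = !0;

const VENDOR_NVIDIA: u32 = 0x10de;

// 24-bit class codes: base class, subclass, programming interface.
const CLASS_DISPLAY_VGA: u32 = 0x03_0000;
const CLASS_DISPLAY_3D: u32 = 0x03_0200;
const CLASS_MULTIMEDIA_HD_AUDIO: u32 = 0x04_0300;

// Compare base class and subclass, ignore the programming interface.
const MASK_CLASS_SUBCLASS: u32 = 0xff_ff00;

/// Hypothetical stand-in for one ID table entry (vendor and class only).
struct IdEntry {
    vendor: u32,
    class: u32,
    class_mask: u32,
}

/// Hypothetical stand-in for the fields read from a PCI function.
struct Function {
    vendor: u32,
    class: u32,
}

/// True if the function's vendor and masked class match the entry.
fn entry_matches(id: &IdEntry, f: &Function) -> bool {
    let vendor_ok = id.vendor == PCI_ANY_ID || id.vendor == f.vendor;
    let class_ok = ((id.class ^ f.class) & id.class_mask) == 0;
    vendor_ok && class_ok
}

/// True if any entry in the table matches the function.
fn table_matches(table: &[IdEntry], f: &Function) -> bool {
    table.iter().any(|id| entry_matches(id, f))
}

/// The two entries this patch adds, expressed in the sketch's types.
fn nova_table() -> [IdEntry; 2] {
    [
        IdEntry {
            vendor: VENDOR_NVIDIA,
            class: CLASS_DISPLAY_VGA,
            class_mask: MASK_CLASS_SUBCLASS,
        },
        IdEntry {
            vendor: VENDOR_NVIDIA,
            class: CLASS_DISPLAY_3D,
            class_mask: MASK_CLASS_SUBCLASS,
        },
    ]
}
```

With such a table, an NVIDIA function reporting CLASS_DISPLAY_VGA or
CLASS_DISPLAY_3D matches, while the .1 audio function (assumed here to
report CLASS_MULTIMEDIA_HD_AUDIO) is rejected before .probe() is ever
called.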

Cc: Danilo Krummrich <dakr@kernel.org>
Cc: Alexandre Courbot <acourbot@nvidia.com>
Cc: Elle Rhumsaa <elle@weathered-steel.dev>
Signed-off-by: John Hubbard <jhubbard@nvidia.com>
---
 drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++++++++++++++++-----
 rust/kernel/pci.rs              | 21 +++++++++++++++++++++
 2 files changed, 49 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/nova-core/driver.rs b/drivers/gpu/nova-core/driver.rs
index 274989ea1fb4..5d23a91f51dd 100644
--- a/drivers/gpu/nova-core/driver.rs
+++ b/drivers/gpu/nova-core/driver.rs
@@ -1,6 +1,14 @@
 // SPDX-License-Identifier: GPL-2.0
 
-use kernel::{auxiliary, bindings, c_str, device::Core, pci, prelude::*, sizes::SZ_16M, sync::Arc};
+use kernel::{
+    auxiliary, c_str,
+    device::Core,
+    pci,
+    pci::{Class, ClassMask, Vendor},
+    prelude::*,
+    sizes::SZ_16M,
+    sync::Arc,
+};
 
 use crate::gpu::Gpu;
 
@@ -18,10 +26,25 @@ pub(crate) struct NovaCore {
     PCI_TABLE,
     MODULE_PCI_TABLE,
     <NovaCore as pci::Driver>::IdInfo,
-    [(
-        pci::DeviceId::from_id(bindings::PCI_VENDOR_ID_NVIDIA, bindings::PCI_ANY_ID as u32),
-        ()
-    )]
+    [
+        // Modern NVIDIA GPUs will show up as either VGA or 3D controllers.
+        (
+            pci::DeviceId::from_class_and_vendor(
+                Class::DISPLAY_VGA,
+                ClassMask::ClassSubclass,
+                Vendor::NVIDIA
+            ),
+            ()
+        ),
+        (
+            pci::DeviceId::from_class_and_vendor(
+                Class::DISPLAY_3D,
+                ClassMask::ClassSubclass,
+                Vendor::NVIDIA
+            ),
+            ()
+        ),
+    ]
 );
 
 impl pci::Driver for NovaCore {
diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index d4675b7d4a86..504593c882c9 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -161,6 +161,27 @@ pub const fn from_class(class: u32, class_mask: u32) -> Self {
             override_only: 0,
         })
     }
+
+    /// Create a new `pci::DeviceId` from a class number, mask, and specific vendor.
+    ///
+    /// This is more targeted than [`DeviceId::from_class`]: in addition to matching by PCI Class
+    /// (up to the entire 24 bits, depending on the mask), it also matches the Vendor.
+    pub const fn from_class_and_vendor(
+        class: Class,
+        class_mask: ClassMask,
+        vendor: Vendor,
+    ) -> Self {
+        Self(bindings::pci_device_id {
+            vendor: vendor.as_raw(),
+            device: DeviceId::PCI_ANY_ID,
+            subvendor: DeviceId::PCI_ANY_ID,
+            subdevice: DeviceId::PCI_ANY_ID,
+            class: class.as_raw(),
+            class_mask: class_mask.as_raw(),
+            driver_data: 0,
+            override_only: 0,
+        })
+    }
 }
 
 // SAFETY: `DeviceId` is a `#[repr(transparent)]` wrapper of `pci_device_id` and does not add
-- 
2.50.1
Re: [PATCH v5 3/4] gpu: nova-core: avoid probing non-display/compute PCI functions
Posted by Danilo Krummrich 1 month, 1 week ago
On Thu Aug 21, 2025 at 6:42 AM CEST, John Hubbard wrote:
> NovaCore has so far been too imprecise about figuring out if .probe()
> has found a supported PCI PF (Physical Function). By that I mean:
> .probe() sets up BAR0 (which involves a lot of very careful devres and
> Device<Bound> details behind the scenes). And then if it is dealing with
> a non-supported device such as the .1 audio PF on many GPUs, it fails
> out due to an unexpected BAR0 size. We have been fortunate that the BAR0
> sizes are different.
>
> Really, we should be filtering on PCI class ID instead. These days I
> think we can confidently pick out Nova's supported PF's via PCI class
> ID. And if not, then we'll revisit.
>
> The approach here is to filter on "Display VGA" or "Display 3D", which
> is how PCI class IDs express "this is a modern GPU's PF".
>
> Cc: Danilo Krummrich <dakr@kernel.org>
> Cc: Alexandre Courbot <acourbot@nvidia.com>
> Cc: Elle Rhumsaa <elle@weathered-steel.dev>
> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
> ---
>  drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++++++++++++++++-----
>  rust/kernel/pci.rs              | 21 +++++++++++++++++++++

Can you please split this one up into two patches?

>  2 files changed, 49 insertions(+), 5 deletions(-)
Re: [PATCH v5 3/4] gpu: nova-core: avoid probing non-display/compute PCI functions
Posted by John Hubbard 1 month, 1 week ago
On 8/21/25 3:52 AM, Danilo Krummrich wrote:
> On Thu Aug 21, 2025 at 6:42 AM CEST, John Hubbard wrote:
>> NovaCore has so far been too imprecise about figuring out if .probe()
>> has found a supported PCI PF (Physical Function). By that I mean:
>> .probe() sets up BAR0 (which involves a lot of very careful devres and
>> Device<Bound> details behind the scenes). And then if it is dealing with
>> a non-supported device such as the .1 audio PF on many GPUs, it fails
>> out due to an unexpected BAR0 size. We have been fortunate that the BAR0
>> sizes are different.
>>
>> Really, we should be filtering on PCI class ID instead. These days I
>> think we can confidently pick out Nova's supported PF's via PCI class
>> ID. And if not, then we'll revisit.
>>
>> The approach here is to filter on "Display VGA" or "Display 3D", which
>> is how PCI class IDs express "this is a modern GPU's PF".
>>
>> Cc: Danilo Krummrich <dakr@kernel.org>
>> Cc: Alexandre Courbot <acourbot@nvidia.com>
>> Cc: Elle Rhumsaa <elle@weathered-steel.dev>
>> Signed-off-by: John Hubbard <jhubbard@nvidia.com>
>> ---
>>   drivers/gpu/nova-core/driver.rs | 33 ++++++++++++++++++++++++++++-----
>>   rust/kernel/pci.rs              | 21 +++++++++++++++++++++
> 
> Can you please split this one up in two patches?

Sure.

> 
>>   2 files changed, 49 insertions(+), 5 deletions(-)

thanks,
-- 
John Hubbard