Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:

Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.

Add a method to check if a PCI device is a Virtual Function (VF). This allows Rust drivers to determine whether a device is a VF created through SR-IOV. This is required in order to implement VFIO, because drivers such as NovaCore must only bind to Physical Functions (PFs) or regular PCI devices. The VFs must be left unclaimed, so that a VFIO kernel module can claim them.

Use is_virtfn() in NovaCore, in preparation for it to be used in a VFIO scenario.

I've based this on top of today's driver-core-next [1], because the first patch belongs there, and the second patch applies cleanly to either driver-core-next or drm-rust-next. So this seems like the easiest to work with.

[1] https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/

John Hubbard (2):
  rust: pci: add is_virtfn(), to check for VFs
  gpu: nova-core: reject binding to SR-IOV Virtual Functions

 drivers/gpu/nova-core/driver.rs | 5 +++++
 rust/kernel/pci.rs              | 6 ++++++
 2 files changed, 11 insertions(+)

base-commit: 6d97171ac6585de698df019b0bfea3f123fd8385
--
2.51.0
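For readers skimming the thread, a minimal sketch of the two pieces the cover letter describes, assuming the Rust abstraction wraps pdev->is_virtfn through the bindgen-generated bitfield accessor; names and placement are illustrative, and the actual patches may differ:

    // rust/kernel/pci.rs (sketch)
    impl Device {
        /// Returns `true` if the device is an SR-IOV Virtual Function (VF).
        pub fn is_virtfn(&self) -> bool {
            // SAFETY: `self.as_raw()` returns a valid pointer to a
            // `struct pci_dev`; `is_virtfn()` is the bindgen-generated
            // accessor for the C bitfield of the same name.
            unsafe { (*self.as_raw()).is_virtfn() != 0 }
        }
    }

    // drivers/gpu/nova-core/driver.rs (sketch), early in probe():
    // Leave VFs unclaimed, so that a VFIO kernel module can claim them.
    if pdev.is_virtfn() {
        return Err(ENODEV);
    }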
On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>
> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.

Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
On 9/30/25 5:26 PM, Alexandre Courbot wrote:
> On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>
>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>
> Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?

I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.

Maybe Alex Williamson or Jason Gunthorpe (+CC) can chime in.

thanks,
--
John Hubbard
On Tue, Sep 30, 2025 at 06:26:23PM -0700, John Hubbard wrote:
> On 9/30/25 5:26 PM, Alexandre Courbot wrote:
> > On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
> >> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
> >>
> >> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
> >
> > Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
>
> I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.
>
> Maybe Alex Williamson or Jason Gunthorpe (+CC) can chime in.

A driver should never do something like this.

Novacore should work on a VF pretending to be a PF in a VM, and it should work directly on that same VF outside a VM.

It is not the job of the driver to make binding decisions like 'oh, VFs of this device are usually VFIO, so I will fail probe'.

VFIO users should use the sysfs knob that disables driver autobinding before creating the SR-IOV instances, to prevent this auto-binding, and then bind VFIO manually.

Or userspace can manually unbind novacore from the VF and rebind VFIO.

Jason
On Wed, 1 Oct 2025 11:46:29 -0300
Jason Gunthorpe <jgg@nvidia.com> wrote:

> On Tue, Sep 30, 2025 at 06:26:23PM -0700, John Hubbard wrote:
> > On 9/30/25 5:26 PM, Alexandre Courbot wrote:
> > > On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
> > >> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
> > >>
> > >> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
> > >
> > > Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
> >
> > I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.

To answer this specific question, a VF essentially appears as a PF to the VM. The relationship between a PF and a VF is established when SR-IOV is configured, and in part requires understanding the offset and stride of the VF enumeration, none of which is visible to the VM. The gaps in VF devices (e.g. the device ID register) are also emulated in the hypervisor stack.

> > Maybe Alex Williamson or Jason Gunthorpe (+CC) can chime in.
>
> A driver should never do something like this.
>
> Novacore should work on a VF pretending to be a PF in a VM, and it should work directly on that same VF outside a VM.
>
> It is not the job of the driver to make binding decisions like 'oh, VFs of this device are usually VFIO, so I will fail probe'.
>
> VFIO users should use the sysfs knob that disables driver autobinding before creating the SR-IOV instances, to prevent this auto-binding, and then bind VFIO manually.
>
> Or userspace can manually unbind novacore from the VF and rebind VFIO.

But this is also true: unbinding "native" host drivers is a fact of life for vfio, and we do have the sriov_drivers_autoprobe sysfs attributes if a user wants to set a policy for automatically probing VF drivers for a PF.

I think the question would be whether a "bare" VF really provides a useful device for nova-core to bind to, or if we're just picking it up because the ID table matches. It's my impression that we require a fair bit of software emulation/virtualization in the host vGPU driver to turn the VF into something that can work like a PF in the VM, and I don't know that we can require nova-core to make use of a VF without that emulation/virtualization layer. For example, aren't VRAM allocations for a VF done as part of profiling the VF through the vGPU host driver?

Thanks,
Alex
On Wed, Oct 01, 2025 at 12:16:31PM -0600, Alex Williamson wrote:
> I think the question would be whether a "bare" VF really provides a useful device for nova-core to bind to or if we're just picking it up

It really should work; actual Linux containers are my go-to reason for people wanting to use VFs without a virtualization layer.

> fair bit of software emulation/virtualization in the host vGPU driver to turn the VF into something that can work like a PF in the VM and I don't know that we can require nova-core to make use of a VF without that emulation/virtualization layer. For example, aren't VRAM allocations for a VF done as part of profiling the VF through the vGPU host driver?

The VF profiling should be designed to work without VFIO. It was one thing to have the VFIO variant driver profile mediated devices that only it can create, but now that it is a generic VF without mediation, it doesn't make sense anymore.

The question is how much mediation the variant driver inserts between the VM and the VF, and from what I can see that is mostly limited to config space.

IOW, I would expect nova-core on the PF to have a way to profile and activate the VF to a usable state, and then nova-core can run either through a VM or directly on the VF.

At least this is how all the NIC drivers have their SR-IOV support designed today.

Jason
On Wed Oct 1, 2025 at 10:26 AM JST, John Hubbard wrote:
> On 9/30/25 5:26 PM, Alexandre Courbot wrote:
>> On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
>>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>>
>>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>
>> Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
>
> I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.

But if the guest sees the passed-through VF as a PF, won't it try to do things it is not supposed to do, like loading the GSP firmware (which is managed by the host)?
On 9/30/25 6:39 PM, Alexandre Courbot wrote:
> On Wed Oct 1, 2025 at 10:26 AM JST, John Hubbard wrote:
>> On 9/30/25 5:26 PM, Alexandre Courbot wrote:
>>> On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
>>>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>>>
>>>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>>
>>> Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
>>
>> I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.
>
> But if the guest sees the passed-through VF as a PF, won't it try to do things it is not supposed to do, like loading the GSP firmware (which is managed by the host)?

Yes. A non-paravirtualized guest will attempt to behave just like a bare metal driver would behave. It's the job of the various layers of virtualization to intercept and modify such things appropriately.

Looking ahead: if the VFIO experts come back and tell us that guests see these as VFs, then there is still a way forward, because we talked about loading nova-core with a "vfio_mode" kernel module parameter. So then it becomes "if vfio_mode, then skip VFs".

thanks,
--
John Hubbard
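For concreteness, the "vfio_mode" fallback described above would reduce to a check of roughly this shape; the function and parameter names are only illustrative, and the module-parameter plumbing is elided:

    // Sketch of "if vfio_mode, then skip VFs". `vfio_mode` would come
    // from a nova-core module parameter; its declaration is elided here.
    fn should_reject(pdev: &pci::Device, vfio_mode: bool) -> bool {
        // Reject VFs only when the module was loaded for a VFIO setup.
        vfio_mode && pdev.is_virtfn()
    }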
On 1.10.2025 4.45, John Hubbard wrote:
> On 9/30/25 6:39 PM, Alexandre Courbot wrote:
>> On Wed Oct 1, 2025 at 10:26 AM JST, John Hubbard wrote:
>>> On 9/30/25 5:26 PM, Alexandre Courbot wrote:
>>>> On Wed Oct 1, 2025 at 7:07 AM JST, John Hubbard wrote:
>>>>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>>>>
>>>>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>>>
>>>> Naive question: will guests also see the passed-through VF as a VF? If so, wouldn't this change also prevent guests from using Nova?
>>>

pdev->is_virtfn is set to true when the admin enables VFs via sysfs and the PF driver. Presumably, pdev->is_virtfn will be false all the time in the guest.

>>> I'm also new to this area. I would expect that guests *must* see these as PFs, otherwise...nothing makes any sense.
>>
>> But if the guest sees the passed-through VF as a PF, won't it try to do things it is not supposed to do, like loading the GSP firmware (which is managed by the host)?
>

The guest driver will read PMC_BOOT_1 and check the PMC_BOOT_1_VGPU_VF flag to tell if it is running on a VF or a PF.

https://github.com/NVIDIA/open-gpu-kernel-modules/blob/main/src/nvidia/arch/nvalloc/unix/src/os-hypervisor.c#L945

> Yes. A non-paravirtualized guest will attempt to behave just like a bare metal driver would behave. It's the job of the various layers of virtualization to intercept and modify such things appropriately.
>
> Looking ahead: if the VFIO experts come back and tell us that guests see these as VFs, then there is still a way forward, because we talked about loading nova-core with a "vfio_mode" kernel module parameter. So then it becomes "if vfio_mode, then skip VFs".
>
> thanks,
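If nova-core ever needs the same PMC_BOOT_1 check, it could look roughly like the following on top of the driver's BAR0 accessors; the register offset and bit position are illustrative placeholders (consult the open-gpu-kernel-modules headers for the real layout), and `Bar0` stands in for nova-core's BAR0 I/O handle:

    // Assumed values, not taken from hardware documentation.
    const NV_PMC_BOOT_1: usize = 0x0000_0004; // assumed register offset
    const NV_PMC_BOOT_1_VGPU_VF: u32 = 1 << 30; // assumed flag bit

    /// Returns `true` if the device is a vGPU VF, i.e. the host owns
    /// the GSP firmware and the guest driver must not try to load it.
    fn is_vgpu_vf(bar: &Bar0) -> bool {
        bar.read32(NV_PMC_BOOT_1) & NV_PMC_BOOT_1_VGPU_VF != 0
    }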
On Wed, Oct 01, 2025 at 08:09:37AM +0000, Zhi Wang wrote:
> >> But if the guest sees the passed-through VF as a PF, won't it try to do things it is not supposed to do, like loading the GSP firmware (which is managed by the host)?
> >
> The guest driver will read PMC_BOOT_1 and check the PMC_BOOT_1_VGPU_VF flag to tell if it is running on a VF or a PF.

Yes exactly, and then novacore should modify its behavior and operate the device in a different mode.

It doesn't matter if a VM is involved or not; a VF driver running side by side with the PF driver should still work.

There are use cases where people do this, e.g. they can stick the VF into a Linux container and use the SR-IOV mechanism as a QoS control: 'this container only gets 1/4 of a GPU'.

Jason
On 2025-10-01 at 08:07 +1000, John Hubbard <jhubbard@nvidia.com> wrote...
> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>
> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>
> Add a method to check if a PCI device is a Virtual Function (VF). This allows Rust drivers to determine whether a device is a VF created through SR-IOV. This is required in order to implement VFIO, because drivers such as NovaCore must only bind to Physical Functions (PFs) or regular PCI devices. The VFs must be left unclaimed, so that a VFIO kernel module can claim them.

Curiously, based on a quick glance, I didn't see any other drivers doing this, which makes me wonder why we're different here. But it seems likely their virtual functions are supported by the same driver rather than requiring a different VF-specific driver (or I glanced too quickly!).

I'm guessing the proposal is to fail the probe() function in nova-core for the VFs. I'm not sure, but does the driver core continue to try probing other drivers if one fails probe()? It seems like this would be something best filtered on in the device ID table, although I understand that's not possible today.

> Use is_virtfn() in NovaCore, in preparation for it to be used in a VFIO scenario.
>
> I've based this on top of today's driver-core-next [1], because the first patch belongs there, and the second patch applies cleanly to either driver-core-next or drm-rust-next. So this seems like the easiest to work with.
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/
>
> John Hubbard (2):
>   rust: pci: add is_virtfn(), to check for VFs
>   gpu: nova-core: reject binding to SR-IOV Virtual Functions
>
>  drivers/gpu/nova-core/driver.rs | 5 +++++
>  rust/kernel/pci.rs              | 6 ++++++
>  2 files changed, 11 insertions(+)
>
> base-commit: 6d97171ac6585de698df019b0bfea3f123fd8385
> --
> 2.51.0
On 9/30/25 5:29 PM, Alistair Popple wrote:
> On 2025-10-01 at 08:07 +1000, John Hubbard <jhubbard@nvidia.com> wrote...
>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>
>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>
>> Add a method to check if a PCI device is a Virtual Function (VF). This allows Rust drivers to determine whether a device is a VF created through SR-IOV. This is required in order to implement VFIO, because drivers such as NovaCore must only bind to Physical Functions (PFs) or regular PCI devices. The VFs must be left unclaimed, so that a VFIO kernel module can claim them.
>
> Curiously, based on a quick glance, I didn't see any other drivers doing this, which makes me wonder why we're different here. But it seems likely their virtual functions are supported by the same driver rather than requiring a different VF-specific driver (or I glanced too quickly!).

I haven't checked into that, but it sounds reasonable.

> I'm guessing the proposal is to fail the probe() function in nova-core for the VFs. I'm not sure, but does the driver core continue to try probing other drivers if one fails probe()? It seems like this would be something best filtered on in the device ID table, although I understand that's not possible today.

Yes. From my experience building Nouveau and Nova and running both on the same system with 2 GPUs: when Nova gets probed first (because Nova is a work in progress), however far it gets, it still fails the probe in the end. And then Nouveau gets probed and claims the GPU.

thanks,
--
John Hubbard
On Wed Oct 1, 2025 at 3:22 AM CEST, John Hubbard wrote:
> On 9/30/25 5:29 PM, Alistair Popple wrote:
>> On 2025-10-01 at 08:07 +1000, John Hubbard <jhubbard@nvidia.com> wrote...
>>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>>
>>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>>
>>> Add a method to check if a PCI device is a Virtual Function (VF). This allows Rust drivers to determine whether a device is a VF created through SR-IOV. This is required in order to implement VFIO, because drivers such as NovaCore must only bind to Physical Functions (PFs) or regular PCI devices. The VFs must be left unclaimed, so that a VFIO kernel module can claim them.
>>
>> Curiously, based on a quick glance, I didn't see any other drivers doing this, which makes me wonder why we're different here. But it seems likely their virtual functions are supported by the same driver rather than requiring a different VF-specific driver (or I glanced too quickly!).
>
> I haven't checked into that, but it sounds reasonable.

There are multiple cases:

Some devices have different PCI device IDs for their physical and virtual functions, and different drivers handling them. One example of that is Intel IXGBE.

But there are also some drivers which do a similar check and just stop probing if they detect a virtual function.

So, this patch series does not do anything uncommon.

>> I'm guessing the proposal is to fail the probe() function in nova-core for the VFs. I'm not sure, but does the driver core continue to try probing other drivers if one fails probe()? It seems like this would be something best filtered on in the device ID table, although I understand that's not possible today.

Yes, the driver core keeps going until it finds a driver that succeeds probing or no driver is left to probe. (This behavior is also the reason for the name probe() in the first place.)

However, nowadays we ideally know whether a driver fits a device before probe() is called, but there are still exceptions; with PCI virtual functions we've just hit one of those.

Theoretically, we could also indicate whether a driver handles virtual functions through a boolean in struct pci_driver, which would be a bit more elegant.

If you want, I can also pick this up with my SR-IOV RFC, which will probably touch the driver structure as well; I plan to send something in a few days.
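On the Rust side, that boolean could surface as an associated constant on the PCI driver trait; everything below is invented here for illustration, pending the mentioned SR-IOV RFC:

    // Hypothetical sketch: a per-driver flag so the PCI core can skip
    // VFs before probe() is ever called. Nothing below exists today.
    pub trait Driver {
        /// Whether this driver can also drive SR-IOV Virtual Functions.
        /// The PCI core would not match VFs against drivers that leave
        /// this `false`.
        const HANDLES_VIRTFN: bool = false;

        // ...existing associated items (ID_TABLE, probe(), ...) elided.
    }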
On 1.10.2025 13.32, Danilo Krummrich wrote:
> On Wed Oct 1, 2025 at 3:22 AM CEST, John Hubbard wrote:
>> On 9/30/25 5:29 PM, Alistair Popple wrote:
>>> On 2025-10-01 at 08:07 +1000, John Hubbard <jhubbard@nvidia.com> wrote...
>>>> Post-Kangrejos, the approach for NovaCore + VFIO has changed a bit: the idea now is that VFIO drivers, for NVIDIA GPUs that are supported by NovaCore, should bind directly to the GPU's VFs. (An earlier idea was to let NovaCore bind to the VFs, and then have NovaCore call into the upper (VFIO) module via Aux Bus, but this turns out to be awkward and is no longer in favor.) So, in order to support that:
>>>>
>>>> Nova-core must only bind to Physical Functions (PFs) and regular PCI devices, not to Virtual Functions (VFs) created through SR-IOV.
>>>>
>>>> Add a method to check if a PCI device is a Virtual Function (VF). This allows Rust drivers to determine whether a device is a VF created through SR-IOV. This is required in order to implement VFIO, because drivers such as NovaCore must only bind to Physical Functions (PFs) or regular PCI devices. The VFs must be left unclaimed, so that a VFIO kernel module can claim them.
>>>
>>> Curiously, based on a quick glance, I didn't see any other drivers doing this, which makes me wonder why we're different here. But it seems likely their virtual functions are supported by the same driver rather than requiring a different VF-specific driver (or I glanced too quickly!).
>>
>> I haven't checked into that, but it sounds reasonable.
>
> There are multiple cases:
>
> Some devices have different PCI device IDs for their physical and virtual functions, and different drivers handling them. One example of that is Intel IXGBE.
>
> But there are also some drivers which do a similar check and just stop probing if they detect a virtual function.

Right, it really depends on the hardware design and the intended use cases, and is therefore device-specific.

In networking, for example, there are scenarios where VFs are used directly on bare metal, such as with DPDK to bypass the kernel network stack for better performance. In such cases, PF and VF drivers can end up being quite different, and the VF driver can attach on bare metal (via pdev->is_virtfn in probe()).

Similarly, in the GPU domain, there are comparable scenarios where VFs are exposed on bare metal for use cases like containers. (I remember the Xe driver can be attached to a VF on bare metal for such a use case.)

For NVIDIA GPUs, VFs are only associated with VMs. So this change makes sense within this scope.

Z.

> So, this patch series does not do anything uncommon.
>
>>> I'm guessing the proposal is to fail the probe() function in nova-core for the VFs. I'm not sure, but does the driver core continue to try probing other drivers if one fails probe()? It seems like this would be something best filtered on in the device ID table, although I understand that's not possible today.
>
> Yes, the driver core keeps going until it finds a driver that succeeds probing or no driver is left to probe. (This behavior is also the reason for the name probe() in the first place.)
>
> However, nowadays we ideally know whether a driver fits a device before probe() is called, but there are still exceptions; with PCI virtual functions we've just hit one of those.
>
> Theoretically, we could also indicate whether a driver handles virtual functions through a boolean in struct pci_driver, which would be a bit more elegant.
>
> If you want, I can also pick this up with my SR-IOV RFC, which will probably touch the driver structure as well; I plan to send something in a few days.