[v1] rust: pci: add abstractions for SR-IOV capability

[PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Peter Colberg 2 months, 3 weeks ago

Add a method to return the Physical Function (PF) device for a Virtual
Function (VF) device in the bound device context.

Unlike for a PCI driver written in C, guarantee that when a VF device is
bound to a driver, the underlying PF device is bound to a driver, too.

When a device with enabled VFs is unbound from a driver, invoke the
sriov_configure() callback to disable SR-IOV before the unbind()
callback. To ensure the guarantee is upheld, call disable_sriov()
to remove all VF devices if the driver has not done so already.

Suggested-by: Danilo Krummrich <dakr@kernel.org>
Signed-off-by: Peter Colberg <pcolberg@redhat.com>
---
 rust/kernel/pci.rs | 47 +++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 47 insertions(+)

diff --git a/rust/kernel/pci.rs b/rust/kernel/pci.rs
index f9054c52a3bdff2c42273366a4058d943ee5fd76..d6cc5d7e7cd7a3b6e451c0ff5aea044b89c6774a 100644
--- a/rust/kernel/pci.rs
+++ b/rust/kernel/pci.rs
@@ -120,6 +120,19 @@ extern "C" fn remove_callback(pdev: *mut bindings::pci_dev) {
         // and stored a `Pin<KBox<T>>`.
         let data = unsafe { pdev.as_ref().drvdata_obtain::<T>() };
 
+        // Always disable SR-IOV before `unbind()` to guarantee that when a VF device is bound
+        // to a driver, the underlying PF device is bound to a driver, too.
+        #[cfg(CONFIG_PCI_IOV)]
+        {
+            // First, allow the driver to gracefully disable SR-IOV by itself.
+            if T::HAS_SRIOV_CONFIGURE && pdev.num_vf() != 0 {
+                let _ = T::sriov_configure(pdev, 0);
+            }
+            // Then, forcibly disable SR-IOV if the driver has not done so already,
+            // to uphold the guarantee with regard to driver binding described above.
+            pdev.disable_sriov();
+        }
+
         T::unbind(pdev, data.as_ref());
     }
 
@@ -332,6 +345,11 @@ fn unbind(dev: &Device<device::Core>, this: Pin<&Self>) {
     /// [`Device`] by writing the number of Virtual Functions (VF), `nr_virtfn` or zero to the
     /// sysfs file `sriov_numvfs` for this device. Implementing this callback is optional.
     ///
+    /// Further, and unlike for a PCI driver written in C, when a PF device with enabled VFs is
+    /// unbound from its bound [`Driver`], the `sriov_configure()` callback is invoked to disable
+    /// SR-IOV before the `unbind()` callback. This guarantees that when a VF device is bound to a
+    /// driver, the underlying PF device is bound to a driver, too.
+    ///
     /// Upon success, this callback must return the number of VFs that were enabled, or zero if
     /// SR-IOV was disabled.
     ///
@@ -500,6 +518,35 @@ pub fn pci_class(&self) -> Class {
     }
 }
 
+impl Device<device::Bound> {
+    /// Returns the Physical Function (PF) device for a Virtual Function (VF) device.
+    ///
+    /// # Examples
+    ///
+    /// The following example illustrates how to obtain the private driver data of the PF device,
+    /// where `vf_pdev` is the VF device of reference type `&Device<Core>` or `&Device<Bound>`.
+    ///
+    /// ```
+    /// let pf_pdev = vf_pdev.physfn()?;
+    /// let pf_drvdata = pf_pdev.as_ref().drvdata::<MyDriver>()?;
+    /// ```
+    #[cfg(CONFIG_PCI_IOV)]
+    pub fn physfn(&self) -> Result<&Device<device::Bound>> {
+        if !self.is_virtfn() {
+            return Err(EINVAL);
+        }
+        // SAFETY:
+        // `self.as_raw` returns a valid pointer to a `struct pci_dev`.
+        //
+        // `physfn` is a valid pointer to a `struct pci_dev` since self.is_virtfn() is `true`.
+        //
+        // `physfn` may be cast to a `Device<device::Bound>` since `pci::Driver::remove()` calls
+        // `disable_sriov()` to remove all VF devices, which guarantees that the underlying
+        // PF device is always bound to a driver when the VF device is bound to a driver.
+        Ok(unsafe { &*(*self.as_raw()).__bindgen_anon_1.physfn.cast() })
+    }
+}
+
 impl Device<device::Core> {
     /// Enable memory resources for this device.
     pub fn enable_device_mem(&self) -> Result {

-- 
2.51.1

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Jason Gunthorpe 2 months, 2 weeks ago

On Wed, Nov 19, 2025 at 05:19:11PM -0500, Peter Colberg wrote:
> Add a method to return the Physical Function (PF) device for a Virtual
> Function (VF) device in the bound device context.
> 
> Unlike for a PCI driver written in C, guarantee that when a VF device is
> bound to a driver, the underlying PF device is bound to a driver, too.

You can't do this as an absolutely statement from rust code alone,
this statement is confused.

> When a device with enabled VFs is unbound from a driver, invoke the
> sriov_configure() callback to disable SR-IOV before the unbind()
> callback. To ensure the guarantee is upheld, call disable_sriov()
> to remove all VF devices if the driver has not done so already.

This doesn't seem like it should be in this patch.

Good drivers using the PCI APIs should be calling pci_disable_sriov()
during their unbind. 

The prior patch #3 should not have added "safe" bindings for
enable_sriov that allow this lifetime rule to be violated.

> +    #[cfg(CONFIG_PCI_IOV)]
> +    pub fn physfn(&self) -> Result<&Device<device::Bound>> {
> +        if !self.is_virtfn() {
> +            return Err(EINVAL);
> +        }
> +        // SAFETY:
> +        // `self.as_raw` returns a valid pointer to a `struct pci_dev`.
> +        //
> +        // `physfn` is a valid pointer to a `struct pci_dev` since self.is_virtfn() is `true`.
> +        //
> +        // `physfn` may be cast to a `Device<device::Bound>` since `pci::Driver::remove()` calls
> +        // `disable_sriov()` to remove all VF devices, which guarantees that the underlying
> +        // PF device is always bound to a driver when the VF device is bound to a driver.

Wrong safety statement. There are drivers that don't call
disable_sriov(). You need to also check that the driver attached to
the PF is actually working properly.

I do not like to see this attempt to open code the tricky login of
pci_iov_get_pf_drvdata() in rust without understanding the issues :(

Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Peter Colberg 2 months ago

On Fri, Nov 21, 2025 at 07:26:42PM -0400, Jason Gunthorpe wrote:
> On Wed, Nov 19, 2025 at 05:19:11PM -0500, Peter Colberg wrote:
> > Add a method to return the Physical Function (PF) device for a Virtual
> > Function (VF) device in the bound device context.
> > 
> > Unlike for a PCI driver written in C, guarantee that when a VF device is
> > bound to a driver, the underlying PF device is bound to a driver, too.
> 
> You can't do this as an absolutely statement from rust code alone,
> this statement is confused.
> 
> > When a device with enabled VFs is unbound from a driver, invoke the
> > sriov_configure() callback to disable SR-IOV before the unbind()
> > callback. To ensure the guarantee is upheld, call disable_sriov()
> > to remove all VF devices if the driver has not done so already.
> 
> This doesn't seem like it should be in this patch.
> 
> Good drivers using the PCI APIs should be calling pci_disable_sriov()
> during their unbind. 
> 
> The prior patch #3 should not have added "safe" bindings for
> enable_sriov that allow this lifetime rule to be violated.
> 
> > +    #[cfg(CONFIG_PCI_IOV)]
> > +    pub fn physfn(&self) -> Result<&Device<device::Bound>> {
> > +        if !self.is_virtfn() {
> > +            return Err(EINVAL);
> > +        }
> > +        // SAFETY:
> > +        // `self.as_raw` returns a valid pointer to a `struct pci_dev`.
> > +        //
> > +        // `physfn` is a valid pointer to a `struct pci_dev` since self.is_virtfn() is `true`.
> > +        //
> > +        // `physfn` may be cast to a `Device<device::Bound>` since `pci::Driver::remove()` calls
> > +        // `disable_sriov()` to remove all VF devices, which guarantees that the underlying
> > +        // PF device is always bound to a driver when the VF device is bound to a driver.
> 
> Wrong safety statement. There are drivers that don't call
> disable_sriov(). You need to also check that the driver attached to
> the PF is actually working properly.

Thank you for the review. I missed the obvious case where the
PF driver is a C driver that does not disable SR-IOV on unbind
and the VF driver is a Rust driver that invokes physfn().

> I do not like to see this attempt to open code the tricky login of
> pci_iov_get_pf_drvdata() in rust without understanding the issues :(

I will work on a proper solution based on Danilo's proposal [1].

[1] https://lore.kernel.org/all/DEFL4TG0WX1C.2GLH4417EPU3V@kernel.org/

Thanks,
Peter

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Danilo Krummrich 2 months, 2 weeks ago

On Sat Nov 22, 2025 at 12:26 PM NZDT, Jason Gunthorpe wrote:
> On Wed, Nov 19, 2025 at 05:19:11PM -0500, Peter Colberg wrote:
>> Add a method to return the Physical Function (PF) device for a Virtual
>> Function (VF) device in the bound device context.
>> 
>> Unlike for a PCI driver written in C, guarantee that when a VF device is
>> bound to a driver, the underlying PF device is bound to a driver, too.
>
> You can't do this as an absolutely statement from rust code alone,
> this statement is confused.

Indeed! However, I'd like to see that we actually provide this guarantee from
the C PCI code.

So far I haven't heard a convincing reason for not providing this guarantee. The
only reason not to guarantee this I have heard is that some PF drivers only
enable SR-IOV and hence could be unloaded afterwards. However, I think there is
no strong reason to do so.

What I would like to see is that we unbind VF drivers when the PF driver is
unbound in general, analogous to what we are guaranteed by the auxiliary bus.

>> +    #[cfg(CONFIG_PCI_IOV)]
>> +    pub fn physfn(&self) -> Result<&Device<device::Bound>> {
>> +        if !self.is_virtfn() {
>> +            return Err(EINVAL);
>> +        }
>> +        // SAFETY:
>> +        // `self.as_raw` returns a valid pointer to a `struct pci_dev`.
>> +        //
>> +        // `physfn` is a valid pointer to a `struct pci_dev` since self.is_virtfn() is `true`.
>> +        //
>> +        // `physfn` may be cast to a `Device<device::Bound>` since `pci::Driver::remove()` calls
>> +        // `disable_sriov()` to remove all VF devices, which guarantees that the underlying
>> +        // PF device is always bound to a driver when the VF device is bound to a driver.
>
> Wrong safety statement. There are drivers that don't call
> disable_sriov(). You need to also check that the driver attached to
> the PF is actually working properly.

Indeed, with this patch, only Rust drivers provide this guarantee of the VF
being bound when the PF is bound.

> I do not like to see this attempt to open code the tricky login of
> pci_iov_get_pf_drvdata() in rust without understanding the issues :(

I discussed this with Peter in advance (thanks Peter for your work on this
topic!), and as mentioned above I'd like to see this series to propose that we
always guarantee that a VF is bound when the corresponding PF is bound.

With this, the above code will be correct and a driver can use the generic
infrastructure to:

  1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
  2) Grab the driver's device private data from the returned Device<Bound>

Note that 2) (i.e. accessing the driver's device private data with
Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
only implemented for Device<Bound>) and that the returned type actually matches
the type of the object that has been stored.

Since we always need those two checks when accessing a driver's device private
data, it is already done in the generic drvdata() accessor.

Therefore the only additional guarantee we have to give is that VF bound implies
PF bound. Otherwise physfn() would need to be unsafe and the driver would need
to promise that this is the case. From there on drvdata() already does the other
checks as mentioned.

I suggest to have a look at [2] for an example with how this turns out with the
auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
driver calling into its parent, except that the auxiliary bus already guarantees
that the parent is bound when the child is bound.

Given that, there is no value in using pci_iov_get_pf_drvdata(), in Rust you'd
just call

	// `vfdev` must be a `pci::Device<Bound>` for `physfn()` to be
	// available; `pfdev` will therefore be a `pci::Device<Bound>` too
	// (assuming we provide the guarantee for this, otherwise this would
	// need to be unsafe).
	let pfdev = vfdev.phyfn();

	// `FooData` is the type of the PF drvier's device private data. The
	// call to `drvdata()` will fail with an error of the asserted type is
	// wrong.
	let drvdata = pfdev.drvdata::<FooData>()?;

So, if we'd provide a Rust accessor for the PF's device driver data, we'd
implement it like above, because Device::drvdata() is already safe. If we want
pci::Device::pf_drvdata() to be safe, we'd otherwise need to do all the checks
Device::drvdata() already does again before we call into
pci_iov_get_pf_drvdata().

(Note that I'm currently travelling, hence I might not be as responsive as
usual. I'm travelling until after LPC; I plan to take a detailed look at this
series in the week of the conference).

[1] https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/tree/rust/kernel/device.rs?h=driver-core-next#n313
[2] https://git.kernel.org/pub/scm/linux/kernel/git/driver-core/driver-core.git/tree/samples/rust/rust_driver_auxiliary.rs?h=driver-core-next#n81
[3] https://lore.kernel.org/all/20251020223516.241050-1-dakr@kernel.org/

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Jason Gunthorpe 2 months, 2 weeks ago

On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
> So far I haven't heard a convincing reason for not providing this guarantee. The
> only reason not to guarantee this I have heard is that some PF drivers only
> enable SR-IOV and hence could be unloaded afterwards. However, I think there is
> no strong reason to do so.

I know some people have work flows around this. I think they are
wrong. When we "fixed" mlx5 a while back there was some pushback and
some weird things did stop working.

So while I support this goal, I don't know if enough people will
agree..

> With this, the above code will be correct and a driver can use the generic
> infrastructure to:
> 
>   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
>   2) Grab the driver's device private data from the returned Device<Bound>
> 
> Note that 2) (i.e. accessing the driver's device private data with
> Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
> only implemented for Device<Bound>) and that the returned type actually matches
> the type of the object that has been stored.

This is what the core helper does, with the twist that it "validates"
the PF driver is working right by checking its driver binding..

> I suggest to have a look at [2] for an example with how this turns out with the
> auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
> driver calling into its parent, except that the auxiliary bus already guarantees
> that the parent is bound when the child is bound.

Aux bus is a little different because it should be used in a way where
there are stronger guarantees about what the parent is. Ie the aux bus
names should be unique and limited to the type of parent.

> So, if we'd provide a Rust accessor for the PF's device driver data, we'd
> implement it like above, because Device::drvdata() is already safe. If we want
> pci::Device::pf_drvdata() to be safe, we'd otherwise need to do all the checks
> Device::drvdata() already does again before we call into
> pci_iov_get_pf_drvdata().

I think to make progress along this line you need to still somehow
validate that the PF driver is working right, either by checking that
the driver is bound to a rust driver somehow or using the same
approach as the core helper.

I'm not sure the idea to force all drivers to do disable sriov is
going to be easy, and I'd rather see rust bindings progress without
opening such a topic..

Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Danilo Krummrich 2 months, 2 weeks ago

On Sun Nov 23, 2025 at 5:16 AM NZDT, Jason Gunthorpe wrote:
> I think to make progress along this line you need to still somehow
> validate that the PF driver is working right, either by checking that
> the driver is bound to a rust driver somehow or using the same
> approach as the core helper.

Do you refer to the

	if (pf_dev->driver != pf_driver)

check? If so, that's (in a slightly different form) already part of the generic
Device::drvdata() accessor.

> I'm not sure the idea to force all drivers to do disable sriov is
> going to be easy, and I'd rather see rust bindings progress without
> opening such a topic..

I'm sorry, I should have mentioned what I actually propose:

My idea would be to provide a bool in struct pci_driver, which, if set,
guarantees that all VFs are unbound when the PF is unbound.

With this bool being set, the PCI bus can provide the guarantee that VF bound
implies PF bound; the Rust accessor can then leverage this guarantee.

This can also be leveraged by the C code, where we could have a separate
accessor that checks the bool rather than askes the driver to promise that the
PF is bound, which pci_iov_get_pf_drvdata() does.

(Although I have to admit that without the additional type system capabilities
we have in Rust, it is not that big of an improvement.)

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Jason Gunthorpe 2 months, 2 weeks ago

On Sun, Nov 23, 2025 at 11:43:08AM +1300, Danilo Krummrich wrote:
> On Sun Nov 23, 2025 at 5:16 AM NZDT, Jason Gunthorpe wrote:
> > I think to make progress along this line you need to still somehow
> > validate that the PF driver is working right, either by checking that
> > the driver is bound to a rust driver somehow or using the same
> > approach as the core helper.
> 
> Do you refer to the
> 
> 	if (pf_dev->driver != pf_driver)
> 
> check? If so, that's (in a slightly different form) already part of the generic
> Device::drvdata() accessor.

Yeah, but by that point it has already returned a bound device pointer
without proving it is safe to do so..

If you check the flag you are proposing below then it would be OK:

> > I'm not sure the idea to force all drivers to do disable sriov is
> > going to be easy, and I'd rather see rust bindings progress without
> > opening such a topic..
> 
> I'm sorry, I should have mentioned what I actually propose:
> 
> My idea would be to provide a bool in struct pci_driver, which, if set,
> guarantees that all VFs are unbound when the PF is unbound.
> 
> With this bool being set, the PCI bus can provide the guarantee that VF bound
> implies PF bound; the Rust accessor can then leverage this guarantee.
> 
> This can also be leveraged by the C code, where we could have a separate
> accessor that checks the bool rather than askes the driver to promise that the
> PF is bound, which pci_iov_get_pf_drvdata() does.
> 
> (Although I have to admit that without the additional type system capabilities
> we have in Rust, it is not that big of an improvement.)

This seems like it has a decent chance of succeeding..

Though as I said in my other email, you'd probably want a version
where there is WARN_ON if that common code is actually triggered as it
would be a driver bug to misorder its destruction.

Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Leon Romanovsky 2 months, 2 weeks ago

On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
> > So far I haven't heard a convincing reason for not providing this guarantee. The
> > only reason not to guarantee this I have heard is that some PF drivers only
> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
> > no strong reason to do so.
> 
> I know some people have work flows around this. I think they are
> wrong. When we "fixed" mlx5 a while back there was some pushback and
> some weird things did stop working.

I don't think that this is correct. The main use case for keeping VFs,
while unloading PF driver is to allow reuse this PF for VFIO too. It is
very natural and common flow for "real" SR-IOV devices which present all
PF/VFs as truly separate devices.

The pushback for mlx5 was related to change in eswitch mode and not to
driver unload. In very corner case, mlx5 kept VFs to protect change in
netdevs.

> 
> So while I support this goal, I don't know if enough people will
> agree..

I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
talked to get rationale for this new behaviour.

> 
> > With this, the above code will be correct and a driver can use the generic
> > infrastructure to:
> > 
> >   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
> >   2) Grab the driver's device private data from the returned Device<Bound>
> > 
> > Note that 2) (i.e. accessing the driver's device private data with
> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
> > only implemented for Device<Bound>) and that the returned type actually matches
> > the type of the object that has been stored.
> 
> This is what the core helper does, with the twist that it "validates"
> the PF driver is working right by checking its driver binding..
> 
> > I suggest to have a look at [2] for an example with how this turns out with the
> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
> > driver calling into its parent, except that the auxiliary bus already guarantees
> > that the parent is bound when the child is bound.
> 
> Aux bus is a little different because it should be used in a way where
> there are stronger guarantees about what the parent is. Ie the aux bus
> names should be unique and limited to the type of parent.

Right, aux bus is completely unrelated to real physical PCI bus.

Thanks

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Danilo Krummrich 2 months, 2 weeks ago

On Sun Nov 23, 2025 at 7:57 AM NZDT, Leon Romanovsky wrote:
> On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
>> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
>> > So far I haven't heard a convincing reason for not providing this guarantee. The
>> > only reason not to guarantee this I have heard is that some PF drivers only
>> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
>> > no strong reason to do so.
>> 
>> I know some people have work flows around this. I think they are
>> wrong. When we "fixed" mlx5 a while back there was some pushback and
>> some weird things did stop working.
>
> I don't think that this is correct. The main use case for keeping VFs,
> while unloading PF driver is to allow reuse this PF for VFIO too. It is
> very natural and common flow for "real" SR-IOV devices which present all
> PF/VFs as truly separate devices.

That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
VFIO"? What do you do with the PF after driver unload instead? Load another
driver? If so, why separate ones?

>> So while I support this goal, I don't know if enough people will
>> agree..
>
> I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
> talked to get rationale for this new behaviour.

Not sure what you mean, Bjorn is CC'd on the whole series and hence also this
conversation.

Otherwise, my proposal to guarantee that the PF is bound as long as the VF(s)
are bound stems from the corresponding beneficial lifecycle implications for the
driver model (some more on this below).

>> > With this, the above code will be correct and a driver can use the generic
>> > infrastructure to:
>> > 
>> >   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
>> >   2) Grab the driver's device private data from the returned Device<Bound>
>> > 
>> > Note that 2) (i.e. accessing the driver's device private data with
>> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
>> > only implemented for Device<Bound>) and that the returned type actually matches
>> > the type of the object that has been stored.
>> 
>> This is what the core helper does, with the twist that it "validates"
>> the PF driver is working right by checking its driver binding..
>> 
>> > I suggest to have a look at [2] for an example with how this turns out with the
>> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
>> > driver calling into its parent, except that the auxiliary bus already guarantees
>> > that the parent is bound when the child is bound.
>> 
>> Aux bus is a little different because it should be used in a way where
>> there are stronger guarantees about what the parent is. Ie the aux bus
>> names should be unique and limited to the type of parent.
>
> Right, aux bus is completely unrelated to real physical PCI bus.

Of course the auxiliary and the PCI bus are fundamentally different busses,
*but* they also have commonalities: the driver lifecycle requirements drivers
have to deal with are the same.

For instance, if you call from an auxiliary driver into the parent driver to let
the parent driver do some work on behalf, the auxiliary driver has to guarantee
that the parent device is guranteed to remain bound for the duration of the
call, otherwise the parent driver can't call dev_get_drvdata() without risking a
UAF, since a concurrent unbind might free the parent driver's device private
data.

The same is true for PCI when a VF driver calls into the PF driver to do some
work on behalf. The VF driver has to make sure that the PF driver remains bound
for the whole duration of the call for the exact same reasons.

I.e. it is a common driver lifecycle pattern for interactions between drivers in
general.

As for Rust specifically, we can actually model such driver lifecycle patterns
with the type system and with that do a lot of the driver lifecycle checks (that
drivers love to ignore :) even at compile time, such that your code doesn't even
compile when you violate the driver lifecycle rules.

For the auxiliary bus that's already the case and I'd love to get to this point
with PF <-> VF interactions (in Rust) as well.

I hope this helps.

- Danilo

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Leon Romanovsky 2 months, 2 weeks ago

On Sun, Nov 23, 2025 at 11:26:52AM +1300, Danilo Krummrich wrote:
> On Sun Nov 23, 2025 at 7:57 AM NZDT, Leon Romanovsky wrote:
> > On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
> >> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
> >> > So far I haven't heard a convincing reason for not providing this guarantee. The
> >> > only reason not to guarantee this I have heard is that some PF drivers only
> >> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
> >> > no strong reason to do so.
> >> 
> >> I know some people have work flows around this. I think they are
> >> wrong. When we "fixed" mlx5 a while back there was some pushback and
> >> some weird things did stop working.
> >
> > I don't think that this is correct. The main use case for keeping VFs,
> > while unloading PF driver is to allow reuse this PF for VFIO too. It is
> > very natural and common flow for "real" SR-IOV devices which present all
> > PF/VFs as truly separate devices.
> 
> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> VFIO"? What do you do with the PF after driver unload instead? Load another
> driver? If so, why separate ones?

One of the main use cases for SR-IOV is to provide to VM users/customers
devices with performance and security promises as physical ones. In this
case, the VMs are created through PF and not bound to any driver. Once
customer/user requests VM, that VF is bound to vfio-pci driver and
attached to that VM.

In many cases, PF is unbound too from its original driver and attached
to some other VM. It allows for these VM providers to maximize
utilization of their SR-IOV devices.

At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
"Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
and is accessible to an SR-PCIM, a VI, or an SI."

> 
> >> So while I support this goal, I don't know if enough people will
> >> agree..
> >
> > I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
> > talked to get rationale for this new behaviour.
> 
> Not sure what you mean, Bjorn is CC'd on the whole series and hence also this
> conversation.
> 
> Otherwise, my proposal to guarantee that the PF is bound as long as the VF(s)
> are bound stems from the corresponding beneficial lifecycle implications for the
> driver model (some more on this below).
> 
> >> > With this, the above code will be correct and a driver can use the generic
> >> > infrastructure to:
> >> > 
> >> >   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
> >> >   2) Grab the driver's device private data from the returned Device<Bound>
> >> > 
> >> > Note that 2) (i.e. accessing the driver's device private data with
> >> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
> >> > only implemented for Device<Bound>) and that the returned type actually matches
> >> > the type of the object that has been stored.
> >> 
> >> This is what the core helper does, with the twist that it "validates"
> >> the PF driver is working right by checking its driver binding..
> >> 
> >> > I suggest to have a look at [2] for an example with how this turns out with the
> >> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
> >> > driver calling into its parent, except that the auxiliary bus already guarantees
> >> > that the parent is bound when the child is bound.
> >> 
> >> Aux bus is a little different because it should be used in a way where
> >> there are stronger guarantees about what the parent is. Ie the aux bus
> >> names should be unique and limited to the type of parent.
> >
> > Right, aux bus is completely unrelated to real physical PCI bus.
> 
> Of course the auxiliary and the PCI bus are fundamentally different busses,
> *but* they also have commonalities: the driver lifecycle requirements drivers
> have to deal with are the same.
> 
> For instance, if you call from an auxiliary driver into the parent driver to let
> the parent driver do some work on behalf, the auxiliary driver has to guarantee
> that the parent device is guranteed to remain bound for the duration of the
> call, otherwise the parent driver can't call dev_get_drvdata() without risking a
> UAF, since a concurrent unbind might free the parent driver's device private
> data.
> 
> The same is true for PCI when a VF driver calls into the PF driver to do some
> work on behalf. The VF driver has to make sure that the PF driver remains bound
> for the whole duration of the call for the exact same reasons.

And this section is not entirely correct. It is one of the possible
usages of PF. In many devices, PF and its VFs are independent and don't
communicate through kernel at all. There is no need for VF to have PF be
bounded to its original driver.

> 
> I.e. it is a common driver lifecycle pattern for interactions between drivers in
> general.

And it is not true as well. In case you need to interact between
devices/drivers, it is up to the caller to ensure that this driver isn't
unloaded.

> 
> As for Rust specifically, we can actually model such driver lifecycle patterns
> with the type system and with that do a lot of the driver lifecycle checks (that
> drivers love to ignore :) even at compile time, such that your code doesn't even
> compile when you violate the driver lifecycle rules.
> 
> For the auxiliary bus that's already the case and I'd love to get to this point
> with PF <-> VF interactions (in Rust) as well.

It is absolutely correct thing to do for auxbus. It is very questionable for SR-IOV.


> 
> I hope this helps.

Thanks

> 
> - Danilo

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Danilo Krummrich 2 months, 2 weeks ago

On Sun Nov 23, 2025 at 7:34 PM NZDT, Leon Romanovsky wrote:
> On Sun, Nov 23, 2025 at 11:26:52AM +1300, Danilo Krummrich wrote:
>> On Sun Nov 23, 2025 at 7:57 AM NZDT, Leon Romanovsky wrote:
>> > On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
>> >> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
>> >> > So far I haven't heard a convincing reason for not providing this guarantee. The
>> >> > only reason not to guarantee this I have heard is that some PF drivers only
>> >> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
>> >> > no strong reason to do so.
>> >> 
>> >> I know some people have work flows around this. I think they are
>> >> wrong. When we "fixed" mlx5 a while back there was some pushback and
>> >> some weird things did stop working.
>> >
>> > I don't think that this is correct. The main use case for keeping VFs,
>> > while unloading PF driver is to allow reuse this PF for VFIO too. It is
>> > very natural and common flow for "real" SR-IOV devices which present all
>> > PF/VFs as truly separate devices.
>> 
>> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
>> VFIO"? What do you do with the PF after driver unload instead? Load another
>> driver? If so, why separate ones?
>
> One of the main use cases for SR-IOV is to provide to VM users/customers
> devices with performance and security promises as physical ones. In this
> case, the VMs are created through PF and not bound to any driver. Once
> customer/user requests VM, that VF is bound to vfio-pci driver and
> attached to that VM.
>
> In many cases, PF is unbound too from its original driver and attached
> to some other VM. It allows for these VM providers to maximize
> utilization of their SR-IOV devices.
>
> At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> and is accessible to an SR-PCIM, a VI, or an SI."

Hm, that's possible, but do we have cases of this in practice where we bind and
unbind the same PF multiple times, pass it to different VMs, etc.?

In any case, what we can always do is what I propose in [1], i.e. add a bool to
struct pci_driver that indicates whether the VFs should be unbound when the PF
is unbound, such that we can uphold the guarantee of VF bound implies PF bound
at least conditionally.

[1] https://lore.kernel.org/all/DEFL4TG0WX1C.2GLH4417EPU3V@kernel.org/

>> >> So while I support this goal, I don't know if enough people will
>> >> agree..
>> >
>> > I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
>> > talked to get rationale for this new behaviour.
>> 
>> Not sure what you mean, Bjorn is CC'd on the whole series and hence also this
>> conversation.
>> 
>> Otherwise, my proposal to guarantee that the PF is bound as long as the VF(s)
>> are bound stems from the corresponding beneficial lifecycle implications for the
>> driver model (some more on this below).
>> 
>> >> > With this, the above code will be correct and a driver can use the generic
>> >> > infrastructure to:
>> >> > 
>> >> >   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
>> >> >   2) Grab the driver's device private data from the returned Device<Bound>
>> >> > 
>> >> > Note that 2) (i.e. accessing the driver's device private data with
>> >> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
>> >> > only implemented for Device<Bound>) and that the returned type actually matches
>> >> > the type of the object that has been stored.
>> >> 
>> >> This is what the core helper does, with the twist that it "validates"
>> >> the PF driver is working right by checking its driver binding..
>> >> 
>> >> > I suggest to have a look at [2] for an example with how this turns out with the
>> >> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
>> >> > driver calling into its parent, except that the auxiliary bus already guarantees
>> >> > that the parent is bound when the child is bound.
>> >> 
>> >> Aux bus is a little different because it should be used in a way where
>> >> there are stronger guarantees about what the parent is. Ie the aux bus
>> >> names should be unique and limited to the type of parent.
>> >
>> > Right, aux bus is completely unrelated to real physical PCI bus.
>> 
>> Of course the auxiliary and the PCI bus are fundamentally different busses,
>> *but* they also have commonalities: the driver lifecycle requirements drivers
>> have to deal with are the same.
>> 
>> For instance, if you call from an auxiliary driver into the parent driver to let
>> the parent driver do some work on behalf, the auxiliary driver has to guarantee
>> that the parent device is guranteed to remain bound for the duration of the
>> call, otherwise the parent driver can't call dev_get_drvdata() without risking a
>> UAF, since a concurrent unbind might free the parent driver's device private
>> data.
>> 
>> The same is true for PCI when a VF driver calls into the PF driver to do some
>> work on behalf. The VF driver has to make sure that the PF driver remains bound
>> for the whole duration of the call for the exact same reasons.
>
> And this section is not entirely correct. It is one of the possible
> usages of PF. In many devices, PF and its VFs are independent and don't
> communicate through kernel at all. There is no need for VF to have PF be
> bounded to its original driver.

Of course -- similarly there might be cases where an auxiliary driver might not
cummunicate with its parent, but that's not the case I'm referring to above.

>
>> 
>> I.e. it is a common driver lifecycle pattern for interactions between drivers in
>> general.
>
> And it is not true as well. In case you need to interact between
> devices/drivers, it is up to the caller to ensure that this driver isn't
> unloaded.

You're mixing two things here. The driver model lifecycle requires that if
driver A calls into driver B - where B accesses its device private data - that B
is bound for the full duration of the call.

This is true for any such interaction, the question of who is responsible to
uphold this guarantee, e.g. the bus layer or the driver itself is orthogonal.

Whenever possible, it's best to let the bus layer uphold the guarantee though,
since it's one pitfall less a driver can fall into.

>> 
>> As for Rust specifically, we can actually model such driver lifecycle patterns
>> with the type system and with that do a lot of the driver lifecycle checks (that
>> drivers love to ignore :) even at compile time, such that your code doesn't even
>> compile when you violate the driver lifecycle rules.
>> 
>> For the auxiliary bus that's already the case and I'd love to get to this point
>> with PF <-> VF interactions (in Rust) as well.
>
> It is absolutely correct thing to do for auxbus. It is very questionable for SR-IOV.

At least conditionally (as proposed above), it's an improvement for cases where
there is PF <-> VF interactions, i.e. why let drivers take care if the bus can
already do it for them.

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Leon Romanovsky 2 months, 2 weeks ago

On Sun, Nov 23, 2025 at 11:07:47PM +1300, Danilo Krummrich wrote:
> On Sun Nov 23, 2025 at 7:34 PM NZDT, Leon Romanovsky wrote:
> > On Sun, Nov 23, 2025 at 11:26:52AM +1300, Danilo Krummrich wrote:
> >> On Sun Nov 23, 2025 at 7:57 AM NZDT, Leon Romanovsky wrote:
> >> > On Sat, Nov 22, 2025 at 12:16:15PM -0400, Jason Gunthorpe wrote:
> >> >> On Sat, Nov 22, 2025 at 11:23:16PM +1300, Danilo Krummrich wrote:
> >> >> > So far I haven't heard a convincing reason for not providing this guarantee. The
> >> >> > only reason not to guarantee this I have heard is that some PF drivers only
> >> >> > enable SR-IOV and hence could be unloaded afterwards. However, I think there is
> >> >> > no strong reason to do so.
> >> >> 
> >> >> I know some people have work flows around this. I think they are
> >> >> wrong. When we "fixed" mlx5 a while back there was some pushback and
> >> >> some weird things did stop working.
> >> >
> >> > I don't think that this is correct. The main use case for keeping VFs,
> >> > while unloading PF driver is to allow reuse this PF for VFIO too. It is
> >> > very natural and common flow for "real" SR-IOV devices which present all
> >> > PF/VFs as truly separate devices.
> >> 
> >> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> >> VFIO"? What do you do with the PF after driver unload instead? Load another
> >> driver? If so, why separate ones?
> >
> > One of the main use cases for SR-IOV is to provide to VM users/customers
> > devices with performance and security promises as physical ones. In this
> > case, the VMs are created through PF and not bound to any driver. Once
> > customer/user requests VM, that VF is bound to vfio-pci driver and
> > attached to that VM.
> >
> > In many cases, PF is unbound too from its original driver and attached
> > to some other VM. It allows for these VM providers to maximize
> > utilization of their SR-IOV devices.
> >
> > At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> > "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> > and is accessible to an SR-PCIM, a VI, or an SI."
> 
> Hm, that's possible, but do we have cases of this in practice where we bind and
> unbind the same PF multiple times, pass it to different VMs, etc.?

It is very common case, when the goal is to maximize hardware utilization.

> 
> In any case, what we can always do is what I propose in [1], i.e. add a bool to
> struct pci_driver that indicates whether the VFs should be unbound when the PF
> is unbound, such that we can uphold the guarantee of VF bound implies PF bound
> at least conditionally.
> 
> [1] https://lore.kernel.org/all/DEFL4TG0WX1C.2GLH4417EPU3V@kernel.org/
> 
> >> >> So while I support this goal, I don't know if enough people will
> >> >> agree..
> >> >
> >> > I don't see that Bjorn is CCed here, so it is unclear to whom Danilo
> >> > talked to get rationale for this new behaviour.
> >> 
> >> Not sure what you mean, Bjorn is CC'd on the whole series and hence also this
> >> conversation.
> >> 
> >> Otherwise, my proposal to guarantee that the PF is bound as long as the VF(s)
> >> are bound stems from the corresponding beneficial lifecycle implications for the
> >> driver model (some more on this below).
> >> 
> >> >> > With this, the above code will be correct and a driver can use the generic
> >> >> > infrastructure to:
> >> >> > 
> >> >> >   1) Call pci::Device<Bound>::physfn() returning a Result<pci::Device<Bound>>
> >> >> >   2) Grab the driver's device private data from the returned Device<Bound>
> >> >> > 
> >> >> > Note that 2) (i.e. accessing the driver's device private data with
> >> >> > Device::drvdata() [1]) ensures that the device is actually bound (drvdata() is
> >> >> > only implemented for Device<Bound>) and that the returned type actually matches
> >> >> > the type of the object that has been stored.
> >> >> 
> >> >> This is what the core helper does, with the twist that it "validates"
> >> >> the PF driver is working right by checking its driver binding..
> >> >> 
> >> >> > I suggest to have a look at [2] for an example with how this turns out with the
> >> >> > auxiliary bus [2][3], since in the end it's the same problem, i.e. an auxiliary
> >> >> > driver calling into its parent, except that the auxiliary bus already guarantees
> >> >> > that the parent is bound when the child is bound.
> >> >> 
> >> >> Aux bus is a little different because it should be used in a way where
> >> >> there are stronger guarantees about what the parent is. Ie the aux bus
> >> >> names should be unique and limited to the type of parent.
> >> >
> >> > Right, aux bus is completely unrelated to real physical PCI bus.
> >> 
> >> Of course the auxiliary and the PCI bus are fundamentally different busses,
> >> *but* they also have commonalities: the driver lifecycle requirements drivers
> >> have to deal with are the same.
> >> 
> >> For instance, if you call from an auxiliary driver into the parent driver to let
> >> the parent driver do some work on behalf, the auxiliary driver has to guarantee
> >> that the parent device is guranteed to remain bound for the duration of the
> >> call, otherwise the parent driver can't call dev_get_drvdata() without risking a
> >> UAF, since a concurrent unbind might free the parent driver's device private
> >> data.
> >> 
> >> The same is true for PCI when a VF driver calls into the PF driver to do some
> >> work on behalf. The VF driver has to make sure that the PF driver remains bound
> >> for the whole duration of the call for the exact same reasons.
> >
> > And this section is not entirely correct. It is one of the possible
> > usages of PF. In many devices, PF and its VFs are independent and don't
> > communicate through kernel at all. There is no need for VF to have PF be
> > bounded to its original driver.
> 
> Of course -- similarly there might be cases where an auxiliary driver might not
> cummunicate with its parent, but that's not the case I'm referring to above.

Not really. Auxbus is a logical in-kernel separation of one physical device.
The promise is that parent auxdevice is tied to its children. It was done
intentionally to make sure that children don't outlive their parent.

It is not the case for SR-IOV.

> 
> >
> >> 
> >> I.e. it is a common driver lifecycle pattern for interactions between drivers in
> >> general.
> >
> > And it is not true as well. In case you need to interact between
> > devices/drivers, it is up to the caller to ensure that this driver isn't
> > unloaded.
> 
> You're mixing two things here. The driver model lifecycle requires that if
> driver A calls into driver B - where B accesses its device private data - that B
> is bound for the full duration of the call.

I'm aware of this, and we are not talking about driver model. Whole
discussion is "if PF can be unbound, while VFs exist". The answer is yes, it can,
both from PCI spec perspective and from operational one.

> 
> This is true for any such interaction, the question of who is responsible to
> uphold this guarantee, e.g. the bus layer or the driver itself is orthogonal.
> 
> Whenever possible, it's best to let the bus layer uphold the guarantee though,
> since it's one pitfall less a driver can fall into.
> 
> >> 
> >> As for Rust specifically, we can actually model such driver lifecycle patterns
> >> with the type system and with that do a lot of the driver lifecycle checks (that
> >> drivers love to ignore :) even at compile time, such that your code doesn't even
> >> compile when you violate the driver lifecycle rules.
> >> 
> >> For the auxiliary bus that's already the case and I'd love to get to this point
> >> with PF <-> VF interactions (in Rust) as well.
> >
> > It is absolutely correct thing to do for auxbus. It is very questionable for SR-IOV.
> 
> At least conditionally (as proposed above), it's an improvement for cases where
> there is PF <-> VF interactions, i.e. why let drivers take care if the bus can
> already do it for them.

I'm not against some opt-in/opt-out solution. I'm against "one size fits
all" proposal, where PF is tied to its original driver as long as VFs
exist.

Thanks

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Jason Gunthorpe 2 months, 2 weeks ago

On Sun, Nov 23, 2025 at 01:18:23PM +0200, Leon Romanovsky wrote:
> > >> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> > >> VFIO"? What do you do with the PF after driver unload instead? Load another
> > >> driver? If so, why separate ones?
> > >
> > > One of the main use cases for SR-IOV is to provide to VM users/customers
> > > devices with performance and security promises as physical ones. In this
> > > case, the VMs are created through PF and not bound to any driver. Once
> > > customer/user requests VM, that VF is bound to vfio-pci driver and
> > > attached to that VM.
> > >
> > > In many cases, PF is unbound too from its original driver and attached
> > > to some other VM. It allows for these VM providers to maximize
> > > utilization of their SR-IOV devices.
> > >
> > > At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> > > "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> > > and is accessible to an SR-PCIM, a VI, or an SI."
> > 
> > Hm, that's possible, but do we have cases of this in practice where we bind and
> > unbind the same PF multiple times, pass it to different VMs, etc.?
> 
> It is very common case, when the goal is to maximize hardware utilization.

It is a sort of common configuration, but VFIO should be driving the
PF directly using its native SRIOV support. There is no need to rebind
a driver while SRIOV is still enabled.

> > You're mixing two things here. The driver model lifecycle requires that if
> > driver A calls into driver B - where B accesses its device private data - that B
> > is bound for the full duration of the call.
> 
> I'm aware of this, and we are not talking about driver model. Whole
> discussion is "if PF can be unbound, while VFs exist". The answer is yes, it can,
> both from PCI spec perspective and from operational one.

This whole discussion highlights my original feeling.. While I think
it makes alot of sense to tie the VF lifecycle to the PF driver
binding universally there are enough contrary opinions.

> > At least conditionally (as proposed above), it's an improvement for cases where
> > there is PF <-> VF interactions, i.e. why let drivers take care if the bus can
> > already do it for them.

Drivers like mlx5 have a sequencing requirement during shutdown, it
wants to see SRIOV turned off before it moves on to other steps. This
is probably somewhat common..

So while it is nice for the bus to guarentee it, it probably also
signals there is a bug somewhere if that code gets used..

Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Leon Romanovsky 2 months, 2 weeks ago

On Mon, Nov 24, 2025 at 09:57:25AM -0400, Jason Gunthorpe wrote:
> On Sun, Nov 23, 2025 at 01:18:23PM +0200, Leon Romanovsky wrote:
> > > >> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> > > >> VFIO"? What do you do with the PF after driver unload instead? Load another
> > > >> driver? If so, why separate ones?
> > > >
> > > > One of the main use cases for SR-IOV is to provide to VM users/customers
> > > > devices with performance and security promises as physical ones. In this
> > > > case, the VMs are created through PF and not bound to any driver. Once
> > > > customer/user requests VM, that VF is bound to vfio-pci driver and
> > > > attached to that VM.
> > > >
> > > > In many cases, PF is unbound too from its original driver and attached
> > > > to some other VM. It allows for these VM providers to maximize
> > > > utilization of their SR-IOV devices.
> > > >
> > > > At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> > > > "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> > > > and is accessible to an SR-PCIM, a VI, or an SI."
> > > 
> > > Hm, that's possible, but do we have cases of this in practice where we bind and
> > > unbind the same PF multiple times, pass it to different VMs, etc.?
> > 
> > It is very common case, when the goal is to maximize hardware utilization.
> 
> It is a sort of common configuration, but VFIO should be driving the
> PF directly using its native SRIOV support. There is no need to rebind
> a driver while SRIOV is still enabled.

It depends on how you created these VFs. If you created them through
native driver, you will need to unbind and bind it to VFIO.

Thanks

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Jason Gunthorpe 2 months, 2 weeks ago

On Mon, Nov 24, 2025 at 04:53:32PM +0200, Leon Romanovsky wrote:
> On Mon, Nov 24, 2025 at 09:57:25AM -0400, Jason Gunthorpe wrote:
> > On Sun, Nov 23, 2025 at 01:18:23PM +0200, Leon Romanovsky wrote:
> > > > >> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> > > > >> VFIO"? What do you do with the PF after driver unload instead? Load another
> > > > >> driver? If so, why separate ones?
> > > > >
> > > > > One of the main use cases for SR-IOV is to provide to VM users/customers
> > > > > devices with performance and security promises as physical ones. In this
> > > > > case, the VMs are created through PF and not bound to any driver. Once
> > > > > customer/user requests VM, that VF is bound to vfio-pci driver and
> > > > > attached to that VM.
> > > > >
> > > > > In many cases, PF is unbound too from its original driver and attached
> > > > > to some other VM. It allows for these VM providers to maximize
> > > > > utilization of their SR-IOV devices.
> > > > >
> > > > > At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> > > > > "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> > > > > and is accessible to an SR-PCIM, a VI, or an SI."
> > > > 
> > > > Hm, that's possible, but do we have cases of this in practice where we bind and
> > > > unbind the same PF multiple times, pass it to different VMs, etc.?
> > > 
> > > It is very common case, when the goal is to maximize hardware utilization.
> > 
> > It is a sort of common configuration, but VFIO should be driving the
> > PF directly using its native SRIOV support. There is no need to rebind
> > a driver while SRIOV is still enabled.
> 
> It depends on how you created these VFs. If you created them through
> native driver, you will need to unbind and bind it to VFIO.

That should not be done or encouraged.

Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by Leon Romanovsky 2 months, 2 weeks ago

On Mon, Nov 24, 2025 at 11:04:50AM -0400, Jason Gunthorpe wrote:
> On Mon, Nov 24, 2025 at 04:53:32PM +0200, Leon Romanovsky wrote:
> > On Mon, Nov 24, 2025 at 09:57:25AM -0400, Jason Gunthorpe wrote:
> > > On Sun, Nov 23, 2025 at 01:18:23PM +0200, Leon Romanovsky wrote:
> > > > > >> That sounds a bit odd to me, what exactly do you mean with "reuse the PF for
> > > > > >> VFIO"? What do you do with the PF after driver unload instead? Load another
> > > > > >> driver? If so, why separate ones?
> > > > > >
> > > > > > One of the main use cases for SR-IOV is to provide to VM users/customers
> > > > > > devices with performance and security promises as physical ones. In this
> > > > > > case, the VMs are created through PF and not bound to any driver. Once
> > > > > > customer/user requests VM, that VF is bound to vfio-pci driver and
> > > > > > attached to that VM.
> > > > > >
> > > > > > In many cases, PF is unbound too from its original driver and attached
> > > > > > to some other VM. It allows for these VM providers to maximize
> > > > > > utilization of their SR-IOV devices.
> > > > > >
> > > > > > At least in PCI spec 6.0.1, it stays clearly that PF can be attached to SI (VM in spec language).
> > > > > > "Physical Function (PF) - A PF is a PCIe Function that supports the SR-IOV Extended Capability
> > > > > > and is accessible to an SR-PCIM, a VI, or an SI."
> > > > > 
> > > > > Hm, that's possible, but do we have cases of this in practice where we bind and
> > > > > unbind the same PF multiple times, pass it to different VMs, etc.?
> > > > 
> > > > It is very common case, when the goal is to maximize hardware utilization.
> > > 
> > > It is a sort of common configuration, but VFIO should be driving the
> > > PF directly using its native SRIOV support. There is no need to rebind
> > > a driver while SRIOV is still enabled.
> > 
> > It depends on how you created these VFs. If you created them through
> > native driver, you will need to unbind and bind it to VFIO.
> 
> That should not be done or encouraged.

Maybe, but this is how it is done now.

Thanks

> 
> Jason

Re: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device

Posted by kernel test robot 2 months, 2 weeks ago

Hi Peter,

kernel test robot noticed the following build errors:

[auto build test ERROR on e4addc7cc2dfcc19f1c8c8e47f3834b22cb21559]

url:    https://github.com/intel-lab-lkp/linux/commits/Peter-Colberg/rust-pci-add-is_virtfn-to-check-for-VFs/20251120-062302
base:   e4addc7cc2dfcc19f1c8c8e47f3834b22cb21559
patch link:    https://lore.kernel.org/r/20251119-rust-pci-sriov-v1-7-883a94599a97%40redhat.com
patch subject: [PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device
config: um-randconfig-001-20251121 (https://download.01.org/0day-ci/archive/20251121/202511211511.hUcSTiSF-lkp@intel.com/config)
compiler: clang version 22.0.0git (https://github.com/llvm/llvm-project 9e9fe08b16ea2c4d9867fb4974edf2a3776d6ece)
rustc: rustc 1.88.0 (6b00bc388 2025-06-23)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20251121/202511211511.hUcSTiSF-lkp@intel.com/reproduce)

If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202511211511.hUcSTiSF-lkp@intel.com/

All errors (new ones prefixed by >>):

>> error[E0609]: no field `__bindgen_anon_1` on type `bindings::pci_dev`
   --> rust/kernel/pci.rs:546:40
   |
   546 |         Ok(unsafe { &*(*self.as_raw()).__bindgen_anon_1.physfn.cast() })
   |                                        ^^^^^^^^^^^^^^^^ unknown field
   |
   = note: available field is: `_address`

-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki

[PATCH 1/8] rust: pci: add is_virtfn(), to check for VFs
[PATCH 2/8] rust: pci: add is_physfn(), to check for PFs
[PATCH 3/8] rust: pci: add {enable,disable}_sriov(), to control SR-IOV capability
[PATCH 4/8] rust: pci: add num_vf(), to return number of VFs
[PATCH 5/8] rust: pci: add vtable attribute to pci::Driver trait
[PATCH 6/8] rust: pci: add bus callback sriov_configure(), to control SR-IOV from sysfs
[PATCH 7/8] rust: pci: add physfn(), to return PF device for VF device
[PATCH 8/8] samples: rust: add SR-IOV driver sample