From: Leon Romanovsky <leonro@nvidia.com>
Make sure that all VFIO PCI devices have peer-to-peer capabilities
enabled, so we would be able to export their MMIO memory through DMABUF.
Signed-off-by: Leon Romanovsky <leonro@nvidia.com>
---
drivers/vfio/pci/vfio_pci_core.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/drivers/vfio/pci/vfio_pci_core.c b/drivers/vfio/pci/vfio_pci_core.c
index 7dcf5439dedc..608af135308e 100644
--- a/drivers/vfio/pci/vfio_pci_core.c
+++ b/drivers/vfio/pci/vfio_pci_core.c
@@ -28,6 +28,9 @@
 #include <linux/nospec.h>
 #include <linux/sched/mm.h>
 #include <linux/iommufd.h>
+#ifdef CONFIG_VFIO_PCI_DMABUF
+#include <linux/pci-p2pdma.h>
+#endif
 #if IS_ENABLED(CONFIG_EEH)
 #include <asm/eeh.h>
 #endif
@@ -2085,6 +2088,7 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 {
 	struct vfio_pci_core_device *vdev =
 		container_of(core_vdev, struct vfio_pci_core_device, vdev);
+	int __maybe_unused ret;
 
 	vdev->pdev = to_pci_dev(core_vdev->dev);
 	vdev->irq_type = VFIO_PCI_NUM_IRQS;
@@ -2094,6 +2098,11 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
 	INIT_LIST_HEAD(&vdev->dummy_resources_list);
 	INIT_LIST_HEAD(&vdev->ioeventfds_list);
 	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
+#ifdef CONFIG_VFIO_PCI_DMABUF
+	ret = pcim_p2pdma_init(vdev->pdev);
+	if (ret)
+		return ret;
+#endif
 	init_rwsem(&vdev->memory_lock);
 	xa_init(&vdev->ctx);
--
2.51.0

On Sun, 28 Sep 2025 17:50:18 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> [...]
> @@ -2094,6 +2098,11 @@ int vfio_pci_core_init_dev(struct vfio_device *core_vdev)
>  	INIT_LIST_HEAD(&vdev->dummy_resources_list);
>  	INIT_LIST_HEAD(&vdev->ioeventfds_list);
>  	INIT_LIST_HEAD(&vdev->sriov_pfs_item);
> +#ifdef CONFIG_VFIO_PCI_DMABUF
> +	ret = pcim_p2pdma_init(vdev->pdev);
> +	if (ret)
> +		return ret;
> +#endif
>  	init_rwsem(&vdev->memory_lock);
>  	xa_init(&vdev->ctx);

What breaks if we don't test the return value and remove all the
#ifdefs?  The feature call should fail if we don't have a provider but
that seems more robust than failing to register the device.  Thanks,

Alex

On Mon, Sep 29, 2025 at 03:17:45PM -0600, Alex Williamson wrote:
> [...]
> What breaks if we don't test the return value and remove all the
> #ifdefs?  The feature call should fail if we don't have a provider but
> that seems more robust than failing to register the device.  Thanks,

pcim_p2pdma_init() fails if memory allocation fails, which is worth checking.
Such a failure will most likely leave the vfio-pci module non-working anyway,
as a failure in pcim_p2pdma_init() will trigger OOM. It is better to fail early
and help the system recover from OOM than to defer to the next failure
while trying to load vfio-pci.

CONFIG_VFIO_PCI_DMABUF is mostly for the next line, "INIT_LIST_HEAD(&vdev->dmabufs);",
from the following patch. Because pcim_p2pdma_init() and the dmabufs list are
coupled, I put CONFIG_VFIO_PCI_DMABUF on both of them.

Thanks
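
The two policies discussed so far (propagate the pcim_p2pdma_init() error
versus ignore it and let the later feature call fail) can be modelled in a
small userspace sketch. Everything here is a hypothetical stand-in: struct
sketch_dev, sketch_p2p_init(), init_fail_early(), init_lenient() and
feature_export() are not kernel APIs, and the -ENODEV returned by the feature
call is an assumption made only for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device; not the real vfio_pci_core_device. */
struct sketch_dev {
	bool p2p_ready;		/* provider setup succeeded */
	bool registered;	/* device init completed */
};

/* Stand-in for pcim_p2pdma_init(); simulate_oom forces an -ENOMEM result. */
static int sketch_p2p_init(struct sketch_dev *dev, bool simulate_oom)
{
	if (simulate_oom)
		return -ENOMEM;
	dev->p2p_ready = true;
	return 0;
}

/* Policy in the patch: propagate the error, so device init fails. */
static int init_fail_early(struct sketch_dev *dev, bool simulate_oom)
{
	int ret;

	assert(dev != NULL);
	ret = sketch_p2p_init(dev, simulate_oom);
	if (ret)
		return ret;
	dev->registered = true;
	return 0;
}

/* Policy raised in review: ignore the result, the device still registers. */
static int init_lenient(struct sketch_dev *dev, bool simulate_oom)
{
	assert(dev != NULL);
	(void)sketch_p2p_init(dev, simulate_oom);
	dev->registered = true;
	return 0;
}

/* Feature call: fails on its own when no provider was set up (assumed errno). */
static int feature_export(const struct sketch_dev *dev)
{
	return dev->p2p_ready ? 0 : -ENODEV;
}
```

With a simulated allocation failure, init_fail_early() leaves the device
unregistered, while init_lenient() registers it and only feature_export()
reports the problem.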

On Tue, 30 Sep 2025 10:30:53 +0300
Leon Romanovsky <leon@kernel.org> wrote:

> On Mon, Sep 29, 2025 at 03:17:45PM -0600, Alex Williamson wrote:
> > [...]
> > What breaks if we don't test the return value and remove all the
> > #ifdefs?  The feature call should fail if we don't have a provider but
> > that seems more robust than failing to register the device.  Thanks,
>
> pcim_p2pdma_init() fails if memory allocation fails, which is worth to check.
> Such failure will most likely cause to non-working vfio-pci module anyway,
> as failure in pcim_p2pdma_init() will trigger OOM. It is better to fail early
> and help for the system to recover from OOM, instead of delaying to the
> next failure while trying to load vfio-pci.
>
> CONFIG_VFIO_PCI_DMABUF is mostly for next line "INIT_LIST_HEAD(&vdev->dmabufs);"
> from the following patch. Because that pcim_p2pdma_init() and dmabufs list are
> coupled, I put CONFIG_VFIO_PCI_DMABUF on both of them.

Maybe it would remove my hang-up on the #ifdefs if we were to
unconditionally include the header and move everything below that into
an 'if (IS_ENABLED(CONFIG_VFIO_PCI_DMABUF)) {}' block.  I think that would
be statically evaluated by the compiler so we can still conditionalize
the list_head in the vfio_pci_core_device struct via #ifdef, though I'm
not super concerned about that since I'm expecting this will eventually
be necessary for p2p DMA with IOMMUFD.

That's also my basis for questioning why we think this needs a
user-visible kconfig option.  I don't see a lot of value in enabling
P2PDMA, DMABUF, and VFIO_PCI, but not VFIO_PCI_DMABUF.  Thanks,

Alex
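
The IS_ENABLED() form suggested above can be tried out in userspace with a
simplified copy of the kernel's macro trick. This is only a sketch under
stated assumptions: the IS_ENABLED() below handles the built-in (=y) case
only, and CONFIG_SKETCH_DMABUF, struct cfg_dev, sketch_p2p_init() and
sketch_init_dev() are hypothetical names, not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified userspace copy of the IS_ENABLED() trick (=y case only). */
#define ARG_PLACEHOLDER_1 0,
#define TAKE_SECOND_ARG(ignored, val, ...) val
#define IS_DEFINED_3(arg1_or_junk) TAKE_SECOND_ARG(arg1_or_junk 1, 0, 0)
#define IS_DEFINED_2(val) IS_DEFINED_3(ARG_PLACEHOLDER_##val)
#define IS_DEFINED_1(x) IS_DEFINED_2(x)
#define IS_ENABLED(option) IS_DEFINED_1(option)

/* Hypothetical option; remove this line to model the =n configuration. */
#define CONFIG_SKETCH_DMABUF 1

/* Hypothetical device; counts how often the provider setup ran. */
struct cfg_dev {
	int p2p_init_calls;
};

/* Stand-in for pcim_p2pdma_init(); declared in every configuration. */
static int sketch_p2p_init(struct cfg_dev *dev)
{
	dev->p2p_init_calls++;
	return 0;
}

static int sketch_init_dev(struct cfg_dev *dev)
{
	int ret;

	assert(dev != NULL);

	/* Constant condition: the body is compiled, then dropped when it is 0. */
	if (IS_ENABLED(CONFIG_SKETCH_DMABUF)) {
		ret = sketch_p2p_init(dev);
		if (ret)
			return ret;
	}
	return 0;
}
```

One caveat with this form: the body of the if block is still parsed and
type-checked when the option is off, so any function or struct member it
references has to be declared in both configurations; only the generated
code is eliminated.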