On s390 systems, which use a machine level hypervisor, PCI devices are
always accessed through a form of PCI pass-through which fundamentally
operates on a per PCI function granularity. This is also reflected in the
s390 PCI hotplug driver which creates hotplug slots for individual PCI
functions. Its reset_slot() function, which is a wrapper for
zpci_hot_reset_device(), thus also resets individual functions.

Currently, slots are matched to devices via the PCI_SLOT() macro, so all
functions of a multifunction device are assigned the same pci_slot
object. This approach worked fine on s390 systems that
only exposed virtual functions as individual PCI domains to the operating
system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
s390 supports exposing the topology of multifunction PCI devices by
grouping them in a shared PCI domain. When attempting to reset a function
through the hotplug driver, the shared slot assignment causes the wrong
function to be reset instead of the intended one. It also leaks memory as
we do create a pci_slot object for the function, but don't correctly free
it in pci_slot_release().

Add a flag for struct pci_slot to allow per function PCI slots for
functions managed through a hypervisor, which exposes individual PCI
functions while retaining the topology.

Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
Cc: stable@vger.kernel.org
Suggested-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
---
 drivers/pci/pci.c   |  5 +++--
 drivers/pci/slot.c  | 25 ++++++++++++++++++++++---
 include/linux/pci.h |  1 +
 3 files changed, 26 insertions(+), 5 deletions(-)
diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index b14dd064006c..36ee38e0d817 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -4980,8 +4980,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe)

 static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
 {
-	if (dev->multifunction || dev->subordinate || !dev->slot ||
-	    dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
+	if (dev->subordinate || !dev->slot ||
+	    dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
+	    (dev->multifunction && !dev->slot->per_func_slot))
 		return -ENOTTY;

 	return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
index 50fb3eb595fe..ed10fa3ae727 100644
--- a/drivers/pci/slot.c
+++ b/drivers/pci/slot.c
@@ -63,6 +63,22 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf)
 	return bus_speed_read(slot->bus->cur_bus_speed, buf);
 }

+static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
+{
+	if (slot->per_func_slot)
+		return dev->devfn == slot->number;
+
+	return PCI_SLOT(dev->devfn) == slot->number;
+}
+
+static bool pci_slot_enabled_per_func(void)
+{
+	if (IS_ENABLED(CONFIG_S390))
+		return true;
+
+	return false;
+}
+
 static void pci_slot_release(struct kobject *kobj)
 {
 	struct pci_dev *dev;
@@ -73,7 +89,7 @@ static void pci_slot_release(struct kobject *kobj)
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &slot->bus->devices, bus_list)
-		if (PCI_SLOT(dev->devfn) == slot->number)
+		if (pci_dev_matches_slot(dev, slot))
 			dev->slot = NULL;
 	up_read(&pci_bus_sem);
@@ -166,7 +182,7 @@ void pci_dev_assign_slot(struct pci_dev *dev)
 	mutex_lock(&pci_slot_mutex);
 	list_for_each_entry(slot, &dev->bus->slots, list)
-		if (PCI_SLOT(dev->devfn) == slot->number)
+		if (pci_dev_matches_slot(dev, slot))
 			dev->slot = slot;
 	mutex_unlock(&pci_slot_mutex);
 }
@@ -265,6 +281,9 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
 	slot->bus = pci_bus_get(parent);
 	slot->number = slot_nr;
+	if (pci_slot_enabled_per_func())
+		slot->per_func_slot = 1;
+
 	slot->kobj.kset = pci_slots_kset;
 	slot_name = make_slot_name(name);
@@ -285,7 +304,7 @@ struct pci_slot *pci_create_slot(struct pci_bus *parent, int slot_nr,
 	down_read(&pci_bus_sem);
 	list_for_each_entry(dev, &parent->devices, bus_list)
-		if (PCI_SLOT(dev->devfn) == slot_nr)
+		if (pci_dev_matches_slot(dev, slot))
 			dev->slot = slot;
 	up_read(&pci_bus_sem);
diff --git a/include/linux/pci.h b/include/linux/pci.h
index d1fdf81fbe1e..6ad194597ab5 100644
--- a/include/linux/pci.h
+++ b/include/linux/pci.h
@@ -78,6 +78,7 @@ struct pci_slot {
 	struct list_head list; /* Node in list of slots */
 	struct hotplug_slot *hotplug; /* Hotplug info (move here) */
 	unsigned char number; /* PCI_SLOT(pci_dev->devfn) */
+	unsigned int per_func_slot:1; /* Allow per function slot */
 	struct kobject kobj;
 };
--
2.43.0
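
For readers following the thread, the matching change in drivers/pci/slot.c can be modeled in user space. This is only a minimal sketch, not kernel code: struct model_slot, struct model_dev, model_dev_matches_slot() and model_demo() are hypothetical stand-ins, and the sketch assumes the hotplug driver registers each slot with the function's full devfn as the slot number. The PCI_DEVFN() and PCI_SLOT() definitions mirror the kernel's encoding (slot in bits 7..3, function in bits 2..0).

```c
#include <assert.h>
#include <stdbool.h>

/* devfn encoding as in the kernel: bits 7..3 = slot, bits 2..0 = function */
#define PCI_DEVFN(slot, func)	((((slot) & 0x1f) << 3) | ((func) & 0x07))
#define PCI_SLOT(devfn)		(((devfn) >> 3) & 0x1f)

/* Hypothetical, stripped-down stand-in for struct pci_slot */
struct model_slot {
	unsigned char number;		/* slot number, or devfn if per_func_slot */
	unsigned int per_func_slot:1;	/* flag proposed by the patch */
};

/* Hypothetical, stripped-down stand-in for struct pci_dev */
struct model_dev {
	unsigned int devfn;
};

/* Same decision as pci_dev_matches_slot() in the patch */
static bool model_dev_matches_slot(const struct model_dev *dev,
				   const struct model_slot *slot)
{
	if (slot->per_func_slot)
		return dev->devfn == slot->number;

	return PCI_SLOT(dev->devfn) == slot->number;
}

/* Function 1 of the device in slot 0, with a slot registered as number 1 */
static void model_demo(void)
{
	struct model_dev fn1 = { .devfn = PCI_DEVFN(0, 1) };
	struct model_slot legacy = { .number = 1, .per_func_slot = 0 };
	struct model_slot per_func = { .number = 1, .per_func_slot = 1 };

	/* Legacy matching compares PCI_SLOT(1) == 0 against 1: no match */
	assert(!model_dev_matches_slot(&fn1, &legacy));
	/* Per function matching compares devfn 1 against 1: match */
	assert(model_dev_matches_slot(&fn1, &per_func));
}
```

Under the same assumption, legacy matching would instead pair slot number 1 with devfn 0x08 (slot 1, function 0), which is one way a slot can end up attached to an unintended function. Calling model_demo() from a small main() exercises both cases.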
On Wed, 2025-10-22 at 14:24 -0700, Farhan Ali wrote:
> On s390 systems, which use a machine level hypervisor, PCI devices are
> always accessed through a form of PCI pass-through which fundamentally
> operates on a per PCI function granularity. This is also reflected in the
> s390 PCI hotplug driver which creates hotplug slots for individual PCI
> functions. Its reset_slot() function, which is a wrapper for
> zpci_hot_reset_device(), thus also resets individual functions.
>
> Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
> to multifunction devices. This approach worked fine on s390 systems that
> only exposed virtual functions as individual PCI domains to the operating
> system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
> s390 supports exposing the topology of multifunction PCI devices by
> grouping them in a shared PCI domain. When attempting to reset a function
> through the hotplug driver, the shared slot assignment causes the wrong
> function to be reset instead of the intended one. It also leaks memory as
> we do create a pci_slot object for the function, but don't correctly free
> it in pci_slot_release().
>
> Add a flag for struct pci_slot to allow per function PCI slots for
> functions managed through a hypervisor, which exposes individual PCI
> functions while retaining the topology.
I wonder if LoongArch which now also does per PCI function pass-through
might need this too. Adding their KVM maintainers.
>
> Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
> Cc: stable@vger.kernel.org
> Suggested-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
> ---
> drivers/pci/pci.c | 5 +++--
> drivers/pci/slot.c | 25 ++++++++++++++++++++++---
> include/linux/pci.h | 1 +
> 3 files changed, 26 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index b14dd064006c..36ee38e0d817 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -4980,8 +4980,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe)
>
> static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
> {
> - if (dev->multifunction || dev->subordinate || !dev->slot ||
> - dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
> + if (dev->subordinate || !dev->slot ||
> + dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
> + (dev->multifunction && !dev->slot->per_func_slot))
> return -ENOTTY;
>
> return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
> index 50fb3eb595fe..ed10fa3ae727 100644
> --- a/drivers/pci/slot.c
> +++ b/drivers/pci/slot.c
> @@ -63,6 +63,22 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf)
> return bus_speed_read(slot->bus->cur_bus_speed, buf);
> }
>
> +static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
> +{
> + if (slot->per_func_slot)
> + return dev->devfn == slot->number;
> +
> + return PCI_SLOT(dev->devfn) == slot->number;
> +}
> +
> +static bool pci_slot_enabled_per_func(void)
> +{
> + if (IS_ENABLED(CONFIG_S390))
> + return true;
> +
> + return false;
> +}
> +
--- snip ---
>
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index d1fdf81fbe1e..6ad194597ab5 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -78,6 +78,7 @@ struct pci_slot {
> struct list_head list; /* Node in list of slots */
> struct hotplug_slot *hotplug; /* Hotplug info (move here) */
> unsigned char number; /* PCI_SLOT(pci_dev->devfn) */
> + unsigned int per_func_slot:1; /* Allow per function slot */
> struct kobject kobj;
> };
>
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
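
As an illustration of the pci.c hunk quoted above, the eligibility check in pci_dev_reset_slot_function() before and after the patch can be compared in a small user-space model. This is a sketch under stated assumptions: struct gate_slot, struct gate_dev, gate_old() and gate_new() are hypothetical names, the boolean fields stand in for dev->subordinate and the PCI_DEV_FLAGS_NO_BUS_RESET flag, and a return value of 0 stands in for "proceed to pci_reset_hotplug_slot()".

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_slot */
struct gate_slot {
	unsigned int per_func_slot:1;
};

/* Hypothetical stand-in for the struct pci_dev fields the check reads */
struct gate_dev {
	bool multifunction;
	bool has_subordinate;	/* models dev->subordinate != NULL */
	bool no_bus_reset;	/* models PCI_DEV_FLAGS_NO_BUS_RESET being set */
	const struct gate_slot *slot;
};

/* Check before the patch: any multifunction device is refused */
static int gate_old(const struct gate_dev *dev)
{
	if (dev->multifunction || dev->has_subordinate || !dev->slot ||
	    dev->no_bus_reset)
		return -ENOTTY;

	return 0;
}

/*
 * Check after the patch: a multifunction device is refused only if its
 * slot is shared. The !dev->slot test runs first, so the slot pointer is
 * never dereferenced when it is NULL.
 */
static int gate_new(const struct gate_dev *dev)
{
	if (dev->has_subordinate || !dev->slot || dev->no_bus_reset ||
	    (dev->multifunction && !dev->slot->per_func_slot))
		return -ENOTTY;

	return 0;
}

/* A multifunction device with a per function slot is now eligible */
static void gate_demo(void)
{
	struct gate_slot per_func = { .per_func_slot = 1 };
	struct gate_dev dev = { .multifunction = true, .slot = &per_func };

	assert(gate_old(&dev) == -ENOTTY);
	assert(gate_new(&dev) == 0);
}
```

In this model, devices that are not multifunction get the same answer from both versions; only the multifunction case with a per function slot changes.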
On 2025/10/27 8:29 PM, Niklas Schnelle wrote:
> On Wed, 2025-10-22 at 14:24 -0700, Farhan Ali wrote:
>> On s390 systems, which use a machine level hypervisor, PCI devices are
>> always accessed through a form of PCI pass-through which fundamentally
>> operates on a per PCI function granularity. This is also reflected in the
>> s390 PCI hotplug driver which creates hotplug slots for individual PCI
>> functions. Its reset_slot() function, which is a wrapper for
>> zpci_hot_reset_device(), thus also resets individual functions.
>>
>> Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
>> to multifunction devices. This approach worked fine on s390 systems that
>> only exposed virtual functions as individual PCI domains to the operating
>> system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
>> s390 supports exposing the topology of multifunction PCI devices by
>> grouping them in a shared PCI domain. When attempting to reset a function
>> through the hotplug driver, the shared slot assignment causes the wrong
>> function to be reset instead of the intended one. It also leaks memory as
>> we do create a pci_slot object for the function, but don't correctly free
>> it in pci_slot_release().
>>
>> Add a flag for struct pci_slot to allow per function PCI slots for
>> functions managed through a hypervisor, which exposes individual PCI
>> functions while retaining the topology.
>
> I wonder if LoongArch which now also does per PCI function pass-through
> might need this too. Adding their KVM maintainers.
Hi Niklas,

Thanks for the reminder. Yes, LoongArch does per PCI function
pass-through. In theory, pci_slot_enabled_per_func() should also return
true on LoongArch. However, since the IOMMU driver is not merged yet,
there is no way to test this. We will make a note of this issue and
verify it once the IOMMU driver is merged.

Regards
Bibo Mao
>
>>
>> Fixes: 44510d6fa0c0 ("s390/pci: Handling multifunctions")
>> Cc: stable@vger.kernel.org
>> Suggested-by: Niklas Schnelle <schnelle@linux.ibm.com>
>> Signed-off-by: Farhan Ali <alifm@linux.ibm.com>
>> ---
>> drivers/pci/pci.c | 5 +++--
>> drivers/pci/slot.c | 25 ++++++++++++++++++++++---
>> include/linux/pci.h | 1 +
>> 3 files changed, 26 insertions(+), 5 deletions(-)
>>
>> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
>> index b14dd064006c..36ee38e0d817 100644
>> --- a/drivers/pci/pci.c
>> +++ b/drivers/pci/pci.c
>> @@ -4980,8 +4980,9 @@ static int pci_reset_hotplug_slot(struct hotplug_slot *hotplug, bool probe)
>>
>> static int pci_dev_reset_slot_function(struct pci_dev *dev, bool probe)
>> {
>> - if (dev->multifunction || dev->subordinate || !dev->slot ||
>> - dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET)
>> + if (dev->subordinate || !dev->slot ||
>> + dev->dev_flags & PCI_DEV_FLAGS_NO_BUS_RESET ||
>> + (dev->multifunction && !dev->slot->per_func_slot))
>> return -ENOTTY;
>>
>> return pci_reset_hotplug_slot(dev->slot->hotplug, probe);
>> diff --git a/drivers/pci/slot.c b/drivers/pci/slot.c
>> index 50fb3eb595fe..ed10fa3ae727 100644
>> --- a/drivers/pci/slot.c
>> +++ b/drivers/pci/slot.c
>> @@ -63,6 +63,22 @@ static ssize_t cur_speed_read_file(struct pci_slot *slot, char *buf)
>> return bus_speed_read(slot->bus->cur_bus_speed, buf);
>> }
>>
>> +static bool pci_dev_matches_slot(struct pci_dev *dev, struct pci_slot *slot)
>> +{
>> + if (slot->per_func_slot)
>> + return dev->devfn == slot->number;
>> +
>> + return PCI_SLOT(dev->devfn) == slot->number;
>> +}
>> +
>> +static bool pci_slot_enabled_per_func(void)
>> +{
>> + if (IS_ENABLED(CONFIG_S390))
>> + return true;
>> +
>> + return false;
>> +}
>> +
> --- snip ---
>>
>> diff --git a/include/linux/pci.h b/include/linux/pci.h
>> index d1fdf81fbe1e..6ad194597ab5 100644
>> --- a/include/linux/pci.h
>> +++ b/include/linux/pci.h
>> @@ -78,6 +78,7 @@ struct pci_slot {
>> struct list_head list; /* Node in list of slots */
>> struct hotplug_slot *hotplug; /* Hotplug info (move here) */
>> unsigned char number; /* PCI_SLOT(pci_dev->devfn) */
>> + unsigned int per_func_slot:1; /* Allow per function slot */
>> struct kobject kobj;
>> };
>>
>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
>
On Tue, 2025-10-28 at 09:11 +0800, Bibo Mao wrote:
>
> On 2025/10/27 8:29 PM, Niklas Schnelle wrote:
> > On Wed, 2025-10-22 at 14:24 -0700, Farhan Ali wrote:
> > > On s390 systems, which use a machine level hypervisor, PCI devices are
> > > always accessed through a form of PCI pass-through which fundamentally
> > > operates on a per PCI function granularity. This is also reflected in the
> > > s390 PCI hotplug driver which creates hotplug slots for individual PCI
> > > functions. Its reset_slot() function, which is a wrapper for
> > > zpci_hot_reset_device(), thus also resets individual functions.
> > >
> > > Currently, the kernel's PCI_SLOT() macro assigns the same pci_slot object
> > > to multifunction devices. This approach worked fine on s390 systems that
> > > only exposed virtual functions as individual PCI domains to the operating
> > > system. Since commit 44510d6fa0c0 ("s390/pci: Handling multifunctions")
> > > s390 supports exposing the topology of multifunction PCI devices by
> > > grouping them in a shared PCI domain. When attempting to reset a function
> > > through the hotplug driver, the shared slot assignment causes the wrong
> > > function to be reset instead of the intended one. It also leaks memory as
> > > we do create a pci_slot object for the function, but don't correctly free
> > > it in pci_slot_release().
> > >
> > > Add a flag for struct pci_slot to allow per function PCI slots for
> > > functions managed through a hypervisor, which exposes individual PCI
> > > functions while retaining the topology.
> >
> > I wonder if LoongArch which now also does per PCI function pass-through
> > might need this too. Adding their KVM maintainers.
>
> Hi Niklas,
>
> Thanks for the reminder. Yes, LoongArch does per PCI function
> pass-through. In theory, pci_slot_enabled_per_func() should also return
> true on LoongArch. However, since the IOMMU driver is not merged yet,
> there is no way to test this. We will make a note of this issue and
> verify it once the IOMMU driver is merged.
>
>
> Regards
> Bibo Mao
>
Hi Bibo,

Does the missing IOMMU driver also prevent you from testing my other
patch touching pci_scan_slot()? It sounded like Huacai was looking into
testing that.

Thanks,
Niklas