[PATCH v2 1/3] PCI/sysfs: Fix null pointer dereference during hotplug

Ziming Du posted 3 patches 1 month, 2 weeks ago
There is a newer version of this series
Posted by Ziming Du 1 month, 2 weeks ago
During concurrent VF creation and bus rescan, the resource files for the
same pci_dev may be created twice. The second creation attempt fails, and
the res_attr in pci_dev is released with kfree(), but the pointer is not
set to NULL. This subsequently leads to a NULL pointer dereference when
the device is removed.

When we perform the following operations:
  echo $vfcount > /sys/class/net/"$pfname"/device/sriov_numvfs &
  sleep 0.5
  echo 1 > /sys/bus/pci/rescan
  pci_remove "$pfname"
the system crashes as follows:

  Unable to handle kernel NULL pointer dereference at virtual address 0000000000000000
  Call trace:
   __pi_strlen+0x14/0x150
   kernfs_find_ns+0x54/0x120
   kernfs_remove_by_name_ns+0x58/0xf0
   sysfs_remove_bin_file+0x24/0x38
   pci_remove_resource_files+0x44/0x90
   pci_remove_sysfs_dev_files+0x28/0x40
   pci_stop_bus_device+0xb8/0x118
   pci_stop_and_remove_bus_device+0x20/0x40
   pci_iov_remove_virtfn+0xb8/0x138
   sriov_disable+0xbc/0x190
   pci_disable_sriov+0x30/0x48
   hinic_pci_sriov_disable+0x54/0x138 [hinic]
   hinic_remove+0x140/0x290 [hinic]
   pci_device_remove+0x4c/0xf8
   device_remove+0x54/0x90
   device_release_driver_internal+0x1d4/0x238
   device_release_driver+0x20/0x38
   pci_stop_bus_device+0xa8/0x118
   pci_stop_and_remove_bus_device_locked+0x28/0x50
   remove_store+0x128/0x208

Fix this by setting the pointer to NULL immediately after releasing res_attr.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Signed-off-by: Ziming Du <duziming2@huawei.com>
---
 drivers/pci/pci-sysfs.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
index c2df915ad2d2..7e697b82c5e1 100644
--- a/drivers/pci/pci-sysfs.c
+++ b/drivers/pci/pci-sysfs.c
@@ -1222,12 +1222,14 @@ static void pci_remove_resource_files(struct pci_dev *pdev)
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr[i] = NULL;
 		}
 
 		res_attr = pdev->res_attr_wc[i];
 		if (res_attr) {
 			sysfs_remove_bin_file(&pdev->dev.kobj, res_attr);
 			kfree(res_attr);
+			pdev->res_attr_wc[i] = NULL;
 		}
 	}
 }
-- 
2.43.0
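
The diff above follows the common free-then-clear idiom. As a rough userspace sketch only (not kernel code), the snippet below shows why clearing each slot after releasing it makes a second removal pass harmless. All names here (fake_attr, fake_dev, fake_create, fake_remove_all, NUM_SLOTS) are hypothetical stand-ins for illustration and do not exist in drivers/pci/pci-sysfs.c; malloc()/free() stand in for the kernel allocator.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define NUM_SLOTS 6	/* hypothetical slot count, standing in for the BARs */

/* Hypothetical userspace stand-in for a sysfs binary attribute. */
struct fake_attr {
	char name[32];
};

/* Hypothetical stand-in for the res_attr[] array kept in a device. */
struct fake_dev {
	struct fake_attr *res_attr[NUM_SLOTS];
};

/* Create the attribute for slot i. Returns 0 on success, -1 on failure. */
static int fake_create(struct fake_dev *dev, int i, const char *name)
{
	struct fake_attr *attr;

	assert(i >= 0 && i < NUM_SLOTS);
	if (dev->res_attr[i])
		return -1;	/* already created: the second attempt fails */

	attr = malloc(sizeof(*attr));
	if (!attr)
		return -1;

	snprintf(attr->name, sizeof(attr->name), "%s", name);
	dev->res_attr[i] = attr;
	return 0;
}

/* Release every attribute and clear its slot. Returns how many were freed. */
static int fake_remove_all(struct fake_dev *dev)
{
	int released = 0;

	for (int i = 0; i < NUM_SLOTS; i++) {
		struct fake_attr *attr = dev->res_attr[i];

		if (attr) {
			free(attr);
			/* Clear the slot so a later pass skips it. */
			dev->res_attr[i] = NULL;
			released++;
		}
	}
	return released;
}
```

With the slot cleared, calling fake_remove_all() a second time finds nothing to release. Without the assignment to NULL, the second pass would read and free a stale pointer.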
Re: [PATCH v2 1/3] PCI/sysfs: Fix null pointer dereference during hotplug
Posted by Bjorn Helgaas 1 month, 1 week ago
On Wed, Dec 24, 2025 at 05:27:17PM +0800, Ziming Du wrote:
> During concurrent VF creation and bus rescan, the resource files for the
> same pci_dev may be created twice. The second creation attempt fails, and
> the res_attr in pci_dev is released with kfree(), but the pointer is not
> set to NULL. This subsequently leads to a NULL pointer dereference when
> the device is removed.
> 
> When we perform the following operations:
>   echo $vfcount > /sys/class/net/"$pfname"/device/sriov_numvfs &

Is the value of $vfcount relevant here?  Can you use the actual values
here instead of the variables so this is more useful to others?

>   sleep 0.5
>   echo 1 > /sys/bus/pci/rescan
>   pci_remove "$pfname"
> the system crashes as follows:
Re: [PATCH v2 1/3] PCI/sysfs: Fix null pointer dereference during hotplug
Posted by duziming 1 month, 1 week ago
On 2025/12/30 1:31, Bjorn Helgaas wrote:
> On Wed, Dec 24, 2025 at 05:27:17PM +0800, Ziming Du wrote:
>> During concurrent VF creation and bus rescan, the resource files for the
>> same pci_dev may be created twice. The second creation attempt fails, and
>> the res_attr in pci_dev is released with kfree(), but the pointer is not
>> set to NULL. This subsequently leads to a NULL pointer dereference when
>> the device is removed.
>>
>> When we perform the following operations:
>>    echo $vfcount > /sys/class/net/"$pfname"/device/sriov_numvfs &
> Is the value of $vfcount relevant here?  Can you use the actual values
> here instead of the variables so this is more useful to others?

In fact, we use the value of sriov_totalvfs directly here. In my opinion,
the larger this value is, the more likely the issue is to occur.

>>    sleep 0.5
>>    echo 1 > /sys/bus/pci/rescan
>>    pci_remove "$pfname"
>> the system crashes as follows: