[RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer

Naman Jain posted 1 patch 10 months, 1 week ago
There is a newer version of this series
drivers/hv/vmbus_drv.c       |  7 +++++++
drivers/uio/uio_hv_generic.c | 33 ++++++++++++++++++++++++++++-----
include/linux/hyperv.h       |  4 ++++
3 files changed, 39 insertions(+), 5 deletions(-)
[RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Naman Jain 10 months, 1 week ago
On a regular boot, devices are registered with vmbus first, so by the
time the uio_hv_generic driver is probed for a particular device type,
the device is already initialized and added, and sysfs creation in the
uio_hv_generic probe works fine. When a device is removed and brought
back, however, the channel is rescinded and then registered with vmbus
again. This time the uio_hv_generic driver is already registered to
probe for that device, so sysfs creation is attempted before the device
is completely initialized. Fix this by deferring sysfs creation until
the device is completely initialized.

Problem path:
vmbus_device_register
    device_register
        uio_hv_generic probe
            sysfs_create_bin_file (fails here)
    kset_create_and_add (dependency)
    vmbus_add_channel_kobj (dependency)

Fixes: 9ab877a6ccf8 ("uio_hv_generic: make ring buffer attribute for primary channel")
Cc: stable@kernel.org
Co-developed-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Saurabh Sengar <ssengar@linux.microsoft.com>
Signed-off-by: Naman Jain <namjain@linux.microsoft.com>
---

The DPDK use case depends on this sysfs node, so to maintain backward compatibility and not
break DPDK, we could not remove the sysfs logic from the uio_hv_generic probe.
https://github.com/DPDK/dpdk/blob/main/drivers/bus/vmbus/linux/vmbus_uio.c#L360

I tried reordering functions in vmbus_device_register and also looked for alternate functions
for device initialization, but could not find any other viable solution.
I explored the use of ATTRIBUTE_GROUPS and DEVICE_ATTR_RW, but with those I could create a sysfs
node only for a particular device, not for a channel of that device, which is what we need.

From previous discussions, I could see that sysfs creation in driver probe is not encouraged,
and we are now adding more complexity to it, so I am not sure this is the best way to
solve the problem. Hence I am sharing this as an RFC to gather review comments.
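
For readers following the ordering issue, here is a minimal userspace sketch of
the problem path from the commit message. It is not kernel code: struct
model_kobj and the model_* functions are hypothetical stand-ins, and the -EINVAL
return only mirrors the -22 seen in the error log below, on the assumption that
the failure comes from the channel kobject not having been added yet.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the channel kobject (not struct kobject). */
struct model_kobj {
	bool added;	/* true once the "channel kobject" exists */
	bool has_ring;	/* true once the "ring" file was created under it */
};

/* Creating a file under a kobject that was not added yet fails with -EINVAL. */
static int model_create_bin_file(struct model_kobj *kobj)
{
	if (!kobj->added)
		return -EINVAL;
	kobj->has_ring = true;
	return 0;
}

/* Problem path: "probe" runs before the channel kobject is added. */
static int model_register_probe_first(struct model_kobj *kobj)
{
	int ret = model_create_bin_file(kobj);	/* uio_hv_generic probe */

	kobj->added = true;			/* vmbus_add_channel_kobj */
	return ret;
}

/* Deferred path: the file is created only after the kobject is added. */
static int model_register_deferred(struct model_kobj *kobj)
{
	kobj->added = true;			/* vmbus_add_channel_kobj */
	return model_create_bin_file(kobj);	/* deferred sysfs creation */
}
```

Calling model_register_probe_first() on a zeroed struct model_kobj returns
-EINVAL and leaves has_ring false, while model_register_deferred() returns 0.
The patch below aims for the second ordering by moving the file creation into
a work item that waits for registration to finish.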

Error logs:

[   35.574120] ------------[ cut here ]------------
[   35.574122] WARNING: CPU: 0 PID: 10 at fs/sysfs/file.c:591 sysfs_create_bin_file+0x81/0x90
[   35.574168] Workqueue: hv_pri_chan vmbus_add_channel_work
[   35.574172] RIP: 0010:sysfs_create_bin_file+0x81/0x90
[   35.574197] Call Trace:
[   35.574199]  <TASK>
[   35.574200]  ? show_regs+0x69/0x80
[   35.574217]  ? __warn+0x8d/0x130
[   35.574220]  ? sysfs_create_bin_file+0x81/0x90
[   35.574222]  ? report_bug+0x182/0x190
[   35.574225]  ? handle_bug+0x5b/0x90
[   35.574244]  ? exc_invalid_op+0x19/0x70
[   35.574247]  ? asm_exc_invalid_op+0x1b/0x20
[   35.574252]  ? sysfs_create_bin_file+0x81/0x90
[   35.574255]  hv_uio_probe+0x1e7/0x410 [uio_hv_generic]
[   35.574271]  vmbus_probe+0x3b/0x90
[   35.574275]  really_probe+0xf4/0x3b0
[   35.574279]  __driver_probe_device+0x8a/0x170
[   35.574282]  driver_probe_device+0x23/0xc0
[   35.574285]  __device_attach_driver+0xb5/0x140
[   35.574288]  ? __pfx___device_attach_driver+0x10/0x10
[   35.574291]  bus_for_each_drv+0x86/0xe0
[   35.574294]  __device_attach+0xc1/0x200
[   35.574297]  device_initial_probe+0x13/0x20
[   35.574315]  bus_probe_device+0x99/0xa0
[   35.574318]  device_add+0x647/0x870
[   35.574320]  ? hrtimer_init+0x28/0x70
[   35.574323]  device_register+0x1b/0x30
[   35.574326]  vmbus_device_register+0x83/0x130
[   35.574328]  vmbus_add_channel_work+0x135/0x1a0
[   35.574331]  process_one_work+0x177/0x340
[   35.574348]  worker_thread+0x2b2/0x3c0
[   35.574350]  kthread+0xe3/0x1f0
[   35.574353]  ? __pfx_worker_thread+0x10/0x10
[   35.574356]  ? __pfx_kthread+0x10/0x10
[   35.574358]  ret_from_fork+0x39/0x60
[   35.574362]  ? __pfx_kthread+0x10/0x10
[   35.574364]  ret_from_fork_asm+0x1a/0x30
[   35.574368]  </TASK>
[   35.574385] ---[ end trace 0000000000000000 ]---
[   35.574388] uio_hv_generic eb765408-105f-49b6-b4aa-c123b64d17d4: sysfs create ring bin file failed; -22

Thanks.
---
 drivers/hv/vmbus_drv.c       |  7 +++++++
 drivers/uio/uio_hv_generic.c | 33 ++++++++++++++++++++++++++++-----
 include/linux/hyperv.h       |  4 ++++
 3 files changed, 39 insertions(+), 5 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 0f6cd44fff29..16f7d7f2d7fd 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -835,6 +835,7 @@ static int vmbus_probe(struct device *child_device)
 	struct hv_device *dev = device_to_hv_device(child_device);
 	const struct hv_vmbus_device_id *dev_id;
 
+	dev->device_registered = false;
 	dev_id = hv_vmbus_get_id(drv, dev);
 	if (drv->probe) {
 		ret = drv->probe(dev, dev_id);
@@ -1927,6 +1928,7 @@ int vmbus_device_register(struct hv_device *child_device_obj)
 	child_device_obj->device.dma_mask = &child_device_obj->dma_mask;
 	dma_set_mask(&child_device_obj->device, DMA_BIT_MASK(64));
 
+	init_waitqueue_head(&child_device_obj->wait_event);
 	/*
 	 * Register with the LDM. This will kick off the driver/device
 	 * binding...which will eventually call vmbus_match() and vmbus_probe()
@@ -1951,6 +1953,11 @@ int vmbus_device_register(struct hv_device *child_device_obj)
 		pr_err("Unable to register primary channeln");
 		goto err_kset_unregister;
 	}
+
+	/* channel kobj allocated, now inform anyone waiting for it */
+	child_device_obj->device_registered = true;
+	wake_up_interruptible(&child_device_obj->wait_event);
+
 	hv_debug_add_dev_dir(child_device_obj);
 
 	return 0;
diff --git a/drivers/uio/uio_hv_generic.c b/drivers/uio/uio_hv_generic.c
index 1b19b5647495..99d6ecaa8f86 100644
--- a/drivers/uio/uio_hv_generic.c
+++ b/drivers/uio/uio_hv_generic.c
@@ -63,6 +63,8 @@ struct hv_uio_private_data {
 	void	*send_buf;
 	struct vmbus_gpadl send_gpadl;
 	char	send_name[32];
+
+	struct work_struct sysfs_work;
 };
 
 /*
@@ -243,6 +245,29 @@ hv_uio_release(struct uio_info *info, struct inode *inode)
 	return ret;
 }
 
+static void hv_uio_sysfs_work(struct work_struct *work)
+{
+	struct hv_uio_private_data *pdata =
+		container_of(work, struct hv_uio_private_data, sysfs_work);
+	struct vmbus_channel *channel = pdata->device->channel;
+	int ret;
+
+	ret = pdata->device->channels_kset ||
+		wait_event_interruptible_timeout(pdata->device->wait_event,
+						 pdata->device->device_registered,
+						 msecs_to_jiffies(5));
+	if (!ret) {
+		dev_warn(&pdata->device->device,
+			 "kset taking too long to initialize; %d\n", ret);
+		return;
+	}
+
+	ret = sysfs_create_bin_file(&channel->kobj, &ring_buffer_bin_attr);
+	if (ret)
+		dev_notice(&pdata->device->device,
+			   "sysfs create ring bin file failed; %d\n", ret);
+}
+
 static int
 hv_uio_probe(struct hv_device *dev,
 	     const struct hv_vmbus_device_id *dev_id)
@@ -349,11 +374,8 @@ hv_uio_probe(struct hv_device *dev,
 		dev_err(&dev->device, "hv_uio register failed\n");
 		goto fail_close;
 	}
-
-	ret = sysfs_create_bin_file(&channel->kobj, &ring_buffer_bin_attr);
-	if (ret)
-		dev_notice(&dev->device,
-			   "sysfs create ring bin file failed; %d\n", ret);
+	INIT_WORK(&pdata->sysfs_work, hv_uio_sysfs_work);
+	schedule_work(&pdata->sysfs_work);
 
 	hv_set_drvdata(dev, pdata);
 
@@ -376,6 +398,7 @@ hv_uio_remove(struct hv_device *dev)
 		return;
 
 	sysfs_remove_bin_file(&dev->channel->kobj, &ring_buffer_bin_attr);
+	cancel_work_sync(&pdata->sysfs_work);
 	uio_unregister_device(&pdata->info);
 	hv_uio_cleanup(dev, pdata);
 
diff --git a/include/linux/hyperv.h b/include/linux/hyperv.h
index 4179add2864b..6180231aff8d 100644
--- a/include/linux/hyperv.h
+++ b/include/linux/hyperv.h
@@ -1302,6 +1302,10 @@ struct hv_device {
 	u16 vendor_id;
 	u16 device_id;
 
+	/* check for device registration completion */
+	bool			device_registered;
+	wait_queue_head_t	wait_event;
+
 	struct device device;
 	/*
 	 * Driver name to force a match.  Do not set directly, because core

base-commit: 00f3246adeeacbda0bd0b303604e46eb59c32e6e
-- 
2.34.1
Re: [RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Greg Kroah-Hartman 10 months, 1 week ago
On Fri, Feb 14, 2025 at 12:13:51PM +0530, Naman Jain wrote:
> On regular bootup, devices get registered to vmbus first, so when
> uio_hv_generic driver for a particular device type is probed,
> the device is already initialized and added, so sysfs creation in
> uio_hv_generic probe works fine. However, when device is removed
> and brought back, the channel rescinds and again gets registered
> to vmbus. However this time, the uio_hv_generic driver is already
> registered to probe for that device and in this case sysfs creation
> is tried before the device gets initialized completely. Fix this by
> deferring sysfs creation till device gets initialized completely.
> 
> Problem path:
> vmbus_device_register
>     device_register
>         uio_hv_generic probe
> 		    sysfs_create_bin_file (fails here)

Ick, that's the issue, you shouldn't be manually creating sysfs files.
Have the driver core do it for you at the proper time, which should make
your logic much simpler, right?

Set the default attribute groups instead of manually creating this and
see if that works out better.

thanks,

greg k-h
Re: [RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Naman Jain 10 months, 1 week ago

On 2/14/2025 12:21 PM, Greg Kroah-Hartman wrote:
> On Fri, Feb 14, 2025 at 12:13:51PM +0530, Naman Jain wrote:
>> On regular bootup, devices get registered to vmbus first, so when
>> uio_hv_generic driver for a particular device type is probed,
>> the device is already initialized and added, so sysfs creation in
>> uio_hv_generic probe works fine. However, when device is removed
>> and brought back, the channel rescinds and again gets registered
>> to vmbus. However this time, the uio_hv_generic driver is already
>> registered to probe for that device and in this case sysfs creation
>> is tried before the device gets initialized completely. Fix this by
>> deferring sysfs creation till device gets initialized completely.
>>
>> Problem path:
>> vmbus_device_register
>>      device_register
>>          uio_hv_generic probe
>> 		    sysfs_create_bin_file (fails here)
> 
> Ick, that's the issue, you shouldn't be manually creating sysfs files.
> Have the driver core do it for you at the proper time, which should make
> your logic much simpler, right?
> 
> Set the default attribute groups instead of manually creating this and
> see if that works out better.
> 
> thanks,
> 
> greg k-h

Thanks for reviewing, Greg. I tried this approach, and here are my
observations:

What I could create with ATTRIBUTE_GROUPS:
/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring

The one we have right now:
/sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/channels/6/ring

I could not find a way to tweak attributes to create the "ring" under
the above path. I could see the variations of sysfs_create_* which
provide a way to pass a kobj and do that, but that is something we are
already using.

Regards,
Naman
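
To make the difference between the two locations concrete, here is a small
userspace sketch that builds the per-channel path shown above. The helper name
ring_path() is hypothetical, and treating the "6" component as a numeric
channel identifier is an assumption based only on the example path in this
message.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the per-channel "ring" path for a vmbus device.
 * guid:    device instance GUID string, as it appears under /sys/bus/vmbus/devices
 * chan_id: number of the channel directory (assumed numeric, e.g. 6)
 * Returns 0 on success, -1 if the buffer is too small or formatting fails.
 */
static int ring_path(char *buf, size_t len, const char *guid,
		     unsigned int chan_id)
{
	int n = snprintf(buf, len,
			 "/sys/bus/vmbus/devices/%s/channels/%u/ring",
			 guid, chan_id);

	if (n < 0 || (size_t)n >= len)
		return -1;
	return 0;
}
```

A userspace consumer that opens this file depends on the channels/<id>/
components being present, which is why a "ring" attribute placed directly
under the device directory would not be a drop-in replacement.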
Re: [RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Greg Kroah-Hartman 10 months, 1 week ago
On Fri, Feb 14, 2025 at 12:35:44PM +0530, Naman Jain wrote:
> 
> 
> On 2/14/2025 12:21 PM, Greg Kroah-Hartman wrote:
> > On Fri, Feb 14, 2025 at 12:13:51PM +0530, Naman Jain wrote:
> > > On regular bootup, devices get registered to vmbus first, so when
> > > uio_hv_generic driver for a particular device type is probed,
> > > the device is already initialized and added, so sysfs creation in
> > > uio_hv_generic probe works fine. However, when device is removed
> > > and brought back, the channel rescinds and again gets registered
> > > to vmbus. However this time, the uio_hv_generic driver is already
> > > registered to probe for that device and in this case sysfs creation
> > > is tried before the device gets initialized completely. Fix this by
> > > deferring sysfs creation till device gets initialized completely.
> > > 
> > > Problem path:
> > > vmbus_device_register
> > >      device_register
> > >          uio_hv_generic probe
> > > 		    sysfs_create_bin_file (fails here)
> > 
> > Ick, that's the issue, you shouldn't be manually creating sysfs files.
> > Have the driver core do it for you at the proper time, which should make
> > your logic much simpler, right?
> > 
> > Set the default attribute groups instead of manually creating this and
> > see if that works out better.
> > 
> > thanks,
> > 
> > greg k-h
> 
> Thanks for reviewing Greg. I tried this approach and here are my
> observations:
> 
> What I could create with ATTRIBUTE_GROUPS:
> /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring
> 
> The one we have right now:
> /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/channels/6/ring

What is "channels" and "6" here?  Are they real devices or just a
directory name or something else?

> I could not find a way to tweak attributes to create the "ring" under above
> path. I could see the variations of sys_create_* which provides a
> way to pass kobj and do that, but that is something we are already
> using.

No driver should EVER be pointing to a raw kobject, that's a huge hint
that something is really wrong.  Also, if a raw kobject is in a device
path in the middle like this, it will not be seen properly from
userspace library tools :(

So again, what is creating the "channels" and "6" subdirectories?  All
of that should be under full control by the uio device, right?

thanks,

greg k-h
Re: [RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Stephen Hemminger 10 months, 1 week ago
On Fri, 14 Feb 2025 08:41:57 +0100
Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:

> On Fri, Feb 14, 2025 at 12:35:44PM +0530, Naman Jain wrote:
> > 
> > 
> > On 2/14/2025 12:21 PM, Greg Kroah-Hartman wrote:  
> > > On Fri, Feb 14, 2025 at 12:13:51PM +0530, Naman Jain wrote:  
> > > > On regular bootup, devices get registered to vmbus first, so when
> > > > uio_hv_generic driver for a particular device type is probed,
> > > > the device is already initialized and added, so sysfs creation in
> > > > uio_hv_generic probe works fine. However, when device is removed
> > > > and brought back, the channel rescinds and again gets registered
> > > > to vmbus. However this time, the uio_hv_generic driver is already
> > > > registered to probe for that device and in this case sysfs creation
> > > > is tried before the device gets initialized completely. Fix this by
> > > > deferring sysfs creation till device gets initialized completely.
> > > > 
> > > > Problem path:
> > > > vmbus_device_register
> > > >      device_register
> > > >          uio_hv_generic probe
> > > > 		    sysfs_create_bin_file (fails here)  
> > > 
> > > Ick, that's the issue, you shouldn't be manually creating sysfs files.
> > > Have the driver core do it for you at the proper time, which should make
> > > your logic much simpler, right?
> > > 
> > > Set the default attribute groups instead of manually creating this and
> > > see if that works out better.
> > > 
> > > thanks,
> > > 
> > > greg k-h  
> > 
> > Thanks for reviewing Greg. I tried this approach and here are my
> > observations:
> > 
> > What I could create with ATTRIBUTE_GROUPS:
> > /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring
> > 
> > The one we have right now:
> > /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/channels/6/ring  
> 
> What is "channels" and "6" here?  Are they real devices or just a
> directory name or something else?
> 
> > I could not find a way to tweak attributes to create the "ring" under above
> > path. I could see the variations of sys_create_* which provides a
> > way to pass kobj and do that, but that is something we are already
> > using.  
> 
> No driver should EVER be pointing to a raw kobject, that's a huge hint
> that something is really wrong.  Also, if a raw kobject is in a device
> path in the middle like this, it will not be seen properly from
> userspace library tools :(
> 
> So again, what is creating the "channels" and "6" subdirectories?  All
> of that shoudl be under full control by the uio device, right?

The original design of exposing channels was based on what the
network core does to expose queues. Worth comparing the two
to see if there is any shared insight.
Re: [RFC PATCH] uio_hv_generic: Fix sysfs creation path for ring buffer
Posted by Naman Jain 10 months ago

On 2/14/2025 10:41 PM, Stephen Hemminger wrote:
> On Fri, 14 Feb 2025 08:41:57 +0100
> Greg Kroah-Hartman <gregkh@linuxfoundation.org> wrote:
> 
>> On Fri, Feb 14, 2025 at 12:35:44PM +0530, Naman Jain wrote:
>>>
>>>
>>> On 2/14/2025 12:21 PM, Greg Kroah-Hartman wrote:
>>>> On Fri, Feb 14, 2025 at 12:13:51PM +0530, Naman Jain wrote:
>>>>> On regular bootup, devices get registered to vmbus first, so when
>>>>> uio_hv_generic driver for a particular device type is probed,
>>>>> the device is already initialized and added, so sysfs creation in
>>>>> uio_hv_generic probe works fine. However, when device is removed
>>>>> and brought back, the channel rescinds and again gets registered
>>>>> to vmbus. However this time, the uio_hv_generic driver is already
>>>>> registered to probe for that device and in this case sysfs creation
>>>>> is tried before the device gets initialized completely. Fix this by
>>>>> deferring sysfs creation till device gets initialized completely.
>>>>>
>>>>> Problem path:
>>>>> vmbus_device_register
>>>>>       device_register
>>>>>           uio_hv_generic probe
>>>>> 		    sysfs_create_bin_file (fails here)
>>>>
>>>> Ick, that's the issue, you shouldn't be manually creating sysfs files.
>>>> Have the driver core do it for you at the proper time, which should make
>>>> your logic much simpler, right?
>>>>
>>>> Set the default attribute groups instead of manually creating this and
>>>> see if that works out better.
>>>>
>>>> thanks,
>>>>
>>>> greg k-h
>>>
>>> Thanks for reviewing Greg. I tried this approach and here are my
>>> observations:
>>>
>>> What I could create with ATTRIBUTE_GROUPS:
>>> /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/ring
>>>
>>> The one we have right now:
>>> /sys/bus/vmbus/devices/eb765408-105f-49b6-b4aa-c123b64d17d4/channels/6/ring
>>
>> What is "channels" and "6" here?  Are they real devices or just a
>> directory name or something else?
>>
>>> I could not find a way to tweak attributes to create the "ring" under above
>>> path. I could see the variations of sys_create_* which provides a
>>> way to pass kobj and do that, but that is something we are already
>>> using.
>>
>> No driver should EVER be pointing to a raw kobject, that's a huge hint
>> that something is really wrong.  Also, if a raw kobject is in a device
>> path in the middle like this, it will not be seen properly from
>> userspace library tools :(
>>
>> So again, what is creating the "channels" and "6" subdirectories?  All
>> of that shoudl be under full control by the uio device, right?
> 
> The original design of exposing channels was based on what the
> network core does to expose queues. Worth comparing the two
> to see if there is any shared insight.

Thanks Greg and Stephen. I'll try to find it.

Regards,
Naman