[PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command

Jason Wang posted 2 patches 2 years, 8 months ago
There is a newer version of this series
drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
1 file changed, 66 insertions(+), 10 deletions(-)
[PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command
Posted by Jason Wang 2 years, 8 months ago
Hi all:

The code used to busy poll for cvq command which turns out to have
several side effects:

1) infinite poll for buggy devices
2) bad interaction with scheduler

So this series tries to use sleep instead of busy polling. In this
version, I take a step back: the hardening part is not implemented and
is left for future investigation. We used to agree to use interruptible
sleep but it doesn't work for a general workqueue.
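
To illustrate the idea of sleeping until the command completes, here is a
minimal userspace sketch, not the driver code. It assumes a pthread-based
stand-in for the kernel's completion; the names struct completion,
device_thread() and send_command_and_sleep() are hypothetical here and only
mirror the shape of the kernel API.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

static void init_completion(struct completion *c)
{
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = false;
}

/* Called by the "device" side once the command has been processed. */
static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Sleep until complete() has been called; the waiter does not spin. */
static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* Simulated device: handles the command, then signals completion. */
static void *device_thread(void *arg)
{
	complete(arg);
	return NULL;
}

/* Issue a simulated command and sleep until the device answers. */
static bool send_command_and_sleep(void)
{
	struct completion c;
	pthread_t dev;

	init_completion(&c);
	if (pthread_create(&dev, NULL, device_thread, &c) != 0)
		return false;
	wait_for_completion(&c);
	pthread_join(dev, NULL);
	return c.done;
}
```

Note that this wait has no timeout, so a device that never answers would
block the caller forever; that is the hardening part left out of this
version.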

Please review.

Thanks

Changes since V1:
- use RTNL to synchronize rx mode worker
- use completion for simplicity
- don't try to harden CVQ command

Changes since RFC:

- switch to use BAD_RING in virtio_break_device()
- check virtqueue_is_broken() after being woken up
- use more_used() instead of virtqueue_get_buf() to allow the caller to
  get buffers afterwards
  - break the virtio-net device on timeout
  - get the buffer manually since the virtio core checks more_used() instead
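
The timeout and broken-device items from the RFC can be sketched in
userspace as a bounded wait that marks the queue broken when the device
stays silent. This is only an analogue built on pthread_cond_timedwait();
struct cvq, cvq_wait(), demo_unresponsive_device() and the 50 ms value are
assumptions made for the example, not taken from the patches.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical model of a control virtqueue as seen by the waiter. */
struct cvq {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool used;	/* the device has returned the buffer */
	bool broken;	/* the queue has been marked broken */
};

/*
 * Sleep for at most timeout_ms waiting for the device. Returns true if
 * the buffer came back; otherwise marks the queue broken and returns
 * false.
 */
static bool cvq_wait(struct cvq *q, long timeout_ms)
{
	struct timespec deadline;
	bool ok;
	int rc = 0;

	/* pthread_cond_timedwait() takes an absolute deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&q->lock);
	/* Re-check both flags after every wakeup; stop on timeout or error. */
	while (!q->used && !q->broken && rc == 0)
		rc = pthread_cond_timedwait(&q->cond, &q->lock, &deadline);
	if (!q->used)
		q->broken = true;
	ok = q->used;
	pthread_mutex_unlock(&q->lock);
	return ok;
}

/* A device that never answers: the wait times out and breaks the queue. */
static bool demo_unresponsive_device(void)
{
	struct cvq q = { .used = false, .broken = false };
	bool ok;

	pthread_mutex_init(&q.lock, NULL);
	pthread_cond_init(&q.cond, NULL);
	ok = cvq_wait(&q, 50);
	return !ok && q.broken;
}
```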

Jason Wang (2):
  virtio-net: convert rx mode setting to use workqueue
  virtio-net: sleep instead of busy waiting for cvq command

 drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
 1 file changed, 66 insertions(+), 10 deletions(-)

-- 
2.25.1
Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command
Posted by Jakub Kicinski 2 years, 8 months ago
On Thu, 13 Apr 2023 14:40:25 +0800 Jason Wang wrote:
> The code used to busy poll for cvq command which turns out to have
> several side effects:
> 
> 1) infinite poll for buggy devices
> 2) bad interaction with scheduler
> 
> So this series tries to use sleep instead of busy polling. In this
> version, I take a step back: the hardening part is not implemented and
> is left for future investigation. We used to agree to use interruptible
> sleep but it doesn't work for a general workqueue.

CC: netdev missing?
Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command
Posted by Jason Wang 2 years, 8 months ago
On Thu, Apr 13, 2023 at 10:04 PM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Thu, 13 Apr 2023 14:40:25 +0800 Jason Wang wrote:
> > The code used to busy poll for cvq command which turns out to have
> > several side effects:
> >
> > 1) infinite poll for buggy devices
> > 2) bad interaction with scheduler
> >
> > So this series tries to use sleep instead of busy polling. In this
> > version, I take a step back: the hardening part is not implemented and
> > is left for future investigation. We used to agree to use interruptible
> > sleep but it doesn't work for a general workqueue.
>
> CC: netdev missing?

My bad. Will cc netdev for any discussion.

Thanks

>
Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command
Posted by Maxime Coquelin 2 years, 8 months ago
Hi Jason,

On 4/13/23 08:40, Jason Wang wrote:
> Hi all:
> 
> The code used to busy poll for cvq command which turns out to have
> several side effects:
> 
> 1) infinite poll for buggy devices
> 2) bad interaction with scheduler
> 
> So this series tries to use sleep instead of busy polling. In this
> version, I take a step back: the hardening part is not implemented and
> is left for future investigation. We used to agree to use interruptible
> sleep but it doesn't work for a general workqueue.
> 
> Please review.

Thanks for working on this.
My DPDK VDUSE RFC failed to set the interrupt; as Xuan Zhou highlighted,
it makes the vdpa dev add/del commands freeze:
[<0>] device_del+0x37/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] unregister_virtio_device+0x11/0x20
[<0>] device_release_driver_internal+0x193/0x200
[<0>] bus_remove_device+0xbf/0x130
[<0>] device_del+0x174/0x3d0
[<0>] device_unregister+0x13/0x60
[<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
[<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
[<0>] genl_rcv_msg+0x151/0x290
[<0>] netlink_rcv_skb+0x54/0x100
[<0>] genl_rcv+0x24/0x40
[<0>] netlink_unicast+0x217/0x340
[<0>] netlink_sendmsg+0x23e/0x4a0
[<0>] sock_sendmsg+0x8f/0xa0
[<0>] __sys_sendto+0xfc/0x170
[<0>] __x64_sys_sendto+0x20/0x30
[<0>] do_syscall_64+0x59/0x90
[<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc

Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
testing), it works fine:

Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>

For the potential missing interrupt with non-compliant devices, I guess
it could be handled with the hardening work, as the same thing could happen
if the VDUSE application crashed, for example.

Regards,
Maxime

[0]:
> Thanks
> 
> Changes since V1:
> - use RTNL to synchronize rx mode worker
> - use completion for simplicity
> - don't try to harden CVQ command
> 
> Changes since RFC:
> 
> - switch to use BAD_RING in virtio_break_device()
> - check virtqueue_is_broken() after being woken up
> - use more_used() instead of virtqueue_get_buf() to allow the caller to
>    get buffers afterwards
>    - break the virtio-net device on timeout
>    - get the buffer manually since the virtio core checks more_used() instead
> 
> Jason Wang (2):
>    virtio-net: convert rx mode setting to use workqueue
>    virtio-net: sleep instead of busy waiting for cvq command
> 
>   drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
>   1 file changed, 66 insertions(+), 10 deletions(-)
>
Re: [PATCH net-next V2 0/2] virtio-net: don't busy poll for cvq command
Posted by Maxime Coquelin 2 years, 8 months ago

On 4/13/23 15:02, Maxime Coquelin wrote:
> Hi Jason,
> 
> On 4/13/23 08:40, Jason Wang wrote:
>> Hi all:
>>
>> The code used to busy poll for cvq command which turns out to have
>> several side effects:
>>
>> 1) infinite poll for buggy devices
>> 2) bad interaction with scheduler
>>
>> So this series tries to use sleep instead of busy polling. In this
>> version, I take a step back: the hardening part is not implemented and
>> is left for future investigation. We used to agree to use interruptible
>> sleep but it doesn't work for a general workqueue.
>>
>> Please review.
> 
> Thanks for working on this.
> My DPDK VDUSE RFC failed to set the interrupt; as Xuan Zhou highlighted,
> it makes the vdpa dev add/del commands freeze:
> [<0>] device_del+0x37/0x3d0
> [<0>] device_unregister+0x13/0x60
> [<0>] unregister_virtio_device+0x11/0x20
> [<0>] device_release_driver_internal+0x193/0x200
> [<0>] bus_remove_device+0xbf/0x130
> [<0>] device_del+0x174/0x3d0
> [<0>] device_unregister+0x13/0x60
> [<0>] vdpa_nl_cmd_dev_del_set_doit+0x66/0xe0 [vdpa]
> [<0>] genl_family_rcv_msg_doit.isra.0+0xb8/0x100
> [<0>] genl_rcv_msg+0x151/0x290
> [<0>] netlink_rcv_skb+0x54/0x100
> [<0>] genl_rcv+0x24/0x40
> [<0>] netlink_unicast+0x217/0x340
> [<0>] netlink_sendmsg+0x23e/0x4a0
> [<0>] sock_sendmsg+0x8f/0xa0
> [<0>] __sys_sendto+0xfc/0x170
> [<0>] __x64_sys_sendto+0x20/0x30
> [<0>] do_syscall_64+0x59/0x90
> [<0>] entry_SYSCALL_64_after_hwframe+0x72/0xdc
> 
> Once fixed on DPDK side (you can use my vduse_v1 branch [0] for
> testing), it works fine:
> 
> Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
> 
> For the potential missing interrupt with non-compliant devices, I guess
> it could be handled with the hardening work, as the same thing could happen
> if the VDUSE application crashed, for example.
> 
> Regards,
> Maxime
> 
> [0]:

Better with the link...

[0]: https://gitlab.com/mcoquelin/dpdk-next-virtio/-/commits/vduse_v1/

>> Thanks
>>
>> Changes since V1:
>> - use RTNL to synchronize rx mode worker
>> - use completion for simplicity
>> - don't try to harden CVQ command
>>
>> Changes since RFC:
>>
>> - switch to use BAD_RING in virtio_break_device()
>> - check virtqueue_is_broken() after being woken up
>> - use more_used() instead of virtqueue_get_buf() to allow the caller to
>>    get buffers afterwards
>>    - break the virtio-net device on timeout
>>    - get the buffer manually since the virtio core checks more_used() instead
>>
>> Jason Wang (2):
>>    virtio-net: convert rx mode setting to use workqueue
>>    virtio-net: sleep instead of busy waiting for cvq command
>>
>>   drivers/net/virtio_net.c | 76 ++++++++++++++++++++++++++++++++++------
>>   1 file changed, 66 insertions(+), 10 deletions(-)
>>