[PATCH net] virtio-net: add cond_resched() to the command waiting loop

Jason Wang posted 1 patch 3 years, 7 months ago
drivers/net/virtio_net.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
[PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 7 months ago
Adding cond_resched() to the command waiting loop for a better
co-operation with the scheduler. This allows to give CPU a breath to
run other task(workqueue) instead of busy looping when preemption is
not allowed.

What's more important. This is a must for some vDPA parent to work
since control virtqueue is emulated via a workqueue for those parents.

Fixes: bda324fd037a ("vdpasim: control virtqueue support")
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 drivers/net/virtio_net.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index ece00b84e3a7..169368365d6a 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
 	 * into the hypervisor, so the request should be handled immediately.
 	 */
 	while (!virtqueue_get_buf(vi->cvq, &tmp) &&
-	       !virtqueue_is_broken(vi->cvq))
+	       !virtqueue_is_broken(vi->cvq)) {
+		cond_resched();
 		cpu_relax();
+	}
 
 	return vi->ctrl->status == VIRTIO_NET_OK;
 }
-- 
2.25.1
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Michael S. Tsirkin 3 years, 5 months ago
On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> Adding cond_resched() to the command waiting loop for a better
> co-operation with the scheduler. This allows to give CPU a breath to
> run other task(workqueue) instead of busy looping when preemption is
> not allowed.
> 
> What's more important. This is a must for some vDPA parent to work
> since control virtqueue is emulated via a workqueue for those parents.
> 
> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Acked-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  drivers/net/virtio_net.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ece00b84e3a7..169368365d6a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
>  	 * into the hypervisor, so the request should be handled immediately.
>  	 */
>  	while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> -	       !virtqueue_is_broken(vi->cvq))
> +	       !virtqueue_is_broken(vi->cvq)) {
> +		cond_resched();
>  		cpu_relax();
> +	}
>  
>  	return vi->ctrl->status == VIRTIO_NET_OK;
>  }
> -- 
> 2.25.1
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Michael S. Tsirkin 3 years, 7 months ago
On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> Adding cond_resched() to the command waiting loop for a better
> co-operation with the scheduler. This allows to give CPU a breath to
> run other task(workqueue) instead of busy looping when preemption is
> not allowed.
> 
> What's more important. This is a must for some vDPA parent to work
> since control virtqueue is emulated via a workqueue for those parents.
> 
> Fixes: bda324fd037a ("vdpasim: control virtqueue support")

That's a weird commit to fix. so it fixes the simulator?

> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  drivers/net/virtio_net.c | 4 +++-
>  1 file changed, 3 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> index ece00b84e3a7..169368365d6a 100644
> --- a/drivers/net/virtio_net.c
> +++ b/drivers/net/virtio_net.c
> @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
>  	 * into the hypervisor, so the request should be handled immediately.
>  	 */
>  	while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> -	       !virtqueue_is_broken(vi->cvq))
> +	       !virtqueue_is_broken(vi->cvq)) {
> +		cond_resched();
>  		cpu_relax();
> +	}

with cond_resched do we still need cpu_relax?

>  	return vi->ctrl->status == VIRTIO_NET_OK;
>  }
> -- 
> 2.25.1
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 7 months ago
On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > Adding cond_resched() to the command waiting loop for a better
> > co-operation with the scheduler. This allows to give CPU a breath to
> > run other task(workqueue) instead of busy looping when preemption is
> > not allowed.
> >
> > What's more important. This is a must for some vDPA parent to work
> > since control virtqueue is emulated via a workqueue for those parents.
> >
> > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>
> That's a weird commit to fix. so it fixes the simulator?

Yes, since the simulator is using a workqueue to handle control virtueue.

>
> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> > ---
> >  drivers/net/virtio_net.c | 4 +++-
> >  1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
> > index ece00b84e3a7..169368365d6a 100644
> > --- a/drivers/net/virtio_net.c
> > +++ b/drivers/net/virtio_net.c
> > @@ -2000,8 +2000,10 @@ static bool virtnet_send_command(struct virtnet_info *vi, u8 class, u8 cmd,
> >        * into the hypervisor, so the request should be handled immediately.
> >        */
> >       while (!virtqueue_get_buf(vi->cvq, &tmp) &&
> > -            !virtqueue_is_broken(vi->cvq))
> > +            !virtqueue_is_broken(vi->cvq)) {
> > +             cond_resched();
> >               cpu_relax();
> > +     }
>
> with cond_resched do we still need cpu_relax?

I think so, it's possible that cond_sched() just doesn't trigger scheduling.

Thanks

>
> >       return vi->ctrl->status == VIRTIO_NET_OK;
> >  }
> > --
> > 2.25.1
>
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Paolo Abeni 3 years, 7 months ago
On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > 
> > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > Adding cond_resched() to the command waiting loop for a better
> > > co-operation with the scheduler. This allows to give CPU a breath to
> > > run other task(workqueue) instead of busy looping when preemption is
> > > not allowed.
> > > 
> > > What's more important. This is a must for some vDPA parent to work
> > > since control virtqueue is emulated via a workqueue for those parents.
> > > 
> > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > 
> > That's a weird commit to fix. so it fixes the simulator?
> 
> Yes, since the simulator is using a workqueue to handle control virtueue.

Uhmm... touching a driver for a simulator's sake looks a little weird. 

Additionally, if the bug is vdpasim, I think it's better to try to
solve it there, if possible.

Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
neither needs a process context, so perhaps you could rework it to run
the work_fn() directly from vdpasim_kick_vq(), at least for the control
virtqueue?

Thanks!

Paolo
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 7 months ago
On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > >
> > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > Adding cond_resched() to the command waiting loop for a better
> > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > run other task(workqueue) instead of busy looping when preemption is
> > > > not allowed.
> > > >
> > > > What's more important. This is a must for some vDPA parent to work
> > > > since control virtqueue is emulated via a workqueue for those parents.
> > > >
> > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > >
> > > That's a weird commit to fix. so it fixes the simulator?
> >
> > Yes, since the simulator is using a workqueue to handle control virtueue.
>
> Uhmm... touching a driver for a simulator's sake looks a little weird.

Simulator is not the only one that is using a workqueue (but should be
the first).

I can see  that the mlx5 vDPA driver is using a workqueue as well (see
mlx5_vdpa_kick_vq()).

And in the case of VDUSE, it needs to wait for the response from the
userspace, this means cond_resched() is probably a must for the case
like UP.

>
> Additionally, if the bug is vdpasim, I think it's better to try to
> solve it there, if possible.
>
> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> neither needs a process context, so perhaps you could rework it to run
> the work_fn() directly from vdpasim_kick_vq(), at least for the control
> virtqueue?

It's possible (but require some rework on the simulator core). But
considering we have other similar use cases, it looks better to solve
it in the virtio-net driver.

Additionally, this may have better behaviour when using for the buggy
hardware (e.g the control virtqueue takes too long to respond). We may
consider switching to use interrupt/sleep in the future (but not
suitable for -net).

Thanks

>
> Thanks!
>
> Paolo
>
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Paolo Abeni 3 years, 7 months ago
On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > 
> > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > 
> > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > not allowed.
> > > > > 
> > > > > What's more important. This is a must for some vDPA parent to work
> > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > 
> > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > 
> > > > That's a weird commit to fix. so it fixes the simulator?
> > > 
> > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > 
> > Uhmm... touching a driver for a simulator's sake looks a little weird.
> 
> Simulator is not the only one that is using a workqueue (but should be
> the first).
> 
> I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> mlx5_vdpa_kick_vq()).
> 
> And in the case of VDUSE, it needs to wait for the response from the
> userspace, this means cond_resched() is probably a must for the case
> like UP.
> 
> > 
> > Additionally, if the bug is vdpasim, I think it's better to try to
> > solve it there, if possible.
> > 
> > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > neither needs a process context, so perhaps you could rework it to run
> > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > virtqueue?
> 
> It's possible (but require some rework on the simulator core). But
> considering we have other similar use cases, it looks better to solve
> it in the virtio-net driver.

I see.

> Additionally, this may have better behaviour when using for the buggy
> hardware (e.g the control virtqueue takes too long to respond). We may
> consider switching to use interrupt/sleep in the future (but not
> suitable for -net).

Agreed. Possibly a timeout could be useful, too.

Cheers,

Paolo
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Michael S. Tsirkin 3 years, 7 months ago
On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > 
> > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > 
> > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > not allowed.
> > > > > > 
> > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > 
> > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > 
> > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > 
> > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > 
> > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > 
> > Simulator is not the only one that is using a workqueue (but should be
> > the first).
> > 
> > I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> > mlx5_vdpa_kick_vq()).
> > 
> > And in the case of VDUSE, it needs to wait for the response from the
> > userspace, this means cond_resched() is probably a must for the case
> > like UP.
> > 
> > > 
> > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > solve it there, if possible.
> > > 
> > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > neither needs a process context, so perhaps you could rework it to run
> > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > virtqueue?
> > 
> > It's possible (but require some rework on the simulator core). But
> > considering we have other similar use cases, it looks better to solve
> > it in the virtio-net driver.
> 
> I see.
> 
> > Additionally, this may have better behaviour when using for the buggy
> > hardware (e.g the control virtqueue takes too long to respond). We may
> > consider switching to use interrupt/sleep in the future (but not
> > suitable for -net).
> 
> Agreed. Possibly a timeout could be useful, too.
> 
> Cheers,
> 
> Paolo


Hmm timeouts are kind of arbitrary.
regular drivers basically derive them from hardware
behaviour but with a generic driver like virtio it's harder.
I guess we could add timeout as a config field, have
device make a promise to the driver.

Making the wait interruptible seems more reasonable.

-- 
MST
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 7 months ago
在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
>> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
>>> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>>> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
>>>>> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
>>>>>>> Adding cond_resched() to the command waiting loop for a better
>>>>>>> co-operation with the scheduler. This allows to give CPU a breath to
>>>>>>> run other task(workqueue) instead of busy looping when preemption is
>>>>>>> not allowed.
>>>>>>>
>>>>>>> What's more important. This is a must for some vDPA parent to work
>>>>>>> since control virtqueue is emulated via a workqueue for those parents.
>>>>>>>
>>>>>>> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>>>>>> That's a weird commit to fix. so it fixes the simulator?
>>>>> Yes, since the simulator is using a workqueue to handle control virtueue.
>>>> Uhmm... touching a driver for a simulator's sake looks a little weird.
>>> Simulator is not the only one that is using a workqueue (but should be
>>> the first).
>>>
>>> I can see  that the mlx5 vDPA driver is using a workqueue as well (see
>>> mlx5_vdpa_kick_vq()).
>>>
>>> And in the case of VDUSE, it needs to wait for the response from the
>>> userspace, this means cond_resched() is probably a must for the case
>>> like UP.
>>>
>>>> Additionally, if the bug is vdpasim, I think it's better to try to
>>>> solve it there, if possible.
>>>>
>>>> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
>>>> neither needs a process context, so perhaps you could rework it to run
>>>> the work_fn() directly from vdpasim_kick_vq(), at least for the control
>>>> virtqueue?
>>> It's possible (but require some rework on the simulator core). But
>>> considering we have other similar use cases, it looks better to solve
>>> it in the virtio-net driver.
>> I see.
>>
>>> Additionally, this may have better behaviour when using for the buggy
>>> hardware (e.g the control virtqueue takes too long to respond). We may
>>> consider switching to use interrupt/sleep in the future (but not
>>> suitable for -net).
>> Agreed. Possibly a timeout could be useful, too.
>>
>> Cheers,
>>
>> Paolo
>
> Hmm timeouts are kind of arbitrary.
> regular drivers basically derive them from hardware
> behaviour but with a generic driver like virtio it's harder.
> I guess we could add timeout as a config field, have
> device make a promise to the driver.
>
> Making the wait interruptible seems more reasonable.


Yes, but I think we still need this patch for -net and -stable.

Thanks


>

Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Michael S. Tsirkin 3 years, 7 months ago
On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> 
> 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > not allowed.
> > > > > > > > 
> > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > 
> > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > Simulator is not the only one that is using a workqueue (but should be
> > > > the first).
> > > > 
> > > > I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> > > > mlx5_vdpa_kick_vq()).
> > > > 
> > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > userspace, this means cond_resched() is probably a must for the case
> > > > like UP.
> > > > 
> > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > solve it there, if possible.
> > > > > 
> > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > virtqueue?
> > > > It's possible (but require some rework on the simulator core). But
> > > > considering we have other similar use cases, it looks better to solve
> > > > it in the virtio-net driver.
> > > I see.
> > > 
> > > > Additionally, this may have better behaviour when using for the buggy
> > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > consider switching to use interrupt/sleep in the future (but not
> > > > suitable for -net).
> > > Agreed. Possibly a timeout could be useful, too.
> > > 
> > > Cheers,
> > > 
> > > Paolo
> > 
> > Hmm timeouts are kind of arbitrary.
> > regular drivers basically derive them from hardware
> > behaviour but with a generic driver like virtio it's harder.
> > I guess we could add timeout as a config field, have
> > device make a promise to the driver.
> > 
> > Making the wait interruptible seems more reasonable.
> 
> 
> Yes, but I think we still need this patch for -net and -stable.
> 
> Thanks

I was referring to Paolo's idea of having a timeout.

-- 
MST

Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 6 months ago
在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
>> 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
>>> On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
>>>> On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
>>>>> On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
>>>>>> On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
>>>>>>> On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>>>>>>>> On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
>>>>>>>>> Adding cond_resched() to the command waiting loop for a better
>>>>>>>>> co-operation with the scheduler. This allows to give CPU a breath to
>>>>>>>>> run other task(workqueue) instead of busy looping when preemption is
>>>>>>>>> not allowed.
>>>>>>>>>
>>>>>>>>> What's more important. This is a must for some vDPA parent to work
>>>>>>>>> since control virtqueue is emulated via a workqueue for those parents.
>>>>>>>>>
>>>>>>>>> Fixes: bda324fd037a ("vdpasim: control virtqueue support")
>>>>>>>> That's a weird commit to fix. so it fixes the simulator?
>>>>>>> Yes, since the simulator is using a workqueue to handle control virtueue.
>>>>>> Uhmm... touching a driver for a simulator's sake looks a little weird.
>>>>> Simulator is not the only one that is using a workqueue (but should be
>>>>> the first).
>>>>>
>>>>> I can see  that the mlx5 vDPA driver is using a workqueue as well (see
>>>>> mlx5_vdpa_kick_vq()).
>>>>>
>>>>> And in the case of VDUSE, it needs to wait for the response from the
>>>>> userspace, this means cond_resched() is probably a must for the case
>>>>> like UP.
>>>>>
>>>>>> Additionally, if the bug is vdpasim, I think it's better to try to
>>>>>> solve it there, if possible.
>>>>>>
>>>>>> Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
>>>>>> neither needs a process context, so perhaps you could rework it to run
>>>>>> the work_fn() directly from vdpasim_kick_vq(), at least for the control
>>>>>> virtqueue?
>>>>> It's possible (but require some rework on the simulator core). But
>>>>> considering we have other similar use cases, it looks better to solve
>>>>> it in the virtio-net driver.
>>>> I see.
>>>>
>>>>> Additionally, this may have better behaviour when using for the buggy
>>>>> hardware (e.g the control virtqueue takes too long to respond). We may
>>>>> consider switching to use interrupt/sleep in the future (but not
>>>>> suitable for -net).
>>>> Agreed. Possibly a timeout could be useful, too.
>>>>
>>>> Cheers,
>>>>
>>>> Paolo
>>> Hmm timeouts are kind of arbitrary.
>>> regular drivers basically derive them from hardware
>>> behaviour but with a generic driver like virtio it's harder.
>>> I guess we could add timeout as a config field, have
>>> device make a promise to the driver.
>>>
>>> Making the wait interruptible seems more reasonable.
>>
>> Yes, but I think we still need this patch for -net and -stable.
>>
>> Thanks
> I was referring to Paolo's idea of having a timeout.


Ok, I think we're fine with this patch. Any chance to merge this or do I 
need to resend?

Thanks


>

Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Michael S. Tsirkin 3 years, 5 months ago
On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
> 
> 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > not allowed.
> > > > > > > > > > 
> > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > > 
> > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > the first).
> > > > > > 
> > > > > > I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > mlx5_vdpa_kick_vq()).
> > > > > > 
> > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > like UP.
> > > > > > 
> > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > solve it there, if possible.
> > > > > > > 
> > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > virtqueue?
> > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > it in the virtio-net driver.
> > > > > I see.
> > > > > 
> > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > suitable for -net).
> > > > > Agreed. Possibly a timeout could be useful, too.
> > > > > 
> > > > > Cheers,
> > > > > 
> > > > > Paolo
> > > > Hmm timeouts are kind of arbitrary.
> > > > regular drivers basically derive them from hardware
> > > > behaviour but with a generic driver like virtio it's harder.
> > > > I guess we could add timeout as a config field, have
> > > > device make a promise to the driver.
> > > > 
> > > > Making the wait interruptible seems more reasonable.
> > > 
> > > Yes, but I think we still need this patch for -net and -stable.
> > > 
> > > Thanks
> > I was referring to Paolo's idea of having a timeout.
> 
> 
> Ok, I think we're fine with this patch. Any chance to merge this or do I
> need to resend?
> 
> Thanks

Last question: do we want cpu_relax here now? Or is cond_resched
sufficient?

> 
> > 

Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 5 months ago
On Tue, Oct 11, 2022 at 1:11 AM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
> >
> > 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > > not allowed.
> > > > > > > > > > >
> > > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > > >
> > > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > > the first).
> > > > > > >
> > > > > > > I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > > mlx5_vdpa_kick_vq()).
> > > > > > >
> > > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > > like UP.
> > > > > > >
> > > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > > solve it there, if possible.
> > > > > > > >
> > > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > > virtqueue?
> > > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > > it in the virtio-net driver.
> > > > > > I see.
> > > > > >
> > > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > > suitable for -net).
> > > > > > Agreed. Possibly a timeout could be useful, too.
> > > > > >
> > > > > > Cheers,
> > > > > >
> > > > > > Paolo
> > > > > Hmm timeouts are kind of arbitrary.
> > > > > regular drivers basically derive them from hardware
> > > > > behaviour but with a generic driver like virtio it's harder.
> > > > > I guess we could add timeout as a config field, have
> > > > > device make a promise to the driver.
> > > > >
> > > > > Making the wait interruptible seems more reasonable.
> > > >
> > > > Yes, but I think we still need this patch for -net and -stable.
> > > >
> > > > Thanks
> > > I was referring to Paolo's idea of having a timeout.
> >
> >
> > Ok, I think we're fine with this patch. Any chance to merge this or do I
> > need to resend?
> >
> > Thanks
>
> Last question: do we want cpu_relax here now? Or is cond_resched
> sufficient?

(Have answered in another thread)

I think we need cpu_relax() since there could be no high priority task
in the current cpu so we still need to relax.

Thanks

>
> >
> > >
>
Re: [PATCH net] virtio-net: add cond_resched() to the command waiting loop
Posted by Jason Wang 3 years, 5 months ago
On Wed, Oct 12, 2022 at 11:19 AM Jason Wang <jasowang@redhat.com> wrote:
>
> On Tue, Oct 11, 2022 at 1:11 AM Michael S. Tsirkin <mst@redhat.com> wrote:
> >
> > On Sun, Oct 09, 2022 at 01:58:53PM +0800, Jason Wang wrote:
> > >
> > > 在 2022/9/8 13:19, Michael S. Tsirkin 写道:
> > > > On Thu, Sep 08, 2022 at 10:21:45AM +0800, Jason Wang wrote:
> > > > > 在 2022/9/7 15:46, Michael S. Tsirkin 写道:
> > > > > > On Wed, Sep 07, 2022 at 09:07:20AM +0200, Paolo Abeni wrote:
> > > > > > > On Wed, 2022-09-07 at 10:09 +0800, Jason Wang wrote:
> > > > > > > > On Tue, Sep 6, 2022 at 6:56 PM Paolo Abeni <pabeni@redhat.com> wrote:
> > > > > > > > > On Mon, 2022-09-05 at 15:49 +0800, Jason Wang wrote:
> > > > > > > > > > On Mon, Sep 5, 2022 at 3:15 PM Michael S. Tsirkin <mst@redhat.com> wrote:
> > > > > > > > > > > On Mon, Sep 05, 2022 at 12:53:41PM +0800, Jason Wang wrote:
> > > > > > > > > > > > Adding cond_resched() to the command waiting loop for a better
> > > > > > > > > > > > co-operation with the scheduler. This allows to give CPU a breath to
> > > > > > > > > > > > run other task(workqueue) instead of busy looping when preemption is
> > > > > > > > > > > > not allowed.
> > > > > > > > > > > >
> > > > > > > > > > > > What's more important. This is a must for some vDPA parent to work
> > > > > > > > > > > > since control virtqueue is emulated via a workqueue for those parents.
> > > > > > > > > > > >
> > > > > > > > > > > > Fixes: bda324fd037a ("vdpasim: control virtqueue support")
> > > > > > > > > > > That's a weird commit to fix. so it fixes the simulator?
> > > > > > > > > > Yes, since the simulator is using a workqueue to handle control virtueue.
> > > > > > > > > Uhmm... touching a driver for a simulator's sake looks a little weird.
> > > > > > > > Simulator is not the only one that is using a workqueue (but should be
> > > > > > > > the first).
> > > > > > > >
> > > > > > > > I can see  that the mlx5 vDPA driver is using a workqueue as well (see
> > > > > > > > mlx5_vdpa_kick_vq()).
> > > > > > > >
> > > > > > > > And in the case of VDUSE, it needs to wait for the response from the
> > > > > > > > userspace, this means cond_resched() is probably a must for the case
> > > > > > > > like UP.
> > > > > > > >
> > > > > > > > > Additionally, if the bug is vdpasim, I think it's better to try to
> > > > > > > > > solve it there, if possible.
> > > > > > > > >
> > > > > > > > > Looking at vdpasim_net_work() and vdpasim_blk_work() it looks like
> > > > > > > > > neither needs a process context, so perhaps you could rework it to run
> > > > > > > > > the work_fn() directly from vdpasim_kick_vq(), at least for the control
> > > > > > > > > virtqueue?
> > > > > > > > It's possible (but require some rework on the simulator core). But
> > > > > > > > considering we have other similar use cases, it looks better to solve
> > > > > > > > it in the virtio-net driver.
> > > > > > > I see.
> > > > > > >
> > > > > > > > Additionally, this may have better behaviour when using for the buggy
> > > > > > > > hardware (e.g the control virtqueue takes too long to respond). We may
> > > > > > > > consider switching to use interrupt/sleep in the future (but not
> > > > > > > > suitable for -net).
> > > > > > > Agreed. Possibly a timeout could be useful, too.
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Paolo
> > > > > > Hmm timeouts are kind of arbitrary.
> > > > > > regular drivers basically derive them from hardware
> > > > > > behaviour but with a generic driver like virtio it's harder.
> > > > > > I guess we could add timeout as a config field, have
> > > > > > device make a promise to the driver.
> > > > > >
> > > > > > Making the wait interruptible seems more reasonable.
> > > > >
> > > > > Yes, but I think we still need this patch for -net and -stable.
> > > > >
> > > > > Thanks
> > > > I was referring to Paolo's idea of having a timeout.
> > >
> > >
> > > Ok, I think we're fine with this patch. Any chance to merge this or do I
> > > need to resend?
> > >
> > > Thanks
> >
> > Last question: do we want cpu_relax here now? Or is cond_resched
> > sufficient?
>
> (Have answered in another thread)
>
> I think we need cpu_relax() since there could be no high priority task
> in the current cpu so we still need to relax.
>
> Thanks

Michael, does this answer make sense? If yes, would you like to ack the patch?

Thanks

>
> >
> > >
> > > >
> >