[PATCH 0/2] Send all the SVQ control commands in parallel

Hawkins Jiawei posted 2 patches 1 year ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/cover.1681732982.git.yin31149@gmail.com
Maintainers: Jason Wang <jasowang@redhat.com>
There is a newer version of this series
net/vhost-vdpa.c | 150 +++++++++++++++++++++++++++++++++++------------
1 file changed, 112 insertions(+), 38 deletions(-)
[PATCH 0/2] Send all the SVQ control commands in parallel
Posted by Hawkins Jiawei 1 year ago
This patchset allows QEMU to poll and check the device used buffer
after sending all SVQ control commands, instead of polling and checking
immediately after sending each SVQ control command, so that QEMU can
send all the SVQ control commands in parallel, which yields a
performance improvement.

I use vdpa_sim_net to simulate a vdpa device, refactor
vhost_vdpa_net_load() to call vhost_vdpa_net_load_mac() 30 times, and
refactor `net_vhost_vdpa_cvq_info.load` to call vhost_vdpa_net_load()
1000 times, to build a test environment for sending
multiple SVQ control commands. Monotonic time (in microseconds, via
g_get_monotonic_time()) to finish `net_vhost_vdpa_cvq_info.load`:

    QEMU                            monotonic time
--------------------------------------------------
not patched                              89202
--------------------------------------------------
patched                                  80455

This patchset resolves the GitLab issue at
https://gitlab.com/qemu-project/qemu/-/issues/1578.

Hawkins Jiawei (2):
  vdpa: rename vhost_vdpa_net_cvq_add()
  vdpa: send CVQ state load commands in parallel

 net/vhost-vdpa.c | 150 +++++++++++++++++++++++++++++++++++------------
 1 file changed, 112 insertions(+), 38 deletions(-)

-- 
2.25.1
Re: [PATCH 0/2] Send all the SVQ control commands in parallel
Posted by Eugenio Perez Martin 1 year ago
On Wed, Apr 19, 2023 at 1:50 PM Hawkins Jiawei <yin31149@gmail.com> wrote:
>
> This patchset allows QEMU to poll and check the device used buffer
> after sending all SVQ control commands, instead of polling and checking
> immediately after sending each SVQ control command, so that QEMU can
> send all the SVQ control commands in parallel, which have better
> performance improvement.
>
> I use vdpa_sim_net to simulate vdpa device, refactor
> vhost_vdpa_net_load() to call vhost_vdpa_net_load_mac() 30 times,
> refactor `net_vhost_vdpa_cvq_info.load` to call vhost_vdpa_net_load()
> 1000 times,

Maybe a little bit too high for real scenarios but it gives us a hint
for sure :). Maybe it is more realistic to send ~10 or ~100 commands?

>  to build a test environment for sending
> multiple SVQ control commands. Time in monotonic to
> finish `net_vhost_vdpa_cvq_info.load`:
>
>     QEMU                            monotonic time
> --------------------------------------------------
> not patched                              89202
> --------------------------------------------------
> patched                                  80455
>

Is time expressed in seconds or milliseconds? I'm going to assume ms.

So let's say all the time was spent in the context switch between qemu
and kernel, this is a save of (89202 - 80455)/30000 = 0.3 ms per
command?

Thanks!

> This patchset resolves the GitLab issue at
> https://gitlab.com/qemu-project/qemu/-/issues/1578.
>
> Hawkins Jiawei (2):
>   vdpa: rename vhost_vdpa_net_cvq_add()
>   vdpa: send CVQ state load commands in parallel
>
>  net/vhost-vdpa.c | 150 +++++++++++++++++++++++++++++++++++------------
>  1 file changed, 112 insertions(+), 38 deletions(-)
>
> --
> 2.25.1
>
Re: [PATCH 0/2] Send all the SVQ control commands in parallel
Posted by Hawkins Jiawei 1 year ago
On Thu, 20 Apr 2023 at 01:17, Eugenio Perez Martin <eperezma@redhat.com> wrote:
>
> On Wed, Apr 19, 2023 at 1:50 PM Hawkins Jiawei <yin31149@gmail.com> wrote:
> >
> > This patchset allows QEMU to poll and check the device used buffer
> > after sending all SVQ control commands, instead of polling and checking
> > immediately after sending each SVQ control command, so that QEMU can
> > send all the SVQ control commands in parallel, which have better
> > performance improvement.
> >
> > I use vdpa_sim_net to simulate vdpa device, refactor
> > vhost_vdpa_net_load() to call vhost_vdpa_net_load_mac() 30 times,
> > refactor `net_vhost_vdpa_cvq_info.load` to call vhost_vdpa_net_load()
> > 1000 times,
>
> Maybe a little bit too high for real scenarios but it gives us a hint
> for sure :). Maybe it is more realistic to send ~10 or ~100 commands?

Yes, calling vhost_vdpa_net_load() 1000 times is absolutely too high for
real scenarios. But because the time to execute vhost_vdpa_net_load_mac()
30 times is very short, the measured time would be unstable and fluctuate
greatly, so I call vhost_vdpa_net_load() 1000 times, hoping to get a more
stable result.

>
> >  to build a test environment for sending
> > multiple SVQ control commands. Time in monotonic to
> > finish `net_vhost_vdpa_cvq_info.load`:
> >
> >     QEMU                            monotonic time
> > --------------------------------------------------
> > not patched                              89202
> > --------------------------------------------------
> > patched                                  80455
> >
>
> Is time expressed in seconds or milliseconds? I'm going to assume ms.

I got this by calling g_get_monotonic_time(), it should be microseconds
according to [1].

[1]. https://docs.gtk.org/glib/func.get_monotonic_time.html

>
> So let's say all the time was spent in the context switch between qemu
> and kernel, this is a save of (89202 - 80455)/30000 = 0.3 ms per
> command?

Yes, I think it is a saving of about 0.3 microseconds per command.

Thanks!

>
> Thanks!
>
> > This patchset resolves the GitLab issue at
> > https://gitlab.com/qemu-project/qemu/-/issues/1578.
> >
> > Hawkins Jiawei (2):
> >   vdpa: rename vhost_vdpa_net_cvq_add()
> >   vdpa: send CVQ state load commands in parallel
> >
> >  net/vhost-vdpa.c | 150 +++++++++++++++++++++++++++++++++++------------
> >  1 file changed, 112 insertions(+), 38 deletions(-)
> >
> > --
> > 2.25.1
> >
>