[RFC v3 3/7] vsock: batch buffers in tx

Guo Zhi posted 7 patches 3 years, 7 months ago
[RFC v3 3/7] vsock: batch buffers in tx
Posted by Guo Zhi 3 years, 7 months ago
Vsock uses buffers in order, and for tx driver doesn't have to
know the length of the buffer. So we can do a batch for vsock if
in order negotiated, only write one used ring for a batch of buffers

Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
---
 drivers/vhost/vsock.c | 12 ++++++++++--
 1 file changed, 10 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 368330417bde..e08fbbb5439e 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -497,7 +497,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
 						 dev);
 	struct virtio_vsock_pkt *pkt;
-	int head, pkts = 0, total_len = 0;
+	int head, pkts = 0, total_len = 0, add = 0;
 	unsigned int out, in;
 	bool added = false;
 
@@ -551,10 +551,18 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
 		else
 			virtio_transport_free_pkt(pkt);
 
-		vhost_add_used(vq, head, 0);
+		if (!vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
+			vhost_add_used(vq, head, 0);
+		} else {
+			vq->heads[add].id = head;
+			vq->heads[add++].len = 0;
+		}
 		added = true;
 	} while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
 
+	/* If in order feature negotiaged, we can do a batch to increase performance */
+	if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER) && added)
+		vhost_add_used_n(vq, vq->heads, add);
 no_more_replies:
 	if (added)
 		vhost_signal(&vsock->dev, vq);
-- 
2.17.1
Re: [RFC v3 3/7] vsock: batch buffers in tx
Posted by Jason Wang 3 years, 7 months ago
在 2022/9/1 13:54, Guo Zhi 写道:
> Vsock uses buffers in order, and for tx driver doesn't have to
> know the length of the buffer. So we can do a batch for vsock if
> in order negotiated, only write one used ring for a batch of buffers
>
> Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
> ---
>   drivers/vhost/vsock.c | 12 ++++++++++--
>   1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 368330417bde..e08fbbb5439e 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -497,7 +497,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
>   	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
>   						 dev);
>   	struct virtio_vsock_pkt *pkt;
> -	int head, pkts = 0, total_len = 0;
> +	int head, pkts = 0, total_len = 0, add = 0;
>   	unsigned int out, in;
>   	bool added = false;
>   
> @@ -551,10 +551,18 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
>   		else
>   			virtio_transport_free_pkt(pkt);
>   
> -		vhost_add_used(vq, head, 0);
> +		if (!vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
> +			vhost_add_used(vq, head, 0);


I'd do this step by step.

1) switch to use vhost_add_used_n() for vsock, less copy_to_user() 
better performance
2) do in-order on top


> +		} else {
> +			vq->heads[add].id = head;
> +			vq->heads[add++].len = 0;


How can we guarantee that we are in the boundary of the heads array?

Btw in the case of in-order we don't need to record the heads, instead 
we just need to know the head of the last buffer, it reduces the stress 
on the cache.

Thanks


> +		}
>   		added = true;
>   	} while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
>   
> +	/* If in order feature negotiaged, we can do a batch to increase performance */
> +	if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER) && added)
> +		vhost_add_used_n(vq, vq->heads, add);
>   no_more_replies:
>   	if (added)
>   		vhost_signal(&vsock->dev, vq);

Re: [RFC v3 3/7] vsock: batch buffers in tx
Posted by Guo Zhi 3 years, 7 months ago

----- Original Message -----
> From: "jasowang" <jasowang@redhat.com>
> To: "Guo Zhi" <qtxuning1999@sjtu.edu.cn>, "eperezma" <eperezma@redhat.com>, "sgarzare" <sgarzare@redhat.com>, "Michael
> Tsirkin" <mst@redhat.com>
> Cc: "netdev" <netdev@vger.kernel.org>, "linux-kernel" <linux-kernel@vger.kernel.org>, "kvm list" <kvm@vger.kernel.org>,
> "virtualization" <virtualization@lists.linux-foundation.org>
> Sent: Wednesday, September 7, 2022 12:27:40 PM
> Subject: Re: [RFC v3 3/7] vsock: batch buffers in tx

> 在 2022/9/1 13:54, Guo Zhi 写道:
>> Vsock uses buffers in order, and for tx driver doesn't have to
>> know the length of the buffer. So we can do a batch for vsock if
>> in order negotiated, only write one used ring for a batch of buffers
>>
>> Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
>> ---
>>   drivers/vhost/vsock.c | 12 ++++++++++--
>>   1 file changed, 10 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>> index 368330417bde..e08fbbb5439e 100644
>> --- a/drivers/vhost/vsock.c
>> +++ b/drivers/vhost/vsock.c
>> @@ -497,7 +497,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work
>> *work)
>>   	struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
>>   						 dev);
>>   	struct virtio_vsock_pkt *pkt;
>> -	int head, pkts = 0, total_len = 0;
>> +	int head, pkts = 0, total_len = 0, add = 0;
>>   	unsigned int out, in;
>>   	bool added = false;
>>   
>> @@ -551,10 +551,18 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work
>> *work)
>>   		else
>>   			virtio_transport_free_pkt(pkt);
>>   
>> -		vhost_add_used(vq, head, 0);
>> +		if (!vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
>> +			vhost_add_used(vq, head, 0);
> 
> 
> I'd do this step by step.
> 
> 1) switch to use vhost_add_used_n() for vsock, less copy_to_user()
> better performance
> 2) do in-order on top
> 
> 
LGTM!, I think it is the correct way.

>> +		} else {
>> +			vq->heads[add].id = head;
>> +			vq->heads[add++].len = 0;
> 
> 
> How can we guarantee that we are in the boundary of the heads array?
> 
> Btw in the case of in-order we don't need to record the heads, instead
> we just need to know the head of the last buffer, it reduces the stress
> on the cache.
> 
> Thanks
> 
Yeah, I will change this and only copy last head for in order feature.

Thanks
> 
>> +		}
>>   		added = true;
>>   	} while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
>>   
>> +	/* If in order feature negotiaged, we can do a batch to increase performance
>> */
>> +	if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER) && added)
>> +		vhost_add_used_n(vq, vq->heads, add);
>>   no_more_replies:
>>   	if (added)
>>   		vhost_signal(&vsock->dev, vq);
Re: [RFC v3 3/7] vsock: batch buffers in tx
Posted by Eugenio Perez Martin 3 years, 7 months ago
On Thu, Sep 1, 2022 at 7:55 AM Guo Zhi <qtxuning1999@sjtu.edu.cn> wrote:
>
> Vsock uses buffers in order, and for tx driver doesn't have to
> know the length of the buffer. So we can do a batch for vsock if
> in order negotiated, only write one used ring for a batch of buffers
>
> Signed-off-by: Guo Zhi <qtxuning1999@sjtu.edu.cn>
> ---
>  drivers/vhost/vsock.c | 12 ++++++++++--
>  1 file changed, 10 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 368330417bde..e08fbbb5439e 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -497,7 +497,7 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
>         struct vhost_vsock *vsock = container_of(vq->dev, struct vhost_vsock,
>                                                  dev);
>         struct virtio_vsock_pkt *pkt;
> -       int head, pkts = 0, total_len = 0;
> +       int head, pkts = 0, total_len = 0, add = 0;
>         unsigned int out, in;
>         bool added = false;
>
> @@ -551,10 +551,18 @@ static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
>                 else
>                         virtio_transport_free_pkt(pkt);
>
> -               vhost_add_used(vq, head, 0);
> +               if (!vhost_has_feature(vq, VIRTIO_F_IN_ORDER)) {
> +                       vhost_add_used(vq, head, 0);
> +               } else {
> +                       vq->heads[add].id = head;
> +                       vq->heads[add++].len = 0;

Knowing that the descriptors are used in order we can save a few
memory writes to the vq->heads[] array. vhost.c is checking for the
feature in_order anyway.

Thanks!

> +               }
>                 added = true;
>         } while(likely(!vhost_exceeds_weight(vq, ++pkts, total_len)));
>
> +       /* If in order feature negotiaged, we can do a batch to increase performance */
> +       if (vhost_has_feature(vq, VIRTIO_F_IN_ORDER) && added)
> +               vhost_add_used_n(vq, vq->heads, add);
>  no_more_replies:
>         if (added)
>                 vhost_signal(&vsock->dev, vq);
> --
> 2.17.1
>