[Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running

[Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Ning Bo 4 years, 8 months ago
If a program on the host communicates with a vsock device in the guest via
'vsock://', but the device is not ready, the 'connect' syscall will
block and then time out after 2 seconds by default (the timeout is
defined in the kernel: #define VSOCK_DEFAULT_CONNECT_TIMEOUT (2 * HZ)).
We can avoid this case if QEMU reports an event when the vsock is
ready and the program waits for the event before connecting.

Report a vsock running event so that the upper application can
control the boot sequence.
See https://github.com/kata-containers/runtime/pull/1918
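
For illustration, a host program could wait for the event before calling
connect(), along these lines (a minimal sketch: the /tmp/qmp.sock path is
made up for the example, and a naive substring scan stands in for a real
JSON parser):

  #include <stdio.h>
  #include <string.h>
  #include <unistd.h>
  #include <sys/socket.h>
  #include <sys/un.h>

  int main(void)
  {
      struct sockaddr_un sun = { .sun_family = AF_UNIX };
      const char *caps = "{\"execute\": \"qmp_capabilities\"}\n";
      char buf[4096];
      ssize_t n;
      int fd = socket(AF_UNIX, SOCK_STREAM, 0);

      strncpy(sun.sun_path, "/tmp/qmp.sock", sizeof(sun.sun_path) - 1);
      if (connect(fd, (struct sockaddr *)&sun, sizeof(sun)) < 0) {
          perror("connect");
          return 1;
      }

      /* Leave greeting mode so QEMU starts delivering events */
      write(fd, caps, strlen(caps));

      while ((n = read(fd, buf, sizeof(buf) - 1)) > 0) {
          buf[n] = '\0';
          if (strstr(buf, "\"VSOCK_RUNNING\"") &&
              strstr(buf, "\"running\": true")) {
              break;  /* the vsock is up; connect() will not time out now */
          }
      }
      close(fd);
      return 0;
  }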

Signed-off-by: Ning Bo <ning.bo9@zte.com.cn>
---
v2: fix typo
---
 hw/virtio/vhost-vsock.c |  3 +++
 qapi/char.json          | 22 ++++++++++++++++++++++
 2 files changed, 25 insertions(+)

diff --git a/hw/virtio/vhost-vsock.c b/hw/virtio/vhost-vsock.c
index 0371493..a5920fd 100644
--- a/hw/virtio/vhost-vsock.c
+++ b/hw/virtio/vhost-vsock.c
@@ -22,6 +22,7 @@
 #include "qemu/iov.h"
 #include "qemu/module.h"
 #include "monitor/monitor.h"
+#include "qapi/qapi-events-char.h"
 
 enum {
     VHOST_VSOCK_SAVEVM_VERSION = 0,
@@ -68,6 +69,8 @@ static int vhost_vsock_set_running(VHostVSock *vsock, int start)
     if (ret < 0) {
         return -errno;
     }
+    qapi_event_send_vsock_running(vsock->conf.guest_cid, start != 0);
+
     return 0;
 }
 
diff --git a/qapi/char.json b/qapi/char.json
index a6e81ac..4cfbcf2 100644
--- a/qapi/char.json
+++ b/qapi/char.json
@@ -570,3 +570,25 @@
 { 'event': 'VSERPORT_CHANGE',
   'data': { 'id': 'str',
             'open': 'bool' } }
+
+##
+# @VSOCK_RUNNING:
+#
+# Emitted when the guest changes the vsock status.
+#
+# @cid: guest context ID
+#
+# @running: true if the vsock is running
+#
+# Since: 4.2
+#
+# Example:
+#
+# <- { "event": "VSOCK_RUNNING",
+#      "data": { "cid": "123456", "running": true },
+#      "timestamp": { "seconds": 1401385907, "microseconds": 422329 } }
+#
+##
+{ 'event': 'VSOCK_RUNNING',
+  'data': { 'cid': 'uint64',
+            'running': 'bool' } }
-- 
2.9.5


Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Stefan Hajnoczi 4 years, 8 months ago
On Mon, Aug 05, 2019 at 11:32:31AM +0800, Ning Bo wrote:
> If a program on the host communicates with a vsock device in the guest via
> 'vsock://', but the device is not ready, the 'connect' syscall will
> block and then time out after 2 seconds by default (the timeout is
> defined in the kernel: #define VSOCK_DEFAULT_CONNECT_TIMEOUT (2 * HZ)).
> We can avoid this case if QEMU reports an event when the vsock is
> ready and the program waits for the event before connecting.
> 
> Report a vsock running event so that the upper application can
> control the boot sequence.
> See https://github.com/kata-containers/runtime/pull/1918

Please describe the issue with connect(2) in detail.  Are you observing
that connect(2) always times out when called before the guest driver
has set the virtio-vsock status register to
VIRTIO_CONFIG_S_DRIVER_OK?

I think that adding a QMP event is working around the issue rather than
fixing the root cause.  This is probably a vhost_vsock.ko problem and
should be fixed there.

Stefan
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by ning.bo9@zte.com.cn 4 years, 4 months ago
Let me describe the issue with an example via `nc-vsock`:

Let's assume the Guest cid is 3.
Execute 'rmmod vmw_vsock_virtio_transport' in the Guest,
then execute 'while true; do ./nc-vsock 3 1234; done' in the Host.

Host                             Guest
                                 # rmmod vmw_vsock_virtio_transport

# while true; do ./nc-vsock 3 1234; done
(after 2 seconds)
connect: Connection timed out
(after 2 seconds)
connect: Connection timed out
...

                                 # modprobe vmw_vsock_virtio_transport

connect: Connection reset by peer
connect: Connection reset by peer
connect: Connection reset by peer
...

                                 # nc-vsock -l 1234
                                 Connection from cid 2 port ***...
(stop printing)


The above process simulates the communication between
`kata-runtime` and `kata-agent` after starting the Guest.
In order to connect to `kata-agent` as soon as possible,
`kata-runtime` continuously tries to connect to `kata-agent` in a loop;
see https://github.com/kata-containers/runtime/blob/d054556f60f092335a22a288011fa29539ad4ccc/vendor/github.com/kata-containers/agent/protocols/client/client.go#L327
But when the vsock device in the Guest is not ready, each connection
attempt blocks for 2 seconds. This actually slows down
the entire startup of `kata-runtime`.


> I think that adding a QMP event is working around the issue rather than
> fixing the root cause.  This is probably a vhost_vsock.ko problem and
> should be fixed there.

After looking at the source code of vhost_vsock.ko,
I think it is possible to optimize the logic there too.
A simple patch follows. Do you think the modification is appropriate?

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 9f57736f..8fad67be 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -51,6 +51,7 @@ struct vhost_vsock {
 	atomic_t queued_replies;

 	u32 guest_cid;
+	u32 state;
 };

 static u32 vhost_transport_get_local_cid(void)
@@ -497,6 +541,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)

 		mutex_unlock(&vq->mutex);
 	}
+	vsock->state = 1;

 	mutex_unlock(&vsock->dev.mutex);
 	return 0;
@@ -535,6 +580,7 @@ static int vhost_vsock_stop(struct vhost_vsock *vsock)
 		vq->private_data = NULL;
 		mutex_unlock(&vq->mutex);
 	}
+	vsock->state = 0;

 err:
 	mutex_unlock(&vsock->dev.mutex);
@@ -786,6 +832,27 @@ static struct miscdevice vhost_vsock_misc = {
 	.fops = &vhost_vsock_fops,
 };

+int vhost_transport_connect(struct vsock_sock *vsk) {
+	struct vhost_vsock *vsock;
+
+	rcu_read_lock();
+
+	/* Find the vhost_vsock according to guest context id  */
+	vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
+	if (!vsock) {
+		rcu_read_unlock();
+		return -ENODEV;
+	}
+
+	rcu_read_unlock();
+
+	if (vsock->state == 1) {
+		return virtio_transport_connect(vsk);
+	} else {
+		return -ECONNRESET;
+	}
+}
+
 static struct virtio_transport vhost_transport = {
 	.transport = {
 		.get_local_cid            = vhost_transport_get_local_cid,
@@ -793,7 +860,7 @@ static struct virtio_transport vhost_transport = {
 		.init                     = virtio_transport_do_socket_init,
 		.destruct                 = virtio_transport_destruct,
 		.release                  = virtio_transport_release,
-		.connect                  = virtio_transport_connect,
+		.connect                  = vhost_transport_connect,
 		.shutdown                 = virtio_transport_shutdown,
 		.cancel_pkt               = vhost_transport_cancel_pkt,
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Stefan Hajnoczi 4 years, 4 months ago
On Thu, Nov 28, 2019 at 07:26:47PM +0800, ning.bo9@zte.com.cn wrote:
> Let me describe the issue with an example via `nc-vsock`:
> 
> Let's assume the Guest cid is 3.
> Execute 'rmmod vmw_vsock_virtio_transport' in the Guest,
> then execute 'while true; do ./nc-vsock 3 1234; done' in the Host.
> 
> Host                             Guest
>                                  # rmmod vmw_vsock_virtio_transport
> 
> # while true; do ./nc-vsock 3 1234; done
> (after 2 seconds)
> connect: Connection timed out
> (after 2 seconds)
> connect: Connection timed out
> ...
> 
>                                  # modprobe vmw_vsock_virtio_transport
> 
> connect: Connection reset by peer
> connect: Connection reset by peer
> connect: Connection reset by peer
> ...
> 
>                                  # nc-vsock -l 1234
>                                  Connection from cid 2 port ***...
> (stop printing)
> 
> 
> The above process simulates the communication between
> `kata-runtime` and `kata-agent` after starting the Guest.
> In order to connect to `kata-agent` as soon as possible,
> `kata-runtime` continuously tries to connect to `kata-agent` in a loop;
> see https://github.com/kata-containers/runtime/blob/d054556f60f092335a22a288011fa29539ad4ccc/vendor/github.com/kata-containers/agent/protocols/client/client.go#L327
> But when the vsock device in the Guest is not ready, each connection
> attempt blocks for 2 seconds. This actually slows down
> the entire startup of `kata-runtime`.

This can be done efficiently as follows:
1. kata-runtime listens on a vsock port
2. kata-agent-port=PORT is added to the kernel command-line options
3. kata-agent parses the port number and connects to the host

This eliminates the reconnection attempts.
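
A guest-side sketch of step 3 might look like this (kata-agent-port is the
option name proposed above; the /proc/cmdline parsing and error handling
are only illustrative):

  #include <stdio.h>
  #include <stdlib.h>
  #include <string.h>
  #include <sys/socket.h>
  #include <linux/vm_sockets.h>

  int main(void)
  {
      struct sockaddr_vm svm = {
          .svm_family = AF_VSOCK,
          .svm_cid = VMADDR_CID_HOST,  /* cid 2, i.e. the host */
      };
      char cmdline[4096] = "";
      char *opt;
      int fd;
      FILE *f = fopen("/proc/cmdline", "r");

      if (!f || !fgets(cmdline, sizeof(cmdline), f)) {
          return 1;
      }
      fclose(f);

      /* Pick up kata-agent-port=PORT from the kernel command line */
      opt = strstr(cmdline, "kata-agent-port=");
      if (!opt) {
          return 1;
      }
      svm.svm_port = strtoul(opt + strlen("kata-agent-port="), NULL, 10);

      fd = socket(AF_VSOCK, SOCK_STREAM, 0);
      if (connect(fd, (const struct sockaddr *)&svm, sizeof(svm)) < 0) {
          perror("connect");
          return 1;
      }
      /* ... speak the agent protocol over fd ... */
      return 0;
  }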

> > I think that adding a QMP event is working around the issue rather than
> > fixing the root cause.  This is probably a vhost_vsock.ko problem and
> > should be fixed there.
> 
> After looking at the source code of vhost_vsock.ko, 
> I think it is possible to optimize the logic here too.
> The simple patch is as follows. Do you think the modification is appropriate?
> 
> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> index 9f57736f..8fad67be 100644
> --- a/drivers/vhost/vsock.c
> +++ b/drivers/vhost/vsock.c
> @@ -51,6 +51,7 @@ struct vhost_vsock {
>  	atomic_t queued_replies;
> 
>  	u32 guest_cid;
> +	u32 state;
>  };
> 
>  static u32 vhost_transport_get_local_cid(void)
> @@ -497,6 +541,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
> 
>  		mutex_unlock(&vq->mutex);
>  	}
> +	vsock->state = 1;
> 
>  	mutex_unlock(&vsock->dev.mutex);
>  	return 0;
> @@ -535,6 +580,7 @@ static int vhost_vsock_stop(struct vhost_vsock *vsock)
>  		vq->private_data = NULL;
>  		mutex_unlock(&vq->mutex);
>  	}
> +	vsock->state = 0;
> 
>  err:
>  	mutex_unlock(&vsock->dev.mutex);
> @@ -786,6 +832,27 @@ static struct miscdevice vhost_vsock_misc = {
>  	.fops = &vhost_vsock_fops,
>  };
> 
> +int vhost_transport_connect(struct vsock_sock *vsk) {
> +	struct vhost_vsock *vsock;
> +
> +	rcu_read_lock();
> +
> +	/* Find the vhost_vsock according to guest context id  */
> +	vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
> +	if (!vsock) {
> +		rcu_read_unlock();
> +		return -ENODEV;
> +	}
> +
> +	rcu_read_unlock();
> +
> +	if (vsock->state == 1) {
> +		return virtio_transport_connect(vsk);
> +	} else {
> +		return -ECONNRESET;
> +	}
> +}
> +
>  static struct virtio_transport vhost_transport = {
>  	.transport = {
>  		.get_local_cid            = vhost_transport_get_local_cid,
> @@ -793,7 +860,7 @@ static struct virtio_transport vhost_transport = {
>  		.init                     = virtio_transport_do_socket_init,
>  		.destruct                 = virtio_transport_destruct,
>  		.release                  = virtio_transport_release,
> -		.connect                  = virtio_transport_connect,
> +		.connect                  = vhost_transport_connect,
>  		.shutdown                 = virtio_transport_shutdown,
>  		.cancel_pkt               = vhost_transport_cancel_pkt,

I'm not keen on adding a special case for vhost_vsock.ko connect.

Userspace APIs to avoid the 2 second wait already exist:

1. The SO_VM_SOCKETS_CONNECT_TIMEOUT socket option controls the connect
   timeout for this socket (see the sketch after this list).

2. Non-blocking connect allows the userspace process to do other things
   while a connection attempt is being made.
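
For reference, option 1 looks roughly like this (a sketch assuming guest
cid 3 and port 1234 from the reproducer above):

  #include <stdio.h>
  #include <sys/socket.h>
  #include <sys/time.h>
  #include <linux/vm_sockets.h>

  int main(void)
  {
      struct timeval tv = { .tv_sec = 0, .tv_usec = 100 * 1000 };  /* 100 ms */
      struct sockaddr_vm svm = {
          .svm_family = AF_VSOCK,
          .svm_cid = 3,
          .svm_port = 1234,
      };
      int fd = socket(AF_VSOCK, SOCK_STREAM, 0);

      /* Override the 2 second VSOCK_DEFAULT_CONNECT_TIMEOUT for this socket */
      if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_CONNECT_TIMEOUT,
                     &tv, sizeof(tv)) < 0) {
          perror("setsockopt");
          return 1;
      }

      if (connect(fd, (const struct sockaddr *)&svm, sizeof(svm)) < 0) {
          perror("connect");  /* gives up after ~100 ms instead of 2 s */
          return 1;
      }
      return 0;
  }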

But the best solution is the one I mentioned above.

Stefan
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by ning.bo9@zte.com.cn 4 years, 4 months ago
> This can be done efficiently as follows:
> 1. kata-runtime listens on a vsock port
> 2. kata-agent-port=PORT is added to the kernel command-line options
> 3. kata-agent parses the port number and connects to the host
> 
> This eliminates the reconnection attempts.

There will be an additional problem if we do this:
Who decides which port the `runtime` should listen on?

Consider the worst case:
the ports selected by two `runtime` instances running in parallel always conflict,
and this case is unavoidable even if we reduce the possibility of conflicts through algorithms,
because we don't have a daemon that can allocate a unique port to each `runtime`.


> Userspace APIs to avoid the 2 second wait already exist:
> 
> 1. The SO_VM_SOCKETS_CONNECT_TIMEOUT socket option controls the connect
>    timeout for this socket.

Yes, it has the same effect

> 2. Non-blocking connect allows the userspace process to do other things
>    while a connection attempt is being made.

I don't think the `runtime` has anything else to do except wait for the response from the `agent` at that moment.



Now let me sort out the currently known methods:
1. `runtime` does not connect until it receives the QMP event reported by QEMU when the `agent` opens the vsock device.
    - This method looks inappropriate now.
2. Adding a special case for vhost_vsock.ko.
    - Also inappropriate.
3. Connect to `runtime` from `agent`.
    - `runtime` may not be able to choose the right port.
4. Use the `SO_VM_SOCKETS_CONNECT_TIMEOUT` option.
    - The effect is similar to method 2, with no need to modify the kernel module code.

I have an additional question:
If using method 4, where `runtime` calls connect with the NONBLOCK option and a very short timeout in an infinite loop, the kernel may frequently create timers. Are there any other side effects?


Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Stefan Hajnoczi 4 years, 4 months ago
On Fri, Dec 13, 2019 at 03:11:54PM +0800, ning.bo9@zte.com.cn wrote:
> > This can be done efficiently as follows:
> > 1. kata-runtime listens on a vsock port
> > 2. kata-agent-port=PORT is added to the kernel command-line options
> > 3. kata-agent parses the port number and connects to the host
> > 
> > This eliminates the reconnection attempts.
> 
> There will be an additional problem if we do this:
> Who decides which port the `runtime` should listen on?

Let the host kernel automatically assign a port using VMADDR_PORT_ANY.
It works like this:

  struct sockaddr_vm svm = {
      .svm_family = AF_VSOCK,
      .svm_port = VMADDR_PORT_ANY,
      .svm_cid = VMADDR_CID_ANY,
  };

  int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
  ...
  if (bind(fd, (const struct sockaddr *)&svm, sizeof(svm)) < 0) {
      ...
  }

  socklen_t socklen = sizeof(svm);
  if (getsockname(fd, (struct sockaddr *)&svm, &socklen) < 0) {
      ...
  }

  printf("cid %u port %u\n", svm.svm_cid, svm.svm_port);

> Consider the worst case:
> the ports selected by two `runtime` instances running in parallel always conflict,
> and this case is unavoidable even if we reduce the possibility of conflicts through algorithms,
> because we don't have a daemon that can allocate a unique port to each `runtime`.

The kernel assigns unique ports and only fails if the entire port
namespace is exhausted.  The port namespace is 32 bits, so this is not a
real-world concern.

Does this information clarify how the runtime can connect to the guest
agent without loops or delays?

Stefan
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by ning.bo9@zte.com.cn 4 years, 4 months ago
> > There will be an additional problem if we do this:
> > Who decides which port the `runtime` should listen on?
> 
> Let the host kernel automatically assign a port using VMADDR_PORT_ANY.
> It works like this:
> 
>   struct sockaddr_vm svm = {
>       .svm_family = AF_VSOCK,
>       .svm_port = VMADDR_PORT_ANY,
>       .svm_cid = VMADDR_CID_ANY,
>   };
> 
>   int fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>   ...
>   if (bind(fd, (const struct sockaddr *)&svm, sizeof(svm)) < 0) {
>       ...
>   }
> 
>   socklen_t socklen = sizeof(svm);
>   if (getsockname(fd, (struct sockaddr *)&svm, &socklen) < 0) {
>       ...
>   }
> 
>   printf("cid %u port %u\n", svm.svm_cid, svm.svm_port);
> 
> > Consider the worst case:
> > the ports selected by two `runtime` instances running in parallel always conflict,
> > and this case is unavoidable even if we reduce the possibility of
> > conflicts through algorithms,
> > because we don't have a daemon that can allocate a unique port to each `runtime`.
> 
> The kernel assigns unique ports and only fails if the entire port
> namespace is exhausted.  The port namespace is 32 bits, so this is not a
> real-world concern.
> 
> Does this information clarify how the runtime can connect to the guest
> agent without loops or delays?

Thank you very much. I will do as you instructed above.
Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Michael S. Tsirkin 4 years, 4 months ago
On Thu, Dec 12, 2019 at 11:05:25AM +0000, Stefan Hajnoczi wrote:
> On Thu, Nov 28, 2019 at 07:26:47PM +0800, ning.bo9@zte.com.cn wrote:
> > Let me describe the issue with an example via `nc-vsock`:
> > 
> > Let's assume the Guest cid is 3.
> > Execute 'rmmod vmw_vsock_virtio_transport' in the Guest,
> > then execute 'while true; do ./nc-vsock 3 1234; done' in the Host.
> > 
> > Host                             Guest
> >                                  # rmmod vmw_vsock_virtio_transport
> > 
> > # while true; do ./nc-vsock 3 1234; done
> > (after 2 seconds)
> > connect: Connection timed out
> > (after 2 seconds)
> > connect: Connection timed out
> > ...
> > 
> >                                  # modprobe vmw_vsock_virtio_transport
> > 
> > connect: Connection reset by peer
> > connect: Connection reset by peer
> > connect: Connection reset by peer
> > ...
> > 
> >                                  # nc-vsock -l 1234
> >                                  Connection from cid 2 port ***...
> > (stop printing)
> > 
> > 
> > The above process simulates the communication between
> > `kata-runtime` and `kata-agent` after starting the Guest.
> > In order to connect to `kata-agent` as soon as possible,
> > `kata-runtime` continuously tries to connect to `kata-agent` in a loop;
> > see https://github.com/kata-containers/runtime/blob/d054556f60f092335a22a288011fa29539ad4ccc/vendor/github.com/kata-containers/agent/protocols/client/client.go#L327
> > But when the vsock device in the Guest is not ready, each connection
> > attempt blocks for 2 seconds. This actually slows down
> > the entire startup of `kata-runtime`.
> 
> This can be done efficiently as follows:
> 1. kata-runtime listens on a vsock port
> 2. kata-agent-port=PORT is added to the kernel command-line options
> 3. kata-agent parses the port number and connects to the host
> 
> This eliminates the reconnection attempts.

Then we'll get the same problem in reverse, won't we?
Agent must now be running before guest can boot ...
Or did I miss anything?

> > > I think that adding a QMP event is working around the issue rather than
> > > fixing the root cause.  This is probably a vhost_vsock.ko problem and
> > > should be fixed there.
> > 
> > After looking at the source code of vhost_vsock.ko, 
> > I think it is possible to optimize the logic here too.
> > The simple patch is as follows. Do you think the modification is appropriate?
> > 
> > diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
> > index 9f57736f..8fad67be 100644
> > --- a/drivers/vhost/vsock.c
> > +++ b/drivers/vhost/vsock.c
> > @@ -51,6 +51,7 @@ struct vhost_vsock {
> >  	atomic_t queued_replies;
> > 
> >  	u32 guest_cid;
> > +	u32 state;
> >  };
> > 
> >  static u32 vhost_transport_get_local_cid(void)
> > @@ -497,6 +541,7 @@ static int vhost_vsock_start(struct vhost_vsock *vsock)
> > 
> >  		mutex_unlock(&vq->mutex);
> >  	}
> > +	vsock->state = 1;
> > 
> >  	mutex_unlock(&vsock->dev.mutex);
> >  	return 0;
> > @@ -535,6 +580,7 @@ static int vhost_vsock_stop(struct vhost_vsock *vsock)
> >  		vq->private_data = NULL;
> >  		mutex_unlock(&vq->mutex);
> >  	}
> > +	vsock->state = 0;
> > 
> >  err:
> >  	mutex_unlock(&vsock->dev.mutex);
> > @@ -786,6 +832,27 @@ static struct miscdevice vhost_vsock_misc = {
> >  	.fops = &vhost_vsock_fops,
> >  };
> > 
> > +int vhost_transport_connect(struct vsock_sock *vsk) {
> > +	struct vhost_vsock *vsock;
> > +
> > +	rcu_read_lock();
> > +
> > +	/* Find the vhost_vsock according to guest context id  */
> > +	vsock = vhost_vsock_get(vsk->remote_addr.svm_cid);
> > +	if (!vsock) {
> > +		rcu_read_unlock();
> > +		return -ENODEV;
> > +	}
> > +
> > +	rcu_read_unlock();
> > +
> > +	if (vsock->state == 1) {
> > +		return virtio_transport_connect(vsk);
> > +	} else {
> > +		return -ECONNRESET;
> > +	}
> > +}
> > +
> >  static struct virtio_transport vhost_transport = {
> >  	.transport = {
> >  		.get_local_cid            = vhost_transport_get_local_cid,
> > @@ -793,7 +860,7 @@ static struct virtio_transport vhost_transport = {
> >  		.init                     = virtio_transport_do_socket_init,
> >  		.destruct                 = virtio_transport_destruct,
> >  		.release                  = virtio_transport_release,
> > -		.connect                  = virtio_transport_connect,
> > +		.connect                  = vhost_transport_connect,
> >  		.shutdown                 = virtio_transport_shutdown,
> >  		.cancel_pkt               = vhost_transport_cancel_pkt,
> 
> I'm not keen on adding a special case for vhost_vsock.ko connect.
> 
> Userspace APIs to avoid the 2 second wait already exist:
> 
> 1. The SO_VM_SOCKETS_CONNECT_TIMEOUT socket option controls the connect
>    timeout for this socket.
> 
> 2. Non-blocking connect allows the userspace process to do other things
>    while a connection attempt is being made.
> 
> But the best solution is the one I mentioned above.
> 
> Stefan



Re: [Qemu-devel] [PATCH v2] vhost-vsock: report QMP event when set running
Posted by Stefan Hajnoczi 4 years, 4 months ago
On Thu, Dec 12, 2019 at 06:24:55AM -0500, Michael S. Tsirkin wrote:
> On Thu, Dec 12, 2019 at 11:05:25AM +0000, Stefan Hajnoczi wrote:
> > On Thu, Nov 28, 2019 at 07:26:47PM +0800, ning.bo9@zte.com.cn wrote:
> > > Let me describe the issue with an example via `nc-vsock`:
> > > 
> > > Let's assume the Guest cid is 3.
> > > Execute 'rmmod vmw_vsock_virtio_transport' in the Guest,
> > > then execute 'while true; do ./nc-vsock 3 1234; done' in the Host.
> > > 
> > > Host                             Guest
> > >                                  # rmmod vmw_vsock_virtio_transport
> > > 
> > > # while true; do ./nc-vsock 3 1234; done
> > > (after 2 seconds)
> > > connect: Connection timed out
> > > (after 2 seconds)
> > > connect: Connection timed out
> > > ...
> > > 
> > >                                  # modprobe vmw_vsock_virtio_transport
> > > 
> > > connect: Connection reset by peer
> > > connect: Connection reset by peer
> > > connect: Connection reset by peer
> > > ...
> > > 
> > >                                  # nc-vsock -l 1234
> > >                                  Connection from cid 2 port ***...
> > > (stop printing)
> > > 
> > > 
> > > The above process simulates the communication between
> > > `kata-runtime` and `kata-agent` after starting the Guest.
> > > In order to connect to `kata-agent` as soon as possible,
> > > `kata-runtime` continuously tries to connect to `kata-agent` in a loop;
> > > see https://github.com/kata-containers/runtime/blob/d054556f60f092335a22a288011fa29539ad4ccc/vendor/github.com/kata-containers/agent/protocols/client/client.go#L327
> > > But when the vsock device in the Guest is not ready, each connection
> > > attempt blocks for 2 seconds. This actually slows down
> > > the entire startup of `kata-runtime`.
> > 
> > This can be done efficiently as follows:
> > 1. kata-runtime listens on a vsock port
> > 2. kata-agent-port=PORT is added to the kernel command-line options
> > 3. kata-agent parses the port number and connects to the host
> > 
> > This eliminates the reconnection attempts.
> 
> Then we'll get the same problem in reverse, won't we?
> Agent must now be running before guest can boot ...
> Or did I miss anything?

kata-runtime launches QEMU.  The QEMU guest runs kata-agent.  Therefore
it is guaranteed that kata-runtime's listen socket will be set up before
the agent executes.

Stefan