[PATCH 0/2] keepalive default

Vladimir Sementsov-Ogievskiy posted 2 patches 3 years, 9 months ago
Test checkpatch passed
Test docker-mingw@fedora failed
Test FreeBSD passed
Test docker-quick@centos7 passed
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20200708191540.28455-1-vsementsov@virtuozzo.com
Maintainers: Gerd Hoffmann <kraxel@redhat.com>, "Daniel P. Berrangé" <berrange@redhat.com>, Markus Armbruster <armbru@redhat.com>, Eric Blake <eblake@redhat.com>
[PATCH 0/2] keepalive default
Posted by Vladimir Sementsov-Ogievskiy 3 years, 9 months ago
Hi all!

We realized that keepalive is almost useless with the default of 2 hours
in /proc/sys/net/ipv4/tcp_keepalive_time. Forcing the user to tune keepalive
for the whole system doesn't seem right; it's better to set it up per socket.

This series suggests setting some defaults for the keepalive parameters,
as well as enabling keepalive itself by default.

Keepalive helps us avoid hanging in I/O when the other side is down.
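
To illustrate the idea, here is a minimal sketch (placeholder values, not the
exact code or defaults from the patches) of what per-socket keepalive tuning
looks like on Linux:

#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Sketch only: enable keepalive on one TCP socket with shortened probe
 * timing.  The numbers are placeholders, not the series' actual defaults. */
static int example_enable_keepalive(int sock)
{
    int on = 1;
    int idle = 30;      /* seconds of silence before the first probe */
    int interval = 10;  /* seconds between probes */
    int count = 6;      /* unanswered probes before the peer is declared dead */

    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
                   &idle, sizeof(idle)) < 0 ||
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL,
                   &interval, sizeof(interval)) < 0 ||
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT,
                   &count, sizeof(count)) < 0) {
        return -1;
    }
    return 0;
}

With these placeholder numbers a dead peer is noticed after roughly
30 + 10 * 6 = 90 seconds of silence, instead of after the two-hour
system-wide default.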

Vladimir Sementsov-Ogievskiy (2):
  sockets: keep-alive settings
  util/qemu-sockets: make keep-alive enabled by default

 qapi/sockets.json   | 33 +++++++++++++++++-
 util/qemu-sockets.c | 81 +++++++++++++++++++++++++++++++++++++++------
 2 files changed, 102 insertions(+), 12 deletions(-)

-- 
2.21.0


Re: [PATCH 0/2] keepalive default
Posted by no-reply@patchew.org 3 years, 9 months ago
Patchew URL: https://patchew.org/QEMU/20200708191540.28455-1-vsementsov@virtuozzo.com/



Hi,

This series failed the docker-mingw@fedora build test. Please find the testing commands and
their output below. If you have Docker installed, you can probably reproduce it
locally.

=== TEST SCRIPT BEGIN ===
#! /bin/bash
export ARCH=x86_64
make docker-image-fedora V=1 NETWORK=1
time make docker-test-mingw@fedora J=14 NETWORK=1
=== TEST SCRIPT END ===

  CC      hw/hppa/trace.o
In file included from /tmp/qemu-test/src/util/qemu-sockets.c:24:
/tmp/qemu-test/src/util/qemu-sockets.c: In function 'inet_set_keepalive':
/tmp/qemu-test/src/util/qemu-sockets.c:469:46: error: 'TCP_KEEPIDLE' undeclared (first use in this function)
  469 |     ret = qemu_setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &val, sizeof(val));
      |                                              ^~~~~~~~~~~~
/tmp/qemu-test/src/include/qemu-common.h:48:31: note: in definition of macro 'qemu_setsockopt'
---
/tmp/qemu-test/src/include/qemu-common.h:48:31: note: in definition of macro 'qemu_setsockopt'
   48 |     setsockopt(sockfd, level, optname, (const void *)optval, optlen)
      |                               ^~~~~~~
/tmp/qemu-test/src/util/qemu-sockets.c:476:46: error: 'TCP_KEEPINTVL' undeclared (first use in this function)
  476 |     ret = qemu_setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &val, sizeof(val));
      |                                              ^~~~~~~~~~~~~
/tmp/qemu-test/src/include/qemu-common.h:48:31: note: in definition of macro 'qemu_setsockopt'
   48 |     setsockopt(sockfd, level, optname, (const void *)optval, optlen)
      |                               ^~~~~~~
/tmp/qemu-test/src/util/qemu-sockets.c:483:46: error: 'TCP_KEEPCNT' undeclared (first use in this function)
  483 |     ret = qemu_setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &val, sizeof(val));
      |                                              ^~~~~~~~~~~
/tmp/qemu-test/src/include/qemu-common.h:48:31: note: in definition of macro 'qemu_setsockopt'
   48 |     setsockopt(sockfd, level, optname, (const void *)optval, optlen)
      |                               ^~~~~~~
make: *** [/tmp/qemu-test/src/rules.mak:69: util/qemu-sockets.o] Error 1
make: *** Waiting for unfinished jobs....
Traceback (most recent call last):
  File "./tests/docker/docker.py", line 669, in <module>
---
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['sudo', '-n', 'docker', 'run', '--label', 'com.qemu.instance.uuid=bab255245d94432aabefeb4189c6aced', '-u', '1001', '--security-opt', 'seccomp=unconfined', '--rm', '-e', 'TARGET_LIST=', '-e', 'EXTRA_CONFIGURE_OPTS=', '-e', 'V=', '-e', 'J=14', '-e', 'DEBUG=', '-e', 'SHOW_ENV=', '-e', 'CCACHE_DIR=/var/tmp/ccache', '-v', '/home/patchew/.cache/qemu-docker-ccache:/var/tmp/ccache:z', '-v', '/var/tmp/patchew-tester-tmp-mck04b5u/src/docker-src.2020-07-08-15.38.08.17376:/var/tmp/qemu:z,ro', 'qemu:fedora', '/var/tmp/qemu/run', 'test-mingw']' returned non-zero exit status 2.
filter=--filter=label=com.qemu.instance.uuid=bab255245d94432aabefeb4189c6aced
make[1]: *** [docker-run] Error 1
make[1]: Leaving directory `/var/tmp/patchew-tester-tmp-mck04b5u/src'
make: *** [docker-run-test-mingw@fedora] Error 2

real    3m11.264s
user    0m8.976s


The full log is available at
http://patchew.org/logs/20200708191540.28455-1-vsementsov@virtuozzo.com/testing.docker-mingw@fedora/?type=message.
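
For context: the errors above occur because TCP_KEEPIDLE, TCP_KEEPINTVL and
TCP_KEEPCNT are Linux-specific socket options that the mingw/Windows headers
do not declare. A sketch of one way such a call could be guarded (assuming
POSIX headers; this is not the series' actual fix):

#include <errno.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Sketch: compile the tuning only where the macro exists and report the
 * option as unsupported elsewhere.  The same pattern would cover
 * TCP_KEEPINTVL and TCP_KEEPCNT. */
static int example_set_keep_idle(int sock, int idle_secs)
{
#ifdef TCP_KEEPIDLE
    return setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
                      &idle_secs, sizeof(idle_secs));
#else
    (void)sock;
    (void)idle_secs;
    errno = ENOTSUP;
    return -1;
#endif
}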
---
Email generated automatically by Patchew [https://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [PATCH 0/2] keepalive default
Posted by Daniel P. Berrangé 3 years, 9 months ago
On Wed, Jul 08, 2020 at 10:15:37PM +0300, Vladimir Sementsov-Ogievskiy wrote:
> Hi all!
> 
> We realized that keepalive is almost useless with the default of 2 hours
> in /proc/sys/net/ipv4/tcp_keepalive_time. Forcing the user to tune keepalive
> for the whole system doesn't seem right; it's better to set it up per socket.

Adding the ability to explicitly configure the keepalive settings makes
sense for QEMU. Completely ignoring system defaults when no explicit
settings are given though is not valid IMHO.
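
Concretely, the behaviour asked for is roughly the following sketch
(hypothetical structure and field names, not QEMU's actual option handling):
apply keepalive tuning only when the user gave explicit values, and leave
the kernel's defaults untouched otherwise.

#include <stdbool.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netinet/tcp.h>

/* Hypothetical options struct: has_idle records whether the user set it. */
struct example_keepalive_opts {
    bool enable;
    bool has_idle;
    int idle;
};

static int example_apply_keepalive(int sock,
                                   const struct example_keepalive_opts *o)
{
    int on = 1;

    if (!o->enable) {
        return 0;   /* nothing requested: system defaults stay in effect */
    }
    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0) {
        return -1;
    }
#ifdef TCP_KEEPIDLE
    if (o->has_idle &&
        setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE,
                   &o->idle, sizeof(o->idle)) < 0) {
        return -1;
    }
#endif
    /* fields the user did not set keep the sysctl defaults */
    return 0;
}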


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH 0/2] keepalive default
Posted by Eric Blake 3 years, 9 months ago
On 7/9/20 3:35 AM, Daniel P. Berrangé wrote:
> On Wed, Jul 08, 2020 at 10:15:37PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>> Hi all!
>>
>> We realized that keepalive is almost useless with the default of 2 hours
>> in /proc/sys/net/ipv4/tcp_keepalive_time. Forcing the user to tune keepalive
>> for the whole system doesn't seem right; it's better to set it up per socket.
> 
> Adding the ability to explicitly configure the keepalive settings makes
> sense for QEMU. Completely ignoring system defaults when no explicit
> settings are given though is not valid IMHO.

We already have the ability to add per-socket keepalive (see commit 
aec21d3175, in 4.2).  I guess what you are trying to further do is 
determine whether the default value for that field, when not explicitly 
specified by the user, can have saner semantics (default off for chardev 
sockets, default on for nbd clients where retry was enabled).  But since 
you already have to explicitly opt-in to nbd retry, what's so hard about 
opting in to keepalive at the same time, other than more typing?  Given 
that the easiest way to do this is via a machine-coded generation of the 
command line or QMP command, it doesn't seem that hard to just keep 
things as they are with explicit opt-in to per-socket keepalive.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org


Re: [PATCH 0/2] keepalive default
Posted by Vladimir Sementsov-Ogievskiy 3 years, 9 months ago
09.07.2020 18:34, Eric Blake wrote:
> On 7/9/20 3:35 AM, Daniel P. Berrangé wrote:
>> On Wed, Jul 08, 2020 at 10:15:37PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>> Hi all!
>>>
>>> We realized that keepalive is almost useless with the default of 2 hours
>>> in /proc/sys/net/ipv4/tcp_keepalive_time. Forcing the user to tune keepalive
>>> for the whole system doesn't seem right; it's better to set it up per socket.
>>
>> Adding the ability to explicitly configure the keepalive settings makes
>> sense for QEMU. Completely ignoring system defaults when no explicit
>> settings are given though is not valid IMHO.
> 
> We already have the ability to add per-socket keepalive (see commit aec21d3175, in 4.2).  I guess what you are trying to further do is determine whether the default value for that field, when not explicitly specified by the user, can have saner semantics (default off for chardev sockets, default on for nbd clients where retry was enabled).  But since you already have to explicitly opt-in to nbd retry, what's so hard about opting in to keepalive at the same time, other than more typing?  Given that the easiest way to do this is via a machine-coded generation of the command line or QMP command, it doesn't seem that hard to just keep things as they are with explicit opt-in to per-socket keepalive.
> 

Hmm. Looking at the code, I remember that reconnect is not optional; it works by default now. The option we have is "reconnect-delay", which only specifies after how many seconds we automatically fail requests instead of retrying them (0 seconds by default). Still, NBD tries to reconnect in the background anyway.

In our downstream we currently have an old version of the nbd-reconnect interface, with a non-zero "delay" enabled by default.

-- 
Best regards,
Vladimir

Re: [PATCH 0/2] keepalive default
Posted by Vladimir Sementsov-Ogievskiy 3 years, 9 months ago
09.07.2020 20:14, Vladimir Sementsov-Ogievskiy wrote:
> 09.07.2020 18:34, Eric Blake wrote:
>> On 7/9/20 3:35 AM, Daniel P. Berrangé wrote:
>>> On Wed, Jul 08, 2020 at 10:15:37PM +0300, Vladimir Sementsov-Ogievskiy wrote:
>>>> Hi all!
>>>>
>>>> We realized that keepalive is almost useless with the default of 2 hours
>>>> in /proc/sys/net/ipv4/tcp_keepalive_time. Forcing the user to tune keepalive
>>>> for the whole system doesn't seem right; it's better to set it up per socket.
>>>
>>> Adding the ability to explicitly configure the keepalive settings makes
>>> sense for QEMU. Completely ignoring system defaults when no explicit
>>> settings are given though is not valid IMHO.
>>
>> We already have the ability to add per-socket keepalive (see commit aec21d3175, in 4.2).  I guess what you are trying to further do is determine whether the default value for that field, when not explicitly specified by the user, can have saner semantics (default off for chardev sockets, default on for nbd clients where retry was enabled).  But since you already have to explicitly opt-in to nbd retry, what's so hard about opting in to keepalive at the same time, other than more typing?  Given that the easiest way to do this is via a machine-coded generation of the command line or QMP command, it doesn't seem that hard to just keep things as they are with explicit opt-in to per-socket keepalive.
>>
> 
> Hmm. Looking at the code, I remember that reconnect is not optional; it works by default now. The option we have is "reconnect-delay", which only specifies after how many seconds we automatically fail requests instead of retrying them (0 seconds by default). Still, NBD tries to reconnect in the background anyway.
> 
> In our downstream we currently have an old version of the nbd-reconnect interface, with a non-zero "delay" enabled by default.
> 


And now we need to migrate to the upstream code. Hmm. So I have several options:

1. Set a downstream default for reconnect-delay of something like 5 min. [the simplest thing to do]
2. Set the upstream default to 5 min. [needs a discussion, but may be useful]
3. Force all users (I mean not customers, but other teams (and even another company)) to set the reconnect-delay option. [just for completeness, I will not go this way]


So, what do you think about [2]? This includes:

  - a non-zero reconnect-delay by default, so requests wait some time for a reconnect and are retried
  - keep-alive enabled by default, with some default keep-alive parameters, so that we don't hang in recvmsg for a long time


Some other related ideas:

  - non-blocking nbd_open
    - move open to a coroutine
    - use a non-blocking socket connect (for example qio_channel_socket_connect_async, which runs connect in a thread). Currently, if you try to blockdev-add an unavailable NBD host, the VM hangs during the connect() call, which may take about a minute
  - make blockdev-add an async QMP command (Kevin's series)
  - allow reconnect on open

Also, I recently noticed that bdrv_close() may hang due to reconnect: it does bdrv_flush(), which waits for NBD to reconnect. This seems inconvenient; we should probably disable reconnect before bdrv_drained_begin() in bdrv_close().

-- 
Best regards,
Vladimir