vhost: stick to -errno error return convention

[PATCH 04/10] chardev/char-fe: don't allow EAGAIN from blocking read

Posted by Roman Kagan 4 years, 3 months ago

As its name suggests, ChardevClass.chr_sync_read is supposed to do a
blocking read.  The only implementation of it, tcp_chr_sync_read, does
set the underlying io channel to the blocking mode indeed.

Therefore a failure return with EAGAIN is not expected from this call.

So do not retry it in qemu_chr_fe_read_all; instead place an assertion
that it doesn't fail with EAGAIN.

Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
---
 chardev/char-fe.c | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/chardev/char-fe.c b/chardev/char-fe.c
index 7789f7be9c..f94efe928e 100644
--- a/chardev/char-fe.c
+++ b/chardev/char-fe.c
@@ -68,13 +68,10 @@ int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
     }
 
     while (offset < len) {
-    retry:
         res = CHARDEV_GET_CLASS(s)->chr_sync_read(s, buf + offset,
                                                   len - offset);
-        if (res == -1 && errno == EAGAIN) {
-            g_usleep(100);
-            goto retry;
-        }
+        /* ->chr_sync_read should block */
+        assert(!(res < 0 && errno == EAGAIN));
 
         if (res == 0) {
             break;
-- 
2.33.1

Re: [PATCH 04/10] chardev/char-fe: don't allow EAGAIN from blocking read

Posted by Marc-André Lureau 4 years, 3 months ago

Hi

On Thu, Nov 11, 2021 at 7:44 PM Roman Kagan <rvkagan@yandex-team.ru> wrote:

> As its name suggests, ChardevClass.chr_sync_read is supposed to do a
> blocking read.  The only implementation of it, tcp_chr_sync_read, does
> set the underlying io channel to the blocking mode indeed.
>
> Therefore a failure return with EAGAIN is not expected from this call.
>
> So do not retry it in qemu_chr_fe_read_all; instead place an assertion
> that it doesn't fail with EAGAIN.
>

The code was introduced in:
commit 7b0bfdf52d694c9a3a96505aa42ce3f8d63acd35
Author: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Date:   Tue May 27 15:03:48 2014 +0300

    Add chardev API qemu_chr_fe_read_all

Also touched later by Daniel in:
commit 53628efbc8aa7a7ab5354d24b971f4d69452151d
Author: Daniel P. Berrangé <berrange@redhat.com>
Date:   Thu Mar 31 16:29:27 2016 +0100

    char: fix broken EAGAIN retry on OS-X due to errno clobbering



> Signed-off-by: Roman Kagan <rvkagan@yandex-team.ru>
> ---
>  chardev/char-fe.c | 7 ++-----
>  1 file changed, 2 insertions(+), 5 deletions(-)
>
> diff --git a/chardev/char-fe.c b/chardev/char-fe.c
> index 7789f7be9c..f94efe928e 100644
> --- a/chardev/char-fe.c
> +++ b/chardev/char-fe.c
> @@ -68,13 +68,10 @@ int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
>      }
>
>      while (offset < len) {
> -    retry:
>          res = CHARDEV_GET_CLASS(s)->chr_sync_read(s, buf + offset,
>                                                    len - offset);
> -        if (res == -1 && errno == EAGAIN) {
> -            g_usleep(100);
> -            goto retry;
> -        }
> +        /* ->chr_sync_read should block */
> +        assert(!(res < 0 && errno == EAGAIN));
>
>
While I agree with the rationale to clean this code a bit, I am not so sure
about replacing it with an assert(). In the past, when we did such things
we had unexpected regressions :)

A slightly better approach perhaps is g_warn_if_fail(), although it's not
very popular in qemu.
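
To make the difference concrete, here is a minimal sketch of the "warn
but keep going" variant. It deliberately avoids a GLib dependency:
WARN_IF_FAIL is a hypothetical stand-in that only mimics the
report-and-continue behaviour of g_warn_if_fail(), and sync_read_fn,
checked_sync_read, eagain_read and warn_count are made-up names for
illustration, not anything in the chardev code.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

static int warn_count;  /* hypothetical counter, only here for illustration */

/* Hypothetical stand-in for GLib's g_warn_if_fail(): report and continue. */
#define WARN_IF_FAIL(expr)                                        \
    do {                                                          \
        if (!(expr)) {                                            \
            warn_count++;                                         \
            fprintf(stderr, "%s:%d: check '%s' failed\n",         \
                    __FILE__, __LINE__, #expr);                   \
        }                                                         \
    } while (0)

/* Hypothetical callback type modelling a blocking ->chr_sync_read. */
typedef ssize_t (*sync_read_fn)(void *opaque, unsigned char *buf, size_t len);

/* Do one read; warn on EAGAIN but hand the error back to the caller. */
static ssize_t checked_sync_read(sync_read_fn rd, void *opaque,
                                 unsigned char *buf, size_t len)
{
    ssize_t res = rd(opaque, buf, len);
    int saved_errno = errno;    /* fprintf() in the macro may clobber errno */

    /* ->chr_sync_read should block; complain instead of aborting */
    WARN_IF_FAIL(!(res < 0 && saved_errno == EAGAIN));

    errno = saved_errno;
    return res;
}

/* Test double: a "blocking" read that misbehaves and reports EAGAIN. */
static ssize_t eagain_read(void *opaque, unsigned char *buf, size_t len)
{
    (void)opaque;
    (void)buf;
    (void)len;
    errno = EAGAIN;
    return -1;
}
```

With assert() the misbehaving read would abort the process; here it
leaves a line on stderr and the caller still sees -1 with errno set to
EAGAIN.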



>          if (res == 0) {
>              break;
> --
> 2.33.1
>
>
>

-- 
Marc-André Lureau

Re: [PATCH 04/10] chardev/char-fe: don't allow EAGAIN from blocking read

Posted by Roman Kagan 4 years, 2 months ago

On Fri, Nov 12, 2021 at 12:24:06PM +0400, Marc-André Lureau wrote:
> Hi
> 
> On Thu, Nov 11, 2021 at 7:44 PM Roman Kagan <rvkagan@yandex-team.ru> wrote:
> 
> > As its name suggests, ChardevClass.chr_sync_read is supposed to do a
> > blocking read.  The only implementation of it, tcp_chr_sync_read, does
> > set the underlying io channel to the blocking mode indeed.
> >
> > Therefore a failure return with EAGAIN is not expected from this call.
> >
> > So do not retry it in qemu_chr_fe_read_all; instead place an assertion
> > that it doesn't fail with EAGAIN.
> >
> 
> The code was introduced in:
> commit 7b0bfdf52d694c9a3a96505aa42ce3f8d63acd35
> Author: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
> Date:   Tue May 27 15:03:48 2014 +0300
> 
>     Add chardev API qemu_chr_fe_read_all

Right, but at that point chr_sync_read wasn't made to block.  It
happened later in

commit bcdeb9be566ded2eb35233aaccf38742a21e5daa
Author: Marc-André Lureau <marcandre.lureau@redhat.com>
Date:   Thu Jul 6 19:03:53 2017 +0200

    chardev: block during sync read
    
    A sync read should block until all requested data is
    available (instead of retrying in qemu_chr_fe_read_all). Change the
    channel to blocking during sync_read.
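
As a rough illustration of what "change the channel to blocking during
sync_read" amounts to, here is a sketch in plain POSIX terms. It is not
the QEMU code: the real implementation goes through QEMU's io channel
layer, and set_blocking() and sync_read() below are hypothetical names
used only for this example.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: set or clear O_NONBLOCK on fd; returns 0 or -1. */
static int set_blocking(int fd, int blocking)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags < 0) {
        return -1;
    }
    flags = blocking ? (flags & ~O_NONBLOCK) : (flags | O_NONBLOCK);
    return fcntl(fd, F_SETFL, flags);
}

/* Hypothetical analogue of a sync read on a normally non-blocking fd. */
static ssize_t sync_read(int fd, void *buf, size_t len)
{
    ssize_t res;
    int saved_errno;

    if (set_blocking(fd, 1) < 0) {
        return -1;
    }
    res = read(fd, buf, len);   /* blocks until data, EOF or an error */
    saved_errno = errno;        /* the fcntl() calls below may clobber it */
    set_blocking(fd, 0);        /* back to non-blocking for async users */
    errno = saved_errno;
    return res;
}
```

With the descriptor in blocking mode for the duration of the call,
read() is not expected to fail with EAGAIN just because no data is
available yet, which is the premise of the patch.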

> > @@ -68,13 +68,10 @@ int qemu_chr_fe_read_all(CharBackend *be, uint8_t *buf, int len)
> >      }
> >
> >      while (offset < len) {
> > -    retry:
> >          res = CHARDEV_GET_CLASS(s)->chr_sync_read(s, buf + offset,
> >                                                    len - offset);
> > -        if (res == -1 && errno == EAGAIN) {
> > -            g_usleep(100);
> > -            goto retry;
> > -        }
> > +        /* ->chr_sync_read should block */
> > +        assert(!(res < 0 && errno == EAGAIN));
> >
> >
> While I agree with the rationale to clean this code a bit, I am not so sure
> about replacing it with an assert(). In the past, when we did such things
> we had unexpected regressions :)

Valid point: QEMU may run on an OS where a blocking call sporadically
returns -EAGAIN, and that would be hard to catch reliably with
testing.

> A slightly better approach perhaps is g_warn_if_fail(), although it's not
> very popular in qemu.

I think the first thing to decide is whether -EAGAIN from a blocking
call is benign enough to justify (unlimited) retries.  I'm tempted to
just remove any special handling of -EAGAIN and treat it like any other
error, leaving it up to the caller to handle (most probably by failing
the call and initiating a recovery, if possible).

Does this make sense?
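
A minimal sketch of that alternative, to show the shape of it rather
than the actual patch: read_all() has no EAGAIN special case at all, and
a caller maps failures to -errno.  Everything here is hypothetical
(sync_read_fn, read_all, fetch_message, the two test doubles), and the
choice of -EIO for a short read is an assumption made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>

/* Hypothetical stand-in for a blocking ->chr_sync_read implementation. */
typedef ssize_t (*sync_read_fn)(void *opaque, unsigned char *buf, size_t len);

/* Read up to len bytes; any failure, EAGAIN included, is returned as is. */
static int read_all(sync_read_fn rd, void *opaque, unsigned char *buf, int len)
{
    int offset = 0;

    while (offset < len) {
        ssize_t res = rd(opaque, buf + offset, (size_t)(len - offset));

        if (res == 0) {
            break;              /* end of stream: report a short read */
        }
        if (res < 0) {
            return (int)res;    /* no retry, no assert: caller decides */
        }
        offset += (int)res;
    }
    return offset;
}

/* Hypothetical caller: a short read or an error fails the request. */
static int fetch_message(sync_read_fn rd, void *opaque,
                         unsigned char *msg, int size)
{
    int r = read_all(rd, opaque, msg, size);

    if (r < 0) {
        return -errno;          /* assumes rd set errno on failure */
    }
    return r == size ? 0 : -EIO;    /* assumed mapping for a short read */
}

/* Test double: always fails with EAGAIN, as a misbehaving OS might. */
static ssize_t eagain_read(void *opaque, unsigned char *buf, size_t len)
{
    (void)opaque;
    (void)buf;
    (void)len;
    errno = EAGAIN;
    return -1;
}

/* Test double: reports end of stream immediately. */
static ssize_t eof_read(void *opaque, unsigned char *buf, size_t len)
{
    (void)opaque;
    (void)buf;
    (void)len;
    return 0;
}
```

In this shape a spurious EAGAIN costs one failed request, which the
caller can report or recover from, instead of either an unbounded retry
loop or an abort.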

Thanks,
Roman.