[PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected

Thomas Huth posted 1 patch 9 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230816210743.1319018-1-thuth@redhat.com
Maintainers: "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
chardev/char-pty.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
[PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Thomas Huth 9 months ago
When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.

One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.

This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.

To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking
anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.

With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 chardev/char-pty.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index 4e5deac18a..fad12dfef3 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
 static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
 {
     PtyChardev *s = PTY_CHARDEV(chr);
+    GPollFD pfd;
+    int rc;
 
-    if (!s->connected) {
-        return len;
+    if (s->connected) {
+        return io_channel_send(s->ioc, buf, len);
     }
-    return io_channel_send(s->ioc, buf, len);
+
+    /*
+     * The other side might already be re-connected, but the timer might
+     * not have fired yet. So let's check here whether we can write again:
+     */
+    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
+    pfd.events = G_IO_OUT;
+    pfd.revents = 0;
+    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
+    g_assert(rc >= 0);
+    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
+        io_channel_send(s->ioc, buf, len);
+    }
+
+    return len;
 }
 
 static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)
-- 
2.39.3
Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Daniel P. Berrangé 7 months, 3 weeks ago
On Wed, Aug 16, 2023 at 11:07:43PM +0200, Thomas Huth wrote:
> When starting a guest via libvirt with "virsh start --console ...",
> the first second of the console output is missing. This is especially
> annoying on s390x that only has a text console by default and no graphical
> output - if the bios fails to boot here, the information about what went
> wrong is completely lost.
> 
> One part of the problem (there is also some things to be done on the
> libvirt side) is that QEMU only checks with a 1 second timer whether
> the other side of the pty is already connected, so the first second of
> the console output is always lost.
> 
> This likely used to work better in the past, since the code once checked
> for a re-connection during write, but this has been removed in commit
> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
> some locking.
> 
> To ease the situation here at least a little bit, let's check with g_poll()
> whether we could send out the data anyway, even if the connection has not
> been marked as "connected" yet. The file descriptor is marked as non-blocking
> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
> not cause any trouble if the other side is not ready for receiving yet.
> 
> With this patch applied, I can now successfully see the bios output of
> a s390x guest when running it with "virsh start --console" (with a patched
> version of virsh that fixes the remaining issues there, too).
> 
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  chardev/char-pty.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Amaury Pouly 8 months, 4 weeks ago
>> When starting a guest via libvirt with "virsh start --console ...",
>> the first second of the console output is missing. This is especially
>> annoying on s390x that only has a text console by default and no graphical
>> output - if the bios fails to boot here, the information about what went
>> wrong is completely lost.

Hi, we recently ran into this problem in a different scenario where we
manually invoke QEMU and connect to the PTY before starting the machine.
With the existing code, we have to add a one second delay before starting just
to make sure that we capture the early boot message. We are now running a 
patched version of QEMU with pretty much the same change as yours and it seems
to work well. We are very interested in seeing this change, or a variant
thereof merged.
Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Daniel P. Berrangé 9 months ago
On Wed, Aug 16, 2023 at 11:07:43PM +0200, Thomas Huth wrote:
> When starting a guest via libvirt with "virsh start --console ...",
> the first second of the console output is missing. This is especially
> annoying on s390x that only has a text console by default and no graphical
> output - if the bios fails to boot here, the information about what went
> wrong is completely lost.
> 
> One part of the problem (there is also some things to be done on the
> libvirt side) is that QEMU only checks with a 1 second timer whether
> the other side of the pty is already connected, so the first second of
> the console output is always lost.
> 
> This likely used to work better in the past, since the code once checked
> for a re-connection during write, but this has been removed in commit
> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
> some locking.
> 
> To ease the situation here at least a little bit, let's check with g_poll()
> whether we could send out the data anyway, even if the connection has not
> been marked as "connected" yet. The file descriptor is marked as non-blocking
> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
> not cause any trouble if the other side is not ready for receiving yet.
> 
> With this patch applied, I can now successfully see the bios output of
> a s390x guest when running it with "virsh start --console" (with a patched
> version of virsh that fixes the remaining issues there, too).
> 
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  chardev/char-pty.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/chardev/char-pty.c b/chardev/char-pty.c
> index 4e5deac18a..fad12dfef3 100644
> --- a/chardev/char-pty.c
> +++ b/chardev/char-pty.c
> @@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
>  static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
>  {
>      PtyChardev *s = PTY_CHARDEV(chr);
> +    GPollFD pfd;
> +    int rc;
>  
> -    if (!s->connected) {
> -        return len;
> +    if (s->connected) {
> +        return io_channel_send(s->ioc, buf, len);
>      }
> -    return io_channel_send(s->ioc, buf, len);
> +
> +    /*
> +     * The other side might already be re-connected, but the timer might
> +     * not have fired yet. So let's check here whether we can write again:
> +     */
> +    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
> +    pfd.events = G_IO_OUT;
> +    pfd.revents = 0;
> +    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
> +    g_assert(rc >= 0);
> +    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {

Should (can?) we call

   pty_chr_state(chr, 1);

here ?


> +        io_channel_send(s->ioc, buf, len);

As it feels a little dirty to be sending data before setting the
'connected == 1' and thus issuing the 'CHR_EVENT_OPENED' event 

> +    }
> +
> +    return len;
>  }
>  
>  static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|