[PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected

Thomas Huth posted 1 patch 2 years, 5 months ago
Patches applied successfully (tree, apply log)
git fetch https://github.com/patchew-project/qemu tags/patchew/20230816210743.1319018-1-thuth@redhat.com
Maintainers: "Marc-André Lureau" <marcandre.lureau@redhat.com>, Paolo Bonzini <pbonzini@redhat.com>
chardev/char-pty.c | 22 +++++++++++++++++++---
1 file changed, 19 insertions(+), 3 deletions(-)
[PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Thomas Huth 2 years, 5 months ago
When starting a guest via libvirt with "virsh start --console ...",
the first second of the console output is missing. This is especially
annoying on s390x that only has a text console by default and no graphical
output - if the bios fails to boot here, the information about what went
wrong is completely lost.

One part of the problem (there is also some things to be done on the
libvirt side) is that QEMU only checks with a 1 second timer whether
the other side of the pty is already connected, so the first second of
the console output is always lost.

This likely used to work better in the past, since the code once checked
for a re-connection during write, but this has been removed in commit
f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
some locking.

To ease the situation here at least a little bit, let's check with g_poll()
whether we could send out the data anyway, even if the connection has not
been marked as "connected" yet. The file descriptor is marked as non-blocking
anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
not cause any trouble if the other side is not ready for receiving yet.

With this patch applied, I can now successfully see the bios output of
a s390x guest when running it with "virsh start --console" (with a patched
version of virsh that fixes the remaining issues there, too).

Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
---
 chardev/char-pty.c | 22 +++++++++++++++++++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/chardev/char-pty.c b/chardev/char-pty.c
index 4e5deac18a..fad12dfef3 100644
--- a/chardev/char-pty.c
+++ b/chardev/char-pty.c
@@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
 static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
 {
     PtyChardev *s = PTY_CHARDEV(chr);
+    GPollFD pfd;
+    int rc;
 
-    if (!s->connected) {
-        return len;
+    if (s->connected) {
+        return io_channel_send(s->ioc, buf, len);
     }
-    return io_channel_send(s->ioc, buf, len);
+
+    /*
+     * The other side might already be re-connected, but the timer might
+     * not have fired yet. So let's check here whether we can write again:
+     */
+    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
+    pfd.events = G_IO_OUT;
+    pfd.revents = 0;
+    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
+    g_assert(rc >= 0);
+    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
+        io_channel_send(s->ioc, buf, len);
+    }
+
+    return len;
 }
 
 static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)
-- 
2.39.3
Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Philippe Mathieu-Daudé 11 months ago
Hi Thomas,

(patch merged as commit 4f7689f0817)

On 16/8/23 23:07, Thomas Huth wrote:
> When starting a guest via libvirt with "virsh start --console ...",
> the first second of the console output is missing. This is especially
> annoying on s390x that only has a text console by default and no graphical
> output - if the bios fails to boot here, the information about what went
> wrong is completely lost.
> 
> One part of the problem (there is also some things to be done on the
> libvirt side) is that QEMU only checks with a 1 second timer whether
> the other side of the pty is already connected, so the first second of
> the console output is always lost.
> 
> This likely used to work better in the past, since the code once checked
> for a re-connection during write, but this has been removed in commit
> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
> some locking.
> 
> To ease the situation here at least a little bit, let's check with g_poll()
> whether we could send out the data anyway, even if the connection has not
> been marked as "connected" yet. The file descriptor is marked as non-blocking
> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
> not cause any trouble if the other side is not ready for receiving yet.
> 
> With this patch applied, I can now successfully see the bios output of
> a s390x guest when running it with "virsh start --console" (with a patched
> version of virsh that fixes the remaining issues there, too).
> 
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>   chardev/char-pty.c | 22 +++++++++++++++++++---
>   1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/chardev/char-pty.c b/chardev/char-pty.c
> index 4e5deac18a..fad12dfef3 100644
> --- a/chardev/char-pty.c
> +++ b/chardev/char-pty.c
> @@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
>   static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
>   {
>       PtyChardev *s = PTY_CHARDEV(chr);
> +    GPollFD pfd;
> +    int rc;
>   
> -    if (!s->connected) {
> -        return len;
> +    if (s->connected) {
> +        return io_channel_send(s->ioc, buf, len);
>       }
> -    return io_channel_send(s->ioc, buf, len);
> +
> +    /*
> +     * The other side might already be re-connected, but the timer might
> +     * not have fired yet. So let's check here whether we can write again:
> +     */
> +    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
> +    pfd.events = G_IO_OUT;
> +    pfd.revents = 0;
> +    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
> +    g_assert(rc >= 0);
> +    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
> +        io_channel_send(s->ioc, buf, len);

Could io_channel_send() return -1 in this case, and if so is it OK to 
ignore it?

> +    }
> +
> +    return len;
>   }
>   
>   static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)
Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Thomas Huth 11 months ago
On 12/03/2025 13.18, Philippe Mathieu-Daudé wrote:
> Hi Thomas,
> 
> (patch merged as commit 4f7689f0817)
> 
> On 16/8/23 23:07, Thomas Huth wrote:
>> When starting a guest via libvirt with "virsh start --console ...",
>> the first second of the console output is missing. This is especially
>> annoying on s390x that only has a text console by default and no graphical
>> output - if the bios fails to boot here, the information about what went
>> wrong is completely lost.
>>
>> One part of the problem (there is also some things to be done on the
>> libvirt side) is that QEMU only checks with a 1 second timer whether
>> the other side of the pty is already connected, so the first second of
>> the console output is always lost.
>>
>> This likely used to work better in the past, since the code once checked
>> for a re-connection during write, but this has been removed in commit
>> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
>> some locking.
>>
>> To ease the situation here at least a little bit, let's check with g_poll()
>> whether we could send out the data anyway, even if the connection has not
>> been marked as "connected" yet. The file descriptor is marked as non-blocking
>> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
>> not cause any trouble if the other side is not ready for receiving yet.
>>
>> With this patch applied, I can now successfully see the bios output of
>> a s390x guest when running it with "virsh start --console" (with a patched
>> version of virsh that fixes the remaining issues there, too).
>>
>> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>> ---
>>   chardev/char-pty.c | 22 +++++++++++++++++++---
>>   1 file changed, 19 insertions(+), 3 deletions(-)
>>
>> diff --git a/chardev/char-pty.c b/chardev/char-pty.c
>> index 4e5deac18a..fad12dfef3 100644
>> --- a/chardev/char-pty.c
>> +++ b/chardev/char-pty.c
>> @@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
>>   static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
>>   {
>>       PtyChardev *s = PTY_CHARDEV(chr);
>> +    GPollFD pfd;
>> +    int rc;
>> -    if (!s->connected) {
>> -        return len;

Old code discarded the bytes here if it was not able to send them ...

>> +    if (s->connected) {
>> +        return io_channel_send(s->ioc, buf, len);
>>       }
>> -    return io_channel_send(s->ioc, buf, len);
>> +
>> +    /*
>> +     * The other side might already be re-connected, but the timer might
>> +     * not have fired yet. So let's check here whether we can write again:
>> +     */
>> +    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
>> +    pfd.events = G_IO_OUT;
>> +    pfd.revents = 0;
>> +    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
>> +    g_assert(rc >= 0);
>> +    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
>> +        io_channel_send(s->ioc, buf, len);
> 
> Could io_channel_send() return -1 in this case, and if so is it OK to ignore 
> it?

... so now you get the same behavior here in case they could still not be sent.

Why do you ask, did you run into any problems with this new code here?

  Thomas


Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Philippe Mathieu-Daudé 11 months ago
On 12/3/25 13:29, Thomas Huth wrote:
> On 12/03/2025 13.18, Philippe Mathieu-Daudé wrote:
>> Hi Thomas,
>>
>> (patch merged as commit 4f7689f0817)
>>
>> On 16/8/23 23:07, Thomas Huth wrote:
>>> When starting a guest via libvirt with "virsh start --console ...",
>>> the first second of the console output is missing. This is especially
>>> annoying on s390x that only has a text console by default and no 
>>> graphical
>>> output - if the bios fails to boot here, the information about what went
>>> wrong is completely lost.
>>>
>>> One part of the problem (there is also some things to be done on the
>>> libvirt side) is that QEMU only checks with a 1 second timer whether
>>> the other side of the pty is already connected, so the first second of
>>> the console output is always lost.
>>>
>>> This likely used to work better in the past, since the code once checked
>>> for a re-connection during write, but this has been removed in commit
>>> f8278c7d74 ("char-pty: remove the check for connection on write") to 
>>> avoid
>>> some locking.
>>>
>>> To ease the situation here at least a little bit, let's check with 
>>> g_poll()
>>> whether we could send out the data anyway, even if the connection has 
>>> not
>>> been marked as "connected" yet. The file descriptor is marked as non- 
>>> blocking
>>> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this 
>>> should
>>> not cause any trouble if the other side is not ready for receiving yet.
>>>
>>> With this patch applied, I can now successfully see the bios output of
>>> a s390x guest when running it with "virsh start --console" (with a 
>>> patched
>>> version of virsh that fixes the remaining issues there, too).
>>>
>>> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
>>> Signed-off-by: Thomas Huth <thuth@redhat.com>
>>> ---
>>>   chardev/char-pty.c | 22 +++++++++++++++++++---
>>>   1 file changed, 19 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/chardev/char-pty.c b/chardev/char-pty.c
>>> index 4e5deac18a..fad12dfef3 100644
>>> --- a/chardev/char-pty.c
>>> +++ b/chardev/char-pty.c
>>> @@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev 
>>> *chr)
>>>   static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int 
>>> len)
>>>   {
>>>       PtyChardev *s = PTY_CHARDEV(chr);
>>> +    GPollFD pfd;
>>> +    int rc;
>>> -    if (!s->connected) {
>>> -        return len;
> 
> Old code discarded the bytes here if it was not able to send them ...
> 
>>> +    if (s->connected) {
>>> +        return io_channel_send(s->ioc, buf, len);
>>>       }
>>> -    return io_channel_send(s->ioc, buf, len);
>>> +
>>> +    /*
>>> +     * The other side might already be re-connected, but the timer 
>>> might
>>> +     * not have fired yet. So let's check here whether we can write 
>>> again:
>>> +     */
>>> +    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
>>> +    pfd.events = G_IO_OUT;
>>> +    pfd.revents = 0;
>>> +    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
>>> +    g_assert(rc >= 0);
>>> +    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {
>>> +        io_channel_send(s->ioc, buf, len);
>>
>> Could io_channel_send() return -1 in this case, and if so is it OK to 
>> ignore it?
> 
> ... so now you get the same behavior here in case they could still not 
> be sent.
> 
> Why do you ask, did you run into any problems with this new code here?

Peter found in chardev while testing a recent pl011 device patchset:
https://lore.kernel.org/qemu-devel/CAFEAcA_kEndvNtw4EHySXWwQPoGs029yAzZGGBcV=zGHaj7KUQ@mail.gmail.com/

For now I'm auditing our uses to understand better this subsystem,
and am looking at the returned values, and noticed this case.

Thanks,

Phil.

Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Daniel P. Berrangé 2 years, 4 months ago
On Wed, Aug 16, 2023 at 11:07:43PM +0200, Thomas Huth wrote:
> When starting a guest via libvirt with "virsh start --console ...",
> the first second of the console output is missing. This is especially
> annoying on s390x that only has a text console by default and no graphical
> output - if the bios fails to boot here, the information about what went
> wrong is completely lost.
> 
> One part of the problem (there is also some things to be done on the
> libvirt side) is that QEMU only checks with a 1 second timer whether
> the other side of the pty is already connected, so the first second of
> the console output is always lost.
> 
> This likely used to work better in the past, since the code once checked
> for a re-connection during write, but this has been removed in commit
> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
> some locking.
> 
> To ease the situation here at least a little bit, let's check with g_poll()
> whether we could send out the data anyway, even if the connection has not
> been marked as "connected" yet. The file descriptor is marked as non-blocking
> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
> not cause any trouble if the other side is not ready for receiving yet.
> 
> With this patch applied, I can now successfully see the bios output of
> a s390x guest when running it with "virsh start --console" (with a patched
> version of virsh that fixes the remaining issues there, too).
> 
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  chardev/char-pty.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)

Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>


With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|


Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Amaury Pouly 2 years, 5 months ago
>> When starting a guest via libvirt with "virsh start --console ...",
>> the first second of the console output is missing. This is especially
>> annoying on s390x that only has a text console by default and no graphical
>> output - if the bios fails to boot here, the information about what went
>> wrong is completely lost.

Hi, we recently ran into this problem in a different scenario where we
manually invoke QEMU and connect to the PTY before starting the machine.
With the existing code, we have to add a one second delay before starting just
to make sure that we capture the early boot message. We are now running a 
patched version of QEMU with pretty much the same change as yours and it seems
to work well. We are very interested in seeing this change, or a variant
thereof merged.
Re: [PATCH] chardev/char-pty: Avoid losing bytes when the other side just (re-)connected
Posted by Daniel P. Berrangé 2 years, 5 months ago
On Wed, Aug 16, 2023 at 11:07:43PM +0200, Thomas Huth wrote:
> When starting a guest via libvirt with "virsh start --console ...",
> the first second of the console output is missing. This is especially
> annoying on s390x that only has a text console by default and no graphical
> output - if the bios fails to boot here, the information about what went
> wrong is completely lost.
> 
> One part of the problem (there is also some things to be done on the
> libvirt side) is that QEMU only checks with a 1 second timer whether
> the other side of the pty is already connected, so the first second of
> the console output is always lost.
> 
> This likely used to work better in the past, since the code once checked
> for a re-connection during write, but this has been removed in commit
> f8278c7d74 ("char-pty: remove the check for connection on write") to avoid
> some locking.
> 
> To ease the situation here at least a little bit, let's check with g_poll()
> whether we could send out the data anyway, even if the connection has not
> been marked as "connected" yet. The file descriptor is marked as non-blocking
> anyway since commit fac6688a18 ("Do not hang on full PTY"), so this should
> not cause any trouble if the other side is not ready for receiving yet.
> 
> With this patch applied, I can now successfully see the bios output of
> a s390x guest when running it with "virsh start --console" (with a patched
> version of virsh that fixes the remaining issues there, too).
> 
> Reported-by: Marc Hartmayer <mhartmay@linux.ibm.com>
> Signed-off-by: Thomas Huth <thuth@redhat.com>
> ---
>  chardev/char-pty.c | 22 +++++++++++++++++++---
>  1 file changed, 19 insertions(+), 3 deletions(-)
> 
> diff --git a/chardev/char-pty.c b/chardev/char-pty.c
> index 4e5deac18a..fad12dfef3 100644
> --- a/chardev/char-pty.c
> +++ b/chardev/char-pty.c
> @@ -106,11 +106,27 @@ static void pty_chr_update_read_handler(Chardev *chr)
>  static int char_pty_chr_write(Chardev *chr, const uint8_t *buf, int len)
>  {
>      PtyChardev *s = PTY_CHARDEV(chr);
> +    GPollFD pfd;
> +    int rc;
>  
> -    if (!s->connected) {
> -        return len;
> +    if (s->connected) {
> +        return io_channel_send(s->ioc, buf, len);
>      }
> -    return io_channel_send(s->ioc, buf, len);
> +
> +    /*
> +     * The other side might already be re-connected, but the timer might
> +     * not have fired yet. So let's check here whether we can write again:
> +     */
> +    pfd.fd = QIO_CHANNEL_FILE(s->ioc)->fd;
> +    pfd.events = G_IO_OUT;
> +    pfd.revents = 0;
> +    rc = RETRY_ON_EINTR(g_poll(&pfd, 1, 0));
> +    g_assert(rc >= 0);
> +    if (!(pfd.revents & G_IO_HUP) && (pfd.revents & G_IO_OUT)) {

Should (can?) we call

   pty_chr_state(chr, 1);

here ?


> +        io_channel_send(s->ioc, buf, len);

As it feels a little dirty to be sending data before setting the
'connected == 1' and thus issuing the 'CHR_EVENT_OPENED' event 

> +    }
> +
> +    return len;
>  }
>  
>  static GSource *pty_chr_add_watch(Chardev *chr, GIOCondition cond)

With regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|