[PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh

Denis V. Lunev posted 5 patches 2 years, 6 months ago
Maintainers: Eric Blake <eblake@redhat.com>, Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
[PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Denis V. Lunev 2 years, 6 months ago
Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
    Author: Hanna Reitz <hreitz@redhat.com>
    Date:   Wed May 8 23:18:18 2019 +0200
    qemu-nbd: Do not close stderr
has introduced an interesting regression. Original behavior of
    ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
was the following:
 * qemu-nbd was started as a daemon
 * the command execution is done and ssh exited with success

The patch has changed this behavior and 'ssh' command now hangs forever.

According to the normal specification of the daemon() call, we should
endup with STDERR pointing to /dev/null. That should be done at the
very end of the successful startup sequence when the pipe to the
bootstrap process (used for diagnostics) is no longer needed.

This could be achived in the same way as done for 'qemu-nbd -c' case.
That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
STDERR does the trick.

This also leads to proper 'ssh' connection closing which fixes my
original problem.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
CC: Hanna Reitz <hreitz@redhat.com>
CC: <qemu-stable@nongnu.org>
---
 qemu-nbd.c | 13 ++++---------
 1 file changed, 4 insertions(+), 9 deletions(-)

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 77f98c736b..186ce9474c 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -274,6 +274,7 @@ static void *show_parts(void *arg)
 
 struct NbdClientOpts {
     char *device;
+    bool fork_process;
 };
 
 static void *nbd_client_thread(void *arg)
@@ -317,7 +318,7 @@ static void *nbd_client_thread(void *arg)
     /* update partition table */
     pthread_create(&show_parts_thread, NULL, show_parts, opts->device);
 
-    if (verbose) {
+    if (verbose && !opts->fork_process) {
         fprintf(stderr, "NBD device %s is now connected to %s\n",
                 opts->device, srcpath);
     } else {
@@ -579,7 +580,6 @@ int main(int argc, char **argv)
     bool writethrough = false; /* Client will flush as needed. */
     bool fork_process = false;
     bool list = false;
-    int old_stderr = -1;
     unsigned socket_activation;
     const char *pid_file_name = NULL;
     const char *selinux_label = NULL;
@@ -934,11 +934,6 @@ int main(int argc, char **argv)
         } else if (pid == 0) {
             close(stderr_fd[0]);
 
-            /* Remember parent's stderr if we will be restoring it. */
-            if (fork_process) {
-                old_stderr = dup(STDERR_FILENO);
-            }
-
             ret = qemu_daemon(1, 0);
 
             /* Temporarily redirect stderr to the parent's pipe...  */
@@ -1131,6 +1126,7 @@ int main(int argc, char **argv)
         int ret;
         struct NbdClientOpts opts = {
             .device = device,
+            .fork_process = fork_process,
         };
 
         ret = pthread_create(&client_thread, NULL, nbd_client_thread, &opts);
@@ -1159,8 +1155,7 @@ int main(int argc, char **argv)
     }
 
     if (fork_process) {
-        dup2(old_stderr, STDERR_FILENO);
-        close(old_stderr);
+        dup2(STDOUT_FILENO, STDERR_FILENO);
     }
 
     state = RUNNING;
-- 
2.34.1
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Kevin Wolf 2 years, 5 months ago
Am 17.07.2023 um 16:55 hat Denis V. Lunev geschrieben:
> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
>     Author: Hanna Reitz <hreitz@redhat.com>
>     Date:   Wed May 8 23:18:18 2019 +0200
>     qemu-nbd: Do not close stderr
> has introduced an interesting regression. Original behavior of
>     ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
> was the following:
>  * qemu-nbd was started as a daemon
>  * the command execution is done and ssh exited with success
> 
> The patch has changed this behavior and 'ssh' command now hangs forever.
> 
> According to the normal specification of the daemon() call, we should
> endup with STDERR pointing to /dev/null. That should be done at the
> very end of the successful startup sequence when the pipe to the
> bootstrap process (used for diagnostics) is no longer needed.
> 
> This could be achived in the same way as done for 'qemu-nbd -c' case.
> That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
> STDERR does the trick.
> 
> This also leads to proper 'ssh' connection closing which fixes my
> original problem.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> CC: Hanna Reitz <hreitz@redhat.com>
> CC: <qemu-stable@nongnu.org>

This broke qemu-iotests 233 (Eric, please make sure to run the full
qemu-iotests suite before sending block related pull requests):

--- /home/kwolf/source/qemu/tests/qemu-iotests/233.out
+++ /home/kwolf/source/qemu/build-clang/scratch/raw-file-233/233.out.bad
@@ -99,14 +99,4 @@
 qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.

 == final server log ==
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
-qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
-qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
-qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
-qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
-qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
 *** done

Do we really want to lose these error messages? This looks wrong to me.

Kevin
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Eric Blake 2 years, 5 months ago
On Mon, Aug 14, 2023 at 04:14:41PM +0200, Kevin Wolf wrote:
> Am 17.07.2023 um 16:55 hat Denis V. Lunev geschrieben:
> > Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
> >     Author: Hanna Reitz <hreitz@redhat.com>
> >     Date:   Wed May 8 23:18:18 2019 +0200
> >     qemu-nbd: Do not close stderr
> > has introduced an interesting regression. Original behavior of
> >     ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
> > was the following:
> >  * qemu-nbd was started as a daemon
> >  * the command execution is done and ssh exited with success

Thinking about this more...

The original problem is that we broke 'ssh -c "qemu-nbd --fork ..."',
because the daemonized process hung on to the parent's stderr
indefinitely.

But when we pass -v, we WANT the parent's stderr to hang around, even
while we still want the parent process to see EOF on the handshake
socket used to let it know the child process got far enough along in
its initialization.

Should we be passing 'opt->verbose' instead of '0' to the second
parameter of qemu_daemon, to tell the child process the scenarios
where we want output to still be present?  If so, how does the
following patch look?

diff --git i/qemu-nbd.c w/qemu-nbd.c
index aaccaa33184..c316a91831d 100644
--- i/qemu-nbd.c
+++ w/qemu-nbd.c
@@ -944,9 +944,24 @@ int main(int argc, char **argv)

             close(stderr_fd[0]);

-            ret = qemu_daemon(1, 0);
+            ret = qemu_daemon(1, verbose);
             saved_errno = errno;    /* dup2 will overwrite error below */

+            if (verbose) {
+                /* We want stdin at /dev/null when qemu_daemon didn't do it */
+                stdin = freopen ("/dev/null", "r", stdin);
+                if (stdin == NULL) {
+                    error_report("Failed to redirect stdin: %s",
+                                 strerror(errno));
+                    exit(EXIT_FAILURE);
+                }
+                /* To keep the parent's stderr alive, copy it to stdout */
+                if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
+                    error_report("Failed to redirect stdout: %s",
+                                 strerror(errno));
+                    exit(EXIT_FAILURE);
+                }
+            }
             /* Temporarily redirect stderr to the parent's pipe...  */
             if (dup2(stderr_fd[1], STDERR_FILENO) < 0) {
                 char str[256];
@@ -1180,6 +1195,10 @@ int main(int argc, char **argv)
     }

     if (fork_process) {
+        /*
+         * See above. If verbose is false, stdout is /dev/null (thanks
+         * to qemu_daemon); otherwise, stdout is the parent's stderr.
+         */
         if (dup2(STDOUT_FILENO, STDERR_FILENO) < 0) {
             error_report("Could not set stderr to /dev/null: %s",
                          strerror(errno));


Note, however, that this still does not pass test 233 as written - the
error messages show up earlier in the run, rather than disappearing
altogether.

> > 
> > The patch has changed this behavior and 'ssh' command now hangs forever.
> > 
> > According to the normal specification of the daemon() call, we should
> > endup with STDERR pointing to /dev/null. That should be done at the
> > very end of the successful startup sequence when the pipe to the
> > bootstrap process (used for diagnostics) is no longer needed.
> > 
> > This could be achived in the same way as done for 'qemu-nbd -c' case.
> > That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
> > STDERR does the trick.
> > 
> > This also leads to proper 'ssh' connection closing which fixes my
> > original problem.
> > 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Eric Blake <eblake@redhat.com>
> > CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> > CC: Hanna Reitz <hreitz@redhat.com>
> > CC: <qemu-stable@nongnu.org>
> 
> This broke qemu-iotests 233 (Eric, please make sure to run the full
> qemu-iotests suite before sending block related pull requests):

My apologies; I keep forgetting that './check -nbd' does not catch all
the possible tests using NBD.  I've updated my checklists to make sure
I'm running a more thorough set of iotests before preparing a pull
request.

> 
> --- /home/kwolf/source/qemu/tests/qemu-iotests/233.out
> +++ /home/kwolf/source/qemu/build-clang/scratch/raw-file-233/233.out.bad
> @@ -99,14 +99,4 @@
>  qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.
> 
>  == final server log ==
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
> -qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
> -qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
> -qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
> -qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
>  *** done
> 
> Do we really want to lose these error messages? This looks wrong to me.
> 
> Kevin
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Denis V. Lunev 2 years, 5 months ago
On 8/14/23 16:14, Kevin Wolf wrote:
> Am 17.07.2023 um 16:55 hat Denis V. Lunev geschrieben:
>> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
>>      Author: Hanna Reitz <hreitz@redhat.com>
>>      Date:   Wed May 8 23:18:18 2019 +0200
>>      qemu-nbd: Do not close stderr
>> has introduced an interesting regression. Original behavior of
>>      ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
>> was the following:
>>   * qemu-nbd was started as a daemon
>>   * the command execution is done and ssh exited with success
>>
>> The patch has changed this behavior and 'ssh' command now hangs forever.
>>
>> According to the normal specification of the daemon() call, we should
>> endup with STDERR pointing to /dev/null. That should be done at the
>> very end of the successful startup sequence when the pipe to the
>> bootstrap process (used for diagnostics) is no longer needed.
>>
>> This could be achived in the same way as done for 'qemu-nbd -c' case.
>> That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
>> STDERR does the trick.
>>
>> This also leads to proper 'ssh' connection closing which fixes my
>> original problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> CC: Hanna Reitz <hreitz@redhat.com>
>> CC: <qemu-stable@nongnu.org>
> This broke qemu-iotests 233 (Eric, please make sure to run the full
> qemu-iotests suite before sending block related pull requests):
>
> --- /home/kwolf/source/qemu/tests/qemu-iotests/233.out
> +++ /home/kwolf/source/qemu/build-clang/scratch/raw-file-233/233.out.bad
> @@ -99,14 +99,4 @@
>   qemu-nbd: TLS handshake failed: The TLS connection was non-properly terminated.
>
>   == final server log ==
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
> -qemu-nbd: option negotiation failed: Verify failed: No certificate was found.
> -qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
> -qemu-nbd: option negotiation failed: TLS x509 authz check for DISTINGUISHED-NAME is denied
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: Failed to read opts magic: Cannot read from TLS channel: Software caused connection abort
> -qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
> -qemu-nbd: option negotiation failed: TLS handshake failed: An illegal parameter has been received.
>   *** done
>
> Do we really want to lose these error messages? This looks wrong to me.
>
> Kevin
>
In that case likely we need to extend -v option behavior
and translate these messages in that case.

I'll take a look.

Thank you for the patience,
     Den

Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Denis V. Lunev 2 years, 5 months ago
On 8/15/23 12:40, Denis V. Lunev wrote:
> On 8/14/23 16:14, Kevin Wolf wrote:
>> Am 17.07.2023 um 16:55 hat Denis V. Lunev geschrieben:
>>> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
>>>      Author: Hanna Reitz <hreitz@redhat.com>
>>>      Date:   Wed May 8 23:18:18 2019 +0200
>>>      qemu-nbd: Do not close stderr
>>> has introduced an interesting regression. Original behavior of
>>>      ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
>>> was the following:
>>>   * qemu-nbd was started as a daemon
>>>   * the command execution is done and ssh exited with success
>>>
>>> The patch has changed this behavior and 'ssh' command now hangs 
>>> forever.
>>>
>>> According to the normal specification of the daemon() call, we should
>>> endup with STDERR pointing to /dev/null. That should be done at the
>>> very end of the successful startup sequence when the pipe to the
>>> bootstrap process (used for diagnostics) is no longer needed.
>>>
>>> This could be achived in the same way as done for 'qemu-nbd -c' case.
>>> That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
>>> STDERR does the trick.
>>>
>>> This also leads to proper 'ssh' connection closing which fixes my
>>> original problem.
>>>
>>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>>> CC: Eric Blake <eblake@redhat.com>
>>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>>> CC: Hanna Reitz <hreitz@redhat.com>
>>> CC: <qemu-stable@nongnu.org>
>> This broke qemu-iotests 233 (Eric, please make sure to run the full
>> qemu-iotests suite before sending block related pull requests):
>>
>> --- /home/kwolf/source/qemu/tests/qemu-iotests/233.out
>> +++ /home/kwolf/source/qemu/build-clang/scratch/raw-file-233/233.out.bad
>> @@ -99,14 +99,4 @@
>>   qemu-nbd: TLS handshake failed: The TLS connection was non-properly 
>> terminated.
>>
>>   == final server log ==
>> -qemu-nbd: option negotiation failed: Failed to read opts magic: 
>> Cannot read from TLS channel: Software caused connection abort
>> -qemu-nbd: option negotiation failed: Failed to read opts magic: 
>> Cannot read from TLS channel: Software caused connection abort
>> -qemu-nbd: option negotiation failed: Verify failed: No certificate 
>> was found.
>> -qemu-nbd: option negotiation failed: Verify failed: No certificate 
>> was found.
>> -qemu-nbd: option negotiation failed: TLS x509 authz check for 
>> DISTINGUISHED-NAME is denied
>> -qemu-nbd: option negotiation failed: TLS x509 authz check for 
>> DISTINGUISHED-NAME is denied
>> -qemu-nbd: option negotiation failed: Failed to read opts magic: 
>> Cannot read from TLS channel: Software caused connection abort
>> -qemu-nbd: option negotiation failed: Failed to read opts magic: 
>> Cannot read from TLS channel: Software caused connection abort
>> -qemu-nbd: option negotiation failed: TLS handshake failed: An 
>> illegal parameter has been received.
>> -qemu-nbd: option negotiation failed: TLS handshake failed: An 
>> illegal parameter has been received.
>>   *** done
>>
>> Do we really want to lose these error messages? This looks wrong to me.
>>
>> Kevin
>>
> In that case likely we need to extend -v option behavior
> and translate these messages in that case.
>
> I'll take a look.
>
> Thank you for the patience,
>     Den

Hi!

Small side note.

I am 100% sure that I have run this set of tests and
there was no fault. I have re-run them and once
again has not get the fault :-)

The reason for that is quite interesting:
* the test does not start due to the absence of the
   'certool' utility from gnutls

This brings the very important question.

Should we *FAIL* when important utility is missed
or skip? I believe that our goal is to fail to avoid such
cases.

What do you think?

Den

Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Kevin Wolf 2 years, 5 months ago
Am 15.08.2023 um 14:17 hat Denis V. Lunev geschrieben:
> Hi!
> 
> Small side note.
> 
> I am 100% sure that I have run this set of tests and
> there was no fault. I have re-run them and once
> again has not get the fault :-)
> 
> The reason for that is quite interesting:
> * the test does not start due to the absence of the
>   'certool' utility from gnutls
> 
> This brings the very important question.
> 
> Should we *FAIL* when important utility is missed
> or skip? I believe that our goal is to fail to avoid such
> cases.
> 
> What do you think?

In general I think it makes sense that FAIL means that the test could
run as expected, but we got an unexpected result (i.e. this is likely a
QEMU bug), and SKIP means that the test couldn't meaningfully be
performed on the host system.

Making more things hard dependencies for the test would mean that it's
harder to miss an instance like this, but it would also make it harder
to run the test suite on a system that doesn't have the dependencies
available (and you might not even have root privileges to install them).

I think I'd leave things as they are now, but recommend that you
occasionally check the tests reported as "not run" to see if you could
easily provide the thing they would need.

Kevin
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Peter Maydell 2 years, 5 months ago
On Tue, 15 Aug 2023 at 15:59, Kevin Wolf <kwolf@redhat.com> wrote:
> In general I think it makes sense that FAIL means that the test could
> run as expected, but we got an unexpected result (i.e. this is likely a
> QEMU bug), and SKIP means that the test couldn't meaningfully be
> performed on the host system.
>
> Making more things hard dependencies for the test would mean that it's
> harder to miss an instance like this, but it would also make it harder
> to run the test suite on a system that doesn't have the dependencies
> available (and you might not even have root privileges to install them).
>
> I think I'd leave things as they are now, but recommend that you
> occasionally check the tests reported as "not run" to see if you could
> easily provide the thing they would need.

In a utopian world we might have a "make query-test-dependencies"
or something that would list all the tools/dependencies/etc that
are missing and which tests this will cause to be skipped...

-- PMM
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Eric Blake 2 years, 6 months ago
On Mon, Jul 17, 2023 at 04:55:41PM +0200, Denis V. Lunev wrote:
> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
>     Author: Hanna Reitz <hreitz@redhat.com>
>     Date:   Wed May 8 23:18:18 2019 +0200
>     qemu-nbd: Do not close stderr
> has introduced an interesting regression. Original behavior of
>     ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
> was the following:
>  * qemu-nbd was started as a daemon
>  * the command execution is done and ssh exited with success
> 
> The patch has changed this behavior and 'ssh' command now hangs forever.
> 
> According to the normal specification of the daemon() call, we should
> endup with STDERR pointing to /dev/null. That should be done at the
> very end of the successful startup sequence when the pipe to the
> bootstrap process (used for diagnostics) is no longer needed.
> 
> This could be achived in the same way as done for 'qemu-nbd -c' case.
> That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
> STDERR does the trick.
> 
> This also leads to proper 'ssh' connection closing which fixes my
> original problem.
> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Eric Blake <eblake@redhat.com>
> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
> CC: Hanna Reitz <hreitz@redhat.com>
> CC: <qemu-stable@nongnu.org>
> ---
>  qemu-nbd.c | 13 ++++---------
>  1 file changed, 4 insertions(+), 9 deletions(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 77f98c736b..186ce9474c 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -274,6 +274,7 @@ static void *show_parts(void *arg)
>  
>  struct NbdClientOpts {
>      char *device;
> +    bool fork_process;
>  };
>  
>  static void *nbd_client_thread(void *arg)
> @@ -317,7 +318,7 @@ static void *nbd_client_thread(void *arg)
>      /* update partition table */
>      pthread_create(&show_parts_thread, NULL, show_parts, opts->device);
>  
> -    if (verbose) {
> +    if (verbose && !opts->fork_process) {

It seems a bit odd to use the global 'fork' but the local
'opts->fork_process' in the same conditional.  Perhaps patch 1/5
should be modified to also pass verbose through opts?

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org
Re: [PATCH 2/5] qemu-nbd: fix regression with qemu-nbd --fork run over ssh
Posted by Denis V. Lunev 2 years, 6 months ago
On 7/17/23 21:04, Eric Blake wrote:
> On Mon, Jul 17, 2023 at 04:55:41PM +0200, Denis V. Lunev wrote:
>> Commit e6df58a5578fee7a50bbf36f4a50a2781cff855d
>>      Author: Hanna Reitz <hreitz@redhat.com>
>>      Date:   Wed May 8 23:18:18 2019 +0200
>>      qemu-nbd: Do not close stderr
>> has introduced an interesting regression. Original behavior of
>>      ssh somehost qemu-nbd /home/den/tmp/file -f raw --fork
>> was the following:
>>   * qemu-nbd was started as a daemon
>>   * the command execution is done and ssh exited with success
>>
>> The patch has changed this behavior and 'ssh' command now hangs forever.
>>
>> According to the normal specification of the daemon() call, we should
>> endup with STDERR pointing to /dev/null. That should be done at the
>> very end of the successful startup sequence when the pipe to the
>> bootstrap process (used for diagnostics) is no longer needed.
>>
>> This could be achived in the same way as done for 'qemu-nbd -c' case.
>> That was commit 0eaf453e, also fixing up e6df58a5. STDOUT copying to
>> STDERR does the trick.
>>
>> This also leads to proper 'ssh' connection closing which fixes my
>> original problem.
>>
>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Eric Blake <eblake@redhat.com>
>> CC: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
>> CC: Hanna Reitz <hreitz@redhat.com>
>> CC: <qemu-stable@nongnu.org>
>> ---
>>   qemu-nbd.c | 13 ++++---------
>>   1 file changed, 4 insertions(+), 9 deletions(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index 77f98c736b..186ce9474c 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -274,6 +274,7 @@ static void *show_parts(void *arg)
>>   
>>   struct NbdClientOpts {
>>       char *device;
>> +    bool fork_process;
>>   };
>>   
>>   static void *nbd_client_thread(void *arg)
>> @@ -317,7 +318,7 @@ static void *nbd_client_thread(void *arg)
>>       /* update partition table */
>>       pthread_create(&show_parts_thread, NULL, show_parts, opts->device);
>>   
>> -    if (verbose) {
>> +    if (verbose && !opts->fork_process) {
> It seems a bit odd to use the global 'fork' but the local
> 'opts->fork_process' in the same conditional.  Perhaps patch 1/5
> should be modified to also pass verbose through opts?
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
>
sent to the thread as [PATCH 6/5] for convenience

Den