[Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again

Peter Xu posted 1 patch 5 years, 11 months ago
Failed in applying to current master (apply log)
monitor.c        | 13 +++----------
tests/qmp-test.c |  2 +-
vl.c             |  9 ++++-----
3 files changed, 8 insertions(+), 16 deletions(-)
[Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
try to turn that on again.

Signed-off-by: Peter Xu <peterx@redhat.com>
--
Now OOB should be okay with all known tests (except iotest qcow2, since
it is still broken on master), and AFAIK now we should also be okay with
ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
the release).  So I think it's now safe to turn OOB on again.  Please
feel free to test this against any of existing testsuites to see whether
it'll still break any stuff.  Thanks,

Signed-off-by: Peter Xu <peterx@redhat.com>
---
 monitor.c        | 13 +++----------
 tests/qmp-test.c |  2 +-
 vl.c             |  9 ++++-----
 3 files changed, 8 insertions(+), 16 deletions(-)

diff --git a/monitor.c b/monitor.c
index 46814af533..ce5cc5e34e 100644
--- a/monitor.c
+++ b/monitor.c
@@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
     bool use_readline = flags & MONITOR_USE_READLINE;
     bool use_oob = flags & MONITOR_USE_OOB;
 
-    if (use_oob) {
-        if (CHARDEV_IS_MUX(chr)) {
-            error_report("Monitor Out-Of-Band is not supported with "
-                         "MUX typed chardev backend");
-            exit(1);
-        }
-        if (use_readline) {
-            error_report("Monitor Out-Of-band is only supported by QMP");
-            exit(1);
-        }
+    if (CHARDEV_IS_MUX(chr)) {
+        /* MUX is still not supported for Out-Of-Band */
+        use_oob = false;
     }
 
     monitor_data_init(mon, false, use_oob);
diff --git a/tests/qmp-test.c b/tests/qmp-test.c
index 88f867f8c0..c85a3964d9 100644
--- a/tests/qmp-test.c
+++ b/tests/qmp-test.c
@@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
     g_assert(q);
     test_version(qdict_get(q, "version"));
     capabilities = qdict_get_qlist(q, "capabilities");
-    g_assert(capabilities && qlist_empty(capabilities));
+    g_assert(capabilities);
     qobject_unref(resp);
 
     /* Test valid command before handshake */
diff --git a/vl.c b/vl.c
index 3b39bbd7a8..b71fb8eb25 100644
--- a/vl.c
+++ b/vl.c
@@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
         flags = MONITOR_USE_READLINE;
     } else if (strcmp(mode, "control") == 0) {
         flags = MONITOR_USE_CONTROL;
+        /* Out-Of-Band is on by default */
+        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
+            flags |= MONITOR_USE_OOB;
+        }
     } else {
         error_report("unknown monitor mode \"%s\"", mode);
         exit(1);
@@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
     if (qemu_opt_get_bool(opts, "pretty", 0))
         flags |= MONITOR_USE_PRETTY;
 
-    /* OOB is off by default */
-    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
-        flags |= MONITOR_USE_OOB;
-    }
-
     chardev = qemu_opt_get(opts, "chardev");
     chr = qemu_chr_find(chardev);
     if (chr == NULL) {
-- 
2.17.0


Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Daniel P. Berrangé 5 years, 11 months ago
On Mon, May 21, 2018 at 04:42:49PM +0800, Peter Xu wrote:
> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> try to turn that on again.
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> --
> Now OOB should be okay with all known tests (except iotest qcow2, since
> it is still broken on master), and AFAIK now we should also be okay with
> ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> the release).  So I think it's now safe to turn OOB on again.  Please
> feel free to test this against any of existing testsuites to see whether
> it'll still break any stuff.  Thanks,

Can you remind us what the problem was with ARM+Libvirt that this
caused ?

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>  monitor.c        | 13 +++----------
>  tests/qmp-test.c |  2 +-
>  vl.c             |  9 ++++-----
>  3 files changed, 8 insertions(+), 16 deletions(-)
> 
> diff --git a/monitor.c b/monitor.c
> index 46814af533..ce5cc5e34e 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
>      bool use_readline = flags & MONITOR_USE_READLINE;
>      bool use_oob = flags & MONITOR_USE_OOB;
>  
> -    if (use_oob) {
> -        if (CHARDEV_IS_MUX(chr)) {
> -            error_report("Monitor Out-Of-Band is not supported with "
> -                         "MUX typed chardev backend");
> -            exit(1);
> -        }
> -        if (use_readline) {
> -            error_report("Monitor Out-Of-band is only supported by QMP");
> -            exit(1);
> -        }
> +    if (CHARDEV_IS_MUX(chr)) {
> +        /* MUX is still not supported for Out-Of-Band */
> +        use_oob = false;
>      }
>  
>      monitor_data_init(mon, false, use_oob);
> diff --git a/tests/qmp-test.c b/tests/qmp-test.c
> index 88f867f8c0..c85a3964d9 100644
> --- a/tests/qmp-test.c
> +++ b/tests/qmp-test.c
> @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
>      g_assert(q);
>      test_version(qdict_get(q, "version"));
>      capabilities = qdict_get_qlist(q, "capabilities");
> -    g_assert(capabilities && qlist_empty(capabilities));
> +    g_assert(capabilities);
>      qobject_unref(resp);
>  
>      /* Test valid command before handshake */
> diff --git a/vl.c b/vl.c
> index 3b39bbd7a8..b71fb8eb25 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>          flags = MONITOR_USE_READLINE;
>      } else if (strcmp(mode, "control") == 0) {
>          flags = MONITOR_USE_CONTROL;
> +        /* Out-Of-Band is on by default */
> +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
> +            flags |= MONITOR_USE_OOB;
> +        }
>      } else {
>          error_report("unknown monitor mode \"%s\"", mode);
>          exit(1);
> @@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>      if (qemu_opt_get_bool(opts, "pretty", 0))
>          flags |= MONITOR_USE_PRETTY;
>  
> -    /* OOB is off by default */
> -    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
> -        flags |= MONITOR_USE_OOB;
> -    }
> -
>      chardev = qemu_opt_get(opts, "chardev");
>      chr = qemu_chr_find(chardev);
>      if (chr == NULL) {
> -- 
> 2.17.0
> 

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
On Mon, May 21, 2018 at 09:59:07AM +0100, Daniel P. Berrangé wrote:
> On Mon, May 21, 2018 at 04:42:49PM +0800, Peter Xu wrote:
> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> > try to turn that on again.
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > --
> > Now OOB should be okay with all known tests (except iotest qcow2, since
> > it is still broken on master), and AFAIK now we should also be okay with
> > ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> > the release).  So I think it's now safe to turn OOB on again.  Please
> > feel free to test this against any of existing testsuites to see whether
> > it'll still break any stuff.  Thanks,
> 
> Can you remind us what the problem was with ARM+Libvirt that this
> caused ?

Sure.  The regression reported by Eric Auger is:

http://lists.gnu.org/archive/html/qemu-devel/2018-03/msg06231.html

The fix is:

https://patchwork.kernel.org/patch/10332289/

And the fix is in master already.

Regards,

-- 
Peter Xu

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Eric Blake 5 years, 11 months ago
On 05/21/2018 03:42 AM, Peter Xu wrote:
> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> try to turn that on again.

"try to turn" sounds weak, like you aren't sure of this patch.  If you 
aren't sure, then why should we feel safe in applying it?  This text is 
going in the permanent git history, so sound bold, rather than hesitant!

"We have resolved the issues from last time (commit 3fd2457d reverted by 
commit a4f90923):
- issue 1 ...
- issue 2 ...
So now we are ready to enable advertisement of the feature by default"

with better descriptions of the issues that you fixed (I can think of at 
least the fixes adding thread-safety to the current monitor, and fixing 
early use of the monitor before qmp_capabilities completes; there may 
also be other issues that you want to call out).

> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> --
> Now OOB should be okay with all known tests (except iotest qcow2, since
> it is still broken on master),

Which tests are still failing for you?  Ideally, you can still 
demonstrate that the tests not failing without this patch continue to 
pass with this patch, even if you call out the tests that have known 
issues to still be resolved.

> and AFAIK now we should also be okay with
> ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> the release).  So I think it's now safe to turn OOB on again.  Please
> feel free to test this against any of existing testsuites to see whether
> it'll still break any stuff.  Thanks,
> 
> Signed-off-by: Peter Xu <peterx@redhat.com>
> ---
>   monitor.c        | 13 +++----------
>   tests/qmp-test.c |  2 +-
>   vl.c             |  9 ++++-----
>   3 files changed, 8 insertions(+), 16 deletions(-)
> 
> diff --git a/monitor.c b/monitor.c
> index 46814af533..ce5cc5e34e 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
>       bool use_readline = flags & MONITOR_USE_READLINE;
>       bool use_oob = flags & MONITOR_USE_OOB;
>   
> -    if (use_oob) {
> -        if (CHARDEV_IS_MUX(chr)) {
> -            error_report("Monitor Out-Of-Band is not supported with "
> -                         "MUX typed chardev backend");
> -            exit(1);
> -        }
> -        if (use_readline) {
> -            error_report("Monitor Out-Of-band is only supported by QMP");
> -            exit(1);
> -        }
> +    if (CHARDEV_IS_MUX(chr)) {
> +        /* MUX is still not supported for Out-Of-Band */
> +        use_oob = false;

This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB 
when using readline (which presumably is a synonym for using HMP).  Is 
that intentional?  If so, the commit message should mention it.

>       }
>   
>       monitor_data_init(mon, false, use_oob);
> diff --git a/tests/qmp-test.c b/tests/qmp-test.c
> index 88f867f8c0..c85a3964d9 100644
> --- a/tests/qmp-test.c
> +++ b/tests/qmp-test.c
> @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
>       g_assert(q);
>       test_version(qdict_get(q, "version"));
>       capabilities = qdict_get_qlist(q, "capabilities");
> -    g_assert(capabilities && qlist_empty(capabilities));
> +    g_assert(capabilities);
>       qobject_unref(resp);
>   
>       /* Test valid command before handshake */
> diff --git a/vl.c b/vl.c
> index 3b39bbd7a8..b71fb8eb25 100644
> --- a/vl.c
> +++ b/vl.c
> @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>           flags = MONITOR_USE_READLINE;
>       } else if (strcmp(mode, "control") == 0) {
>           flags = MONITOR_USE_CONTROL;
> +        /* Out-Of-Band is on by default */
> +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
> +            flags |= MONITOR_USE_OOB;
> +        }

Do we really still need the x-oob property, vs. outright deletion of 
this bandaid?  Then again, I guess keeping it for one more release makes 
it easier to forcefully turn things off for temporary testing when 
isolating whether OOB is a culprit in something breaking.

>       } else {
>           error_report("unknown monitor mode \"%s\"", mode);
>           exit(1);
> @@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>       if (qemu_opt_get_bool(opts, "pretty", 0))
>           flags |= MONITOR_USE_PRETTY;
>   
> -    /* OOB is off by default */
> -    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
> -        flags |= MONITOR_USE_OOB;
> -    }
> -
>       chardev = qemu_opt_get(opts, "chardev");
>       chr = qemu_chr_find(chardev);
>       if (chr == NULL) {
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
On Mon, May 21, 2018 at 09:13:06AM -0500, Eric Blake wrote:
> On 05/21/2018 03:42 AM, Peter Xu wrote:
> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> > try to turn that on again.
> 
> "try to turn" sounds weak, like you aren't sure of this patch.  If you
> aren't sure, then why should we feel safe in applying it?  This text is
> going in the permanent git history, so sound bold, rather than hesitant!

Yeah I am not really strong at turn that on by default, that's why I
marked the patch as RFC.  I wanted to hear from your opinions.  For
now IMHO even with x-oob, postcopy can start to work with network
recovery, then the requirement from my part is done.  However I'm
thinking maybe we should still turn that on for all the people.  One
reason is that we already have the QMP capability negotiation so it
seems redundant (as you mentioned before), meanwhile exposing it to
broader users can let broader users to leverage this new bits directly
(meanwhile easier to expose potential issues for OOB too).

Meanwhile I'm not confident too on that there can be other test cases
that has not yet been run with Out-Of-Band, so even if we solved all
existing problems I can't be sure that no further test will broke.
However I don't see it a problem for merging, since AFAIU I can't
really know what will break again (if there is) unless we apply that
to master again... :)

> 
> "We have resolved the issues from last time (commit 3fd2457d reverted by
> commit a4f90923):
> - issue 1 ...
> - issue 2 ...
> So now we are ready to enable advertisement of the feature by default"
> 
> with better descriptions of the issues that you fixed (I can think of at
> least the fixes adding thread-safety to the current monitor, and fixing
> early use of the monitor before qmp_capabilities completes; there may also
> be other issues that you want to call out).

Some of the monitor patches are not really related to previous OOB
breakage, the only one that really matters should be the ARM+Libvirt
one, which I will definitely mention in my next post.  The rest
(including per-thread cur_mon, monitor thread-safety, etc.) should
mostly for future new commands of Out-Of-Band but not for now.  For
example, current OOB commands are rare, now they don't use the
get_fd()/set_fd() interface, then the mon_fdsets won't need to be
protected at all.  But we can't guarantee that new OOB commands won't
use them too, so we still need to protect them with locks.

> 
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > --
> > Now OOB should be okay with all known tests (except iotest qcow2, since
> > it is still broken on master),
> 
> Which tests are still failing for you?  Ideally, you can still demonstrate
> that the tests not failing without this patch continue to pass with this
> patch, even if you call out the tests that have known issues to still be
> resolved.

I didn't remember.  We can first settle down on whether we'd like to
turn on this default value, then I can perform this test for my next
post to make sure good tests won't break.

> 
> > and AFAIK now we should also be okay with
> > ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> > the release).  So I think it's now safe to turn OOB on again.  Please
> > feel free to test this against any of existing testsuites to see whether
> > it'll still break any stuff.  Thanks,
> > 
> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > ---
> >   monitor.c        | 13 +++----------
> >   tests/qmp-test.c |  2 +-
> >   vl.c             |  9 ++++-----
> >   3 files changed, 8 insertions(+), 16 deletions(-)
> > 
> > diff --git a/monitor.c b/monitor.c
> > index 46814af533..ce5cc5e34e 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
> >       bool use_readline = flags & MONITOR_USE_READLINE;
> >       bool use_oob = flags & MONITOR_USE_OOB;
> > -    if (use_oob) {
> > -        if (CHARDEV_IS_MUX(chr)) {
> > -            error_report("Monitor Out-Of-Band is not supported with "
> > -                         "MUX typed chardev backend");
> > -            exit(1);
> > -        }
> > -        if (use_readline) {
> > -            error_report("Monitor Out-Of-band is only supported by QMP");
> > -            exit(1);
> > -        }
> > +    if (CHARDEV_IS_MUX(chr)) {
> > +        /* MUX is still not supported for Out-Of-Band */
> > +        use_oob = false;
> 
> This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB when
> using readline (which presumably is a synonym for using HMP).  Is that
> intentional?  If so, the commit message should mention it.

At [1] below I directly moved the chunk into "mode=control" path, so
the QMP check is already there.  Here I turn OOB off explicitly for
MUX no matter HMP/QMP.  It should have the same affect as 3fd2457d.

> 
> >       }
> >       monitor_data_init(mon, false, use_oob);
> > diff --git a/tests/qmp-test.c b/tests/qmp-test.c
> > index 88f867f8c0..c85a3964d9 100644
> > --- a/tests/qmp-test.c
> > +++ b/tests/qmp-test.c
> > @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
> >       g_assert(q);
> >       test_version(qdict_get(q, "version"));
> >       capabilities = qdict_get_qlist(q, "capabilities");
> > -    g_assert(capabilities && qlist_empty(capabilities));
> > +    g_assert(capabilities);
> >       qobject_unref(resp);
> >       /* Test valid command before handshake */
> > diff --git a/vl.c b/vl.c
> > index 3b39bbd7a8..b71fb8eb25 100644
> > --- a/vl.c
> > +++ b/vl.c
> > @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
> >           flags = MONITOR_USE_READLINE;
> >       } else if (strcmp(mode, "control") == 0) {
> >           flags = MONITOR_USE_CONTROL;
> > +        /* Out-Of-Band is on by default */
> > +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
> > +            flags |= MONITOR_USE_OOB;
> > +        }

[1]

> 
> Do we really still need the x-oob property, vs. outright deletion of this
> bandaid?  Then again, I guess keeping it for one more release makes it
> easier to forcefully turn things off for temporary testing when isolating
> whether OOB is a culprit in something breaking.

Yes, since we already have it, I didn't have strong opinion to remove
it, so I kept it.  Actually now we can't turn OOB off completely
already - it's fully embeded in QMP internal logic (e.g., now we'll
always have completely isolated parser, dispatcher, responder; we
can't go back to the old func-call ways any more).  So maybe I can
remove that parameter directly next time.

Thanks,

> 
> >       } else {
> >           error_report("unknown monitor mode \"%s\"", mode);
> >           exit(1);
> > @@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
> >       if (qemu_opt_get_bool(opts, "pretty", 0))
> >           flags |= MONITOR_USE_PRETTY;
> > -    /* OOB is off by default */
> > -    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
> > -        flags |= MONITOR_USE_OOB;
> > -    }
> > -
> >       chardev = qemu_opt_get(opts, "chardev");
> >       chr = qemu_chr_find(chardev);
> >       if (chr == NULL) {
> > 
> 
> -- 
> Eric Blake, Principal Software Engineer
> Red Hat, Inc.           +1-919-301-3266
> Virtualization:  qemu.org | libvirt.org

-- 
Peter Xu

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Markus Armbruster 5 years, 11 months ago
Peter Xu <peterx@redhat.com> writes:

> On Mon, May 21, 2018 at 09:13:06AM -0500, Eric Blake wrote:
>> On 05/21/2018 03:42 AM, Peter Xu wrote:
>> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
>> > try to turn that on again.
>> 
>> "try to turn" sounds weak, like you aren't sure of this patch.  If you
>> aren't sure, then why should we feel safe in applying it?  This text is
>> going in the permanent git history, so sound bold, rather than hesitant!
>
> Yeah I am not really strong at turn that on by default, that's why I
> marked the patch as RFC.  I wanted to hear from your opinions.  For
> now IMHO even with x-oob, postcopy can start to work with network
> recovery, then the requirement from my part is done.  However I'm
> thinking maybe we should still turn that on for all the people.  One
> reason is that we already have the QMP capability negotiation so it
> seems redundant (as you mentioned before), meanwhile exposing it to
> broader users can let broader users to leverage this new bits directly
> (meanwhile easier to expose potential issues for OOB too).
>
> Meanwhile I'm not confident too on that there can be other test cases
> that has not yet been run with Out-Of-Band, so even if we solved all
> existing problems I can't be sure that no further test will broke.
> However I don't see it a problem for merging, since AFAIU I can't
> really know what will break again (if there is) unless we apply that
> to master again... :)

The time to switch it on is now.  I don't think we can find remaining
issues (if any) unless we do.

The time for doubts is the day before rc0 ;)

>> "We have resolved the issues from last time (commit 3fd2457d reverted by
>> commit a4f90923):
>> - issue 1 ...
>> - issue 2 ...
>> So now we are ready to enable advertisement of the feature by default"
>> 
>> with better descriptions of the issues that you fixed (I can think of at
>> least the fixes adding thread-safety to the current monitor, and fixing
>> early use of the monitor before qmp_capabilities completes; there may also
>> be other issues that you want to call out).
>
> Some of the monitor patches are not really related to previous OOB
> breakage, the only one that really matters should be the ARM+Libvirt
> one, which I will definitely mention in my next post.  The rest
> (including per-thread cur_mon, monitor thread-safety, etc.) should
> mostly for future new commands of Out-Of-Band but not for now.  For
> example, current OOB commands are rare, now they don't use the
> get_fd()/set_fd() interface, then the mon_fdsets won't need to be
> protected at all.

A lock works only when all its critical sections are guaranteed to
execute quickly!  Remember, for OOB to work, all OOB commands must
execute quickly, always.  Rule of thumb: treat OOB command code like
realtime code.

>                    But we can't guarantee that new OOB commands won't
> use them too, so we still need to protect them with locks.

When creating or modifying an OOB command, you need to be extra careful
about shared state: you may have to add suitable synchronization.

Please add suitable warnings about use of allow-oob to
docs/qapi-code-gen.txt.  Perhaps even a separate document on OOB,
pointed to by qapi-code-gen.txt.

>> > 
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > --
>> > Now OOB should be okay with all known tests (except iotest qcow2, since
>> > it is still broken on master),
>> 
>> Which tests are still failing for you?  Ideally, you can still demonstrate
>> that the tests not failing without this patch continue to pass with this
>> patch, even if you call out the tests that have known issues to still be
>> resolved.
>
> I didn't remember.  We can first settle down on whether we'd like to
> turn on this default value, then I can perform this test for my next
> post to make sure good tests won't break.
>
>> 
>> > and AFAIK now we should also be okay with
>> > ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
>> > the release).  So I think it's now safe to turn OOB on again.  Please
>> > feel free to test this against any of existing testsuites to see whether
>> > it'll still break any stuff.  Thanks,
>> > 
>> > Signed-off-by: Peter Xu <peterx@redhat.com>
>> > ---
>> >   monitor.c        | 13 +++----------
>> >   tests/qmp-test.c |  2 +-
>> >   vl.c             |  9 ++++-----
>> >   3 files changed, 8 insertions(+), 16 deletions(-)
>> > 
>> > diff --git a/monitor.c b/monitor.c
>> > index 46814af533..ce5cc5e34e 100644
>> > --- a/monitor.c
>> > +++ b/monitor.c
>> > @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
>> >       bool use_readline = flags & MONITOR_USE_READLINE;
>> >       bool use_oob = flags & MONITOR_USE_OOB;
>> > -    if (use_oob) {
>> > -        if (CHARDEV_IS_MUX(chr)) {
>> > -            error_report("Monitor Out-Of-Band is not supported with "
>> > -                         "MUX typed chardev backend");
>> > -            exit(1);
>> > -        }
>> > -        if (use_readline) {
>> > -            error_report("Monitor Out-Of-band is only supported by QMP");
>> > -            exit(1);
>> > -        }
>> > +    if (CHARDEV_IS_MUX(chr)) {
>> > +        /* MUX is still not supported for Out-Of-Band */
>> > +        use_oob = false;
>> 
>> This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB when
>> using readline (which presumably is a synonym for using HMP).

It effectively is a synonm now, but exploiting it would be unclean.  The
clean test for HMP is !(flags & MONITOR_USE_CONTROL).

>>                                                                Is that
>> intentional?  If so, the commit message should mention it.
>
> At [1] below I directly moved the chunk into "mode=control" path, so
> the QMP check is already there.  Here I turn OOB off explicitly for
> MUX no matter HMP/QMP.  It should have the same affect as 3fd2457d.

Switching off OOB silently for mux'ed chardev is ugly.  But I don't have
better ideas.

What's keeping us from supporting OOB there?

Is anyone using QMP over a mux'ed chardev in anger?  I guess we don't
know.

>> >       }
>> >       monitor_data_init(mon, false, use_oob);
>> > diff --git a/tests/qmp-test.c b/tests/qmp-test.c
>> > index 88f867f8c0..c85a3964d9 100644
>> > --- a/tests/qmp-test.c
>> > +++ b/tests/qmp-test.c
>> > @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
>> >       g_assert(q);
>> >       test_version(qdict_get(q, "version"));
>> >       capabilities = qdict_get_qlist(q, "capabilities");
>> > -    g_assert(capabilities && qlist_empty(capabilities));
>> > +    g_assert(capabilities);
>> >       qobject_unref(resp);
>> >       /* Test valid command before handshake */

Let's check contents of @capabilities matches expectations.

>> > diff --git a/vl.c b/vl.c
>> > index 3b39bbd7a8..b71fb8eb25 100644
>> > --- a/vl.c
>> > +++ b/vl.c
>> > @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>> >           flags = MONITOR_USE_READLINE;
>> >       } else if (strcmp(mode, "control") == 0) {
>> >           flags = MONITOR_USE_CONTROL;
>> > +        /* Out-Of-Band is on by default */
>> > +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
>> > +            flags |= MONITOR_USE_OOB;
>> > +        }
>
> [1]
>
>> 
>> Do we really still need the x-oob property, vs. outright deletion of this
>> bandaid?  Then again, I guess keeping it for one more release makes it
>> easier to forcefully turn things off for temporary testing when isolating
>> whether OOB is a culprit in something breaking.
>
> Yes, since we already have it, I didn't have strong opinion to remove
> it, so I kept it.  Actually now we can't turn OOB off completely
> already - it's fully embeded in QMP internal logic (e.g., now we'll
> always have completely isolated parser, dispatcher, responder; we
> can't go back to the old func-call ways any more).  So maybe I can
> remove that parameter directly next time.

I'm leaning towards removing it, because:

* There's no test coverage for x-oob=off

* We already have a way to disable OOB: the QMP client can decline the
  capability

* MONITOR_USE_OOB makes monitor flags even uglier: of the four flag
  bits' sixteen combinations, only a few actually occur:

    MONITOR_USE_READLINE  on   off
    MONITOR_USE_CONTROL   off  on
    MONITOR_USE_PRETTY    N/A  either
    MONITOR_USE_OOB       N/A  on unless mux'ed

  That's just three genuine configurations (HMP, QMP, pretty QMP), if we
  ignore the "mux'ed can't do OOB, yet" bit.

  Without x-oob, we can drop MONITOR_USE_OOB, I think.

>
> Thanks,
>
>> 
>> >       } else {
>> >           error_report("unknown monitor mode \"%s\"", mode);
>> >           exit(1);
>> > @@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
>> >       if (qemu_opt_get_bool(opts, "pretty", 0))
>> >           flags |= MONITOR_USE_PRETTY;
>> > -    /* OOB is off by default */
>> > -    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
>> > -        flags |= MONITOR_USE_OOB;
>> > -    }
>> > -
>> >       chardev = qemu_opt_get(opts, "chardev");
>> >       chr = qemu_chr_find(chardev);
>> >       if (chr == NULL) {
>> > 
>> 
>> -- 
>> Eric Blake, Principal Software Engineer
>> Red Hat, Inc.           +1-919-301-3266
>> Virtualization:  qemu.org | libvirt.org

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
On Thu, May 24, 2018 at 09:08:58AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, May 21, 2018 at 09:13:06AM -0500, Eric Blake wrote:
> >> On 05/21/2018 03:42 AM, Peter Xu wrote:
> >> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> >> > try to turn that on again.
> >> 
> >> "try to turn" sounds weak, like you aren't sure of this patch.  If you
> >> aren't sure, then why should we feel safe in applying it?  This text is
> >> going in the permanent git history, so sound bold, rather than hesitant!
> >
> > Yeah I am not really strong at turn that on by default, that's why I
> > marked the patch as RFC.  I wanted to hear from your opinions.  For
> > now IMHO even with x-oob, postcopy can start to work with network
> > recovery, then the requirement from my part is done.  However I'm
> > thinking maybe we should still turn that on for all the people.  One
> > reason is that we already have the QMP capability negotiation so it
> > seems redundant (as you mentioned before), meanwhile exposing it to
> > broader users can let broader users to leverage this new bits directly
> > (meanwhile easier to expose potential issues for OOB too).
> >
> > Meanwhile I'm not confident too on that there can be other test cases
> > that has not yet been run with Out-Of-Band, so even if we solved all
> > existing problems I can't be sure that no further test will broke.
> > However I don't see it a problem for merging, since AFAIU I can't
> > really know what will break again (if there is) unless we apply that
> > to master again... :)
> 
> The time to switch it on is now.  I don't think we can find remaining
> issues (if any) unless we do.
> 
> The time for doubts is the day before rc0 ;)

Haha. I bet you are right. :)

> 
> >> "We have resolved the issues from last time (commit 3fd2457d reverted by
> >> commit a4f90923):
> >> - issue 1 ...
> >> - issue 2 ...
> >> So now we are ready to enable advertisement of the feature by default"
> >> 
> >> with better descriptions of the issues that you fixed (I can think of at
> >> least the fixes adding thread-safety to the current monitor, and fixing
> >> early use of the monitor before qmp_capabilities completes; there may also
> >> be other issues that you want to call out).
> >
> > Some of the monitor patches are not really related to previous OOB
> > breakage, the only one that really matters should be the ARM+Libvirt
> > one, which I will definitely mention in my next post.  The rest
> > (including per-thread cur_mon, monitor thread-safety, etc.) should
> > mostly for future new commands of Out-Of-Band but not for now.  For
> > example, current OOB commands are rare, now they don't use the
> > get_fd()/set_fd() interface, then the mon_fdsets won't need to be
> > protected at all.
> 
> A lock works only when all its critical sections are guaranteed to
> execute quickly!  Remember, for OOB to work, all OOB commands must
> execute quickly, always.  Rule of thumb: treat OOB command code like
> realtime code.

Yes indeed I suspect the realtime work will have similar requirements.
It's a funny experience that we do realtime-alike considerations in
QEMU's control path. :)

I believe that's why Dave pointed out that when taking the mon_fdsets
lock (in the other patchset) we can't call close().  That's a good
lesson.

(and I just noticed I forgot to CC Dave for this...; adding in)

> 
> >                    But we can't guarantee that new OOB commands won't
> > use them too, so we still need to protect them with locks.
> 
> When creating or modifying an OOB command, you need to be extra careful
> about shared state: you may have to add suitable synchronization.
> 
> Please add suitable warnings about use of allow-oob to
> docs/qapi-code-gen.txt.  Perhaps even a separate document on OOB,
> pointed to by qapi-code-gen.txt.

Maybe append another bullet to existing?

==================
diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
index b9b6eabd08..13f86095d1 100644
--- a/docs/devel/qapi-code-gen.txt
+++ b/docs/devel/qapi-code-gen.txt
@@ -680,6 +680,9 @@ OOB command handlers must satisfy the following conditions:
 - It does not invoke system calls that may block,
 - It does not access guest RAM that may block when userfaultfd is
   enabled for postcopy live migration.
+- It needs to protect possible shared states, since as long as a
+  command supports Out-Of-Band it means the handler can be run in
+  parallel with the same handler running in the other thread.
 
 If in doubt, do not implement OOB execution support.
==================

> 
> >> > 
> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >> > --
> >> > Now OOB should be okay with all known tests (except iotest qcow2, since
> >> > it is still broken on master),
> >> 
> >> Which tests are still failing for you?  Ideally, you can still demonstrate
> >> that the tests not failing without this patch continue to pass with this
> >> patch, even if you call out the tests that have known issues to still be
> >> resolved.
> >
> > I didn't remember.  We can first settle down on whether we'd like to
> > turn on this default value, then I can perform this test for my next
> > post to make sure good tests won't break.
> >
> >> 
> >> > and AFAIK now we should also be okay with
> >> > ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> >> > the release).  So I think it's now safe to turn OOB on again.  Please
> >> > feel free to test this against any of existing testsuites to see whether
> >> > it'll still break any stuff.  Thanks,
> >> > 
> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >> > ---
> >> >   monitor.c        | 13 +++----------
> >> >   tests/qmp-test.c |  2 +-
> >> >   vl.c             |  9 ++++-----
> >> >   3 files changed, 8 insertions(+), 16 deletions(-)
> >> > 
> >> > diff --git a/monitor.c b/monitor.c
> >> > index 46814af533..ce5cc5e34e 100644
> >> > --- a/monitor.c
> >> > +++ b/monitor.c
> >> > @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
> >> >       bool use_readline = flags & MONITOR_USE_READLINE;
> >> >       bool use_oob = flags & MONITOR_USE_OOB;
> >> > -    if (use_oob) {
> >> > -        if (CHARDEV_IS_MUX(chr)) {
> >> > -            error_report("Monitor Out-Of-Band is not supported with "
> >> > -                         "MUX typed chardev backend");
> >> > -            exit(1);
> >> > -        }
> >> > -        if (use_readline) {
> >> > -            error_report("Monitor Out-Of-band is only supported by QMP");
> >> > -            exit(1);
> >> > -        }
> >> > +    if (CHARDEV_IS_MUX(chr)) {
> >> > +        /* MUX is still not supported for Out-Of-Band */
> >> > +        use_oob = false;
> >> 
> >> This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB when
> >> using readline (which presumably is a synonym for using HMP).
> 
> It effectively is a synonm now, but exploiting it would be unclean.  The
> clean test for HMP is !(flags & MONITOR_USE_CONTROL).
> 
> >>                                                                Is that
> >> intentional?  If so, the commit message should mention it.
> >
> > At [1] below I directly moved the chunk into "mode=control" path, so
> > the QMP check is already there.  Here I turn OOB off explicitly for
> > MUX no matter HMP/QMP.  It should have the same affect as 3fd2457d.
> 
> Switching off OOB silently for mux'ed chardev is ugly.  But I don't have
> better ideas.
> 
> What's keeping us from supporting OOB there?

Monitor is only a single chardev frontend.  The whole OOB work allows
monitor frontend to run isolatedly, but it never applies to any other
frontend.  Considering that MUX can support arbitrary frontends to
share the single backend chardev, I suppose "supporting OOB for MUX"
basically means to support OOB for every existing chardev frontend...

> 
> Is anyone using QMP over a mux'ed chardev in anger?  I guess we don't
> know.

IMHO it will be fine even for anyone who likes MUXed - he/she just
needs an extra QMP channel only for OOB.

(Or would the anger be a good driving force to push someone to add
 complete support for Out-Of-Band?  I confess, I am not angry
 enough... :)

> 
> >> >       }
> >> >       monitor_data_init(mon, false, use_oob);
> >> > diff --git a/tests/qmp-test.c b/tests/qmp-test.c
> >> > index 88f867f8c0..c85a3964d9 100644
> >> > --- a/tests/qmp-test.c
> >> > +++ b/tests/qmp-test.c
> >> > @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
> >> >       g_assert(q);
> >> >       test_version(qdict_get(q, "version"));
> >> >       capabilities = qdict_get_qlist(q, "capabilities");
> >> > -    g_assert(capabilities && qlist_empty(capabilities));
> >> > +    g_assert(capabilities);
> >> >       qobject_unref(resp);
> >> >       /* Test valid command before handshake */
> 
> Let's check contents of @capabilities matches expectations.

It'll be checked in the other OOB specified test (test_qmp_oob).

> 
> >> > diff --git a/vl.c b/vl.c
> >> > index 3b39bbd7a8..b71fb8eb25 100644
> >> > --- a/vl.c
> >> > +++ b/vl.c
> >> > @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
> >> >           flags = MONITOR_USE_READLINE;
> >> >       } else if (strcmp(mode, "control") == 0) {
> >> >           flags = MONITOR_USE_CONTROL;
> >> > +        /* Out-Of-Band is on by default */
> >> > +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
> >> > +            flags |= MONITOR_USE_OOB;
> >> > +        }
> >
> > [1]
> >
> >> 
> >> Do we really still need the x-oob property, vs. outright deletion of this
> >> bandaid?  Then again, I guess keeping it for one more release makes it
> >> easier to forcefully turn things off for temporary testing when isolating
> >> whether OOB is a culprit in something breaking.
> >
> > Yes, since we already have it, I didn't have strong opinion to remove
> > it, so I kept it.  Actually now we can't turn OOB off completely
> > already - it's fully embeded in QMP internal logic (e.g., now we'll
> > always have completely isolated parser, dispatcher, responder; we
> > can't go back to the old func-call ways any more).  So maybe I can
> > remove that parameter directly next time.
> 
> I'm leaning towards removing it, because:
> 
> * There's no test coverage for x-oob=off
> 
> * We already have a way to disable OOB: the QMP client can decline the
>   capability
> 
> * MONITOR_USE_OOB makes monitor flags even uglier: of the four flag
>   bits' sixteen combinations, only a few actually occur:
> 
>     MONITOR_USE_READLINE  on   off
>     MONITOR_USE_CONTROL   off  on
>     MONITOR_USE_PRETTY    N/A  either
>     MONITOR_USE_OOB       N/A  on unless mux'ed
> 
>   That's just three genuine configurations (HMP, QMP, pretty QMP), if we
>   ignore the "mux'ed can't do OOB, yet" bit.
> 
>   Without x-oob, we can drop MONITOR_USE_OOB, I think.

Yes, let me remove it.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Dr. David Alan Gilbert 5 years, 11 months ago
* Peter Xu (peterx@redhat.com) wrote:
> On Thu, May 24, 2018 at 09:08:58AM +0200, Markus Armbruster wrote:
> > Peter Xu <peterx@redhat.com> writes:
> > 
> > > On Mon, May 21, 2018 at 09:13:06AM -0500, Eric Blake wrote:
> > >> On 05/21/2018 03:42 AM, Peter Xu wrote:
> > >> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> > >> > try to turn that on again.
> > >> 
> > >> "try to turn" sounds weak, like you aren't sure of this patch.  If you
> > >> aren't sure, then why should we feel safe in applying it?  This text is
> > >> going in the permanent git history, so sound bold, rather than hesitant!
> > >
> > > Yeah I am not really strong at turn that on by default, that's why I
> > > marked the patch as RFC.  I wanted to hear from your opinions.  For
> > > now IMHO even with x-oob, postcopy can start to work with network
> > > recovery, then the requirement from my part is done.  However I'm
> > > thinking maybe we should still turn that on for all the people.  One
> > > reason is that we already have the QMP capability negotiation so it
> > > seems redundant (as you mentioned before), meanwhile exposing it to
> > > broader users can let broader users to leverage this new bits directly
> > > (meanwhile easier to expose potential issues for OOB too).
> > >
> > > Meanwhile I'm not confident too on that there can be other test cases
> > > that has not yet been run with Out-Of-Band, so even if we solved all
> > > existing problems I can't be sure that no further test will broke.
> > > However I don't see it a problem for merging, since AFAIU I can't
> > > really know what will break again (if there is) unless we apply that
> > > to master again... :)
> > 
> > The time to switch it on is now.  I don't think we can find remaining
> > issues (if any) unless we do.
> > 
> > The time for doubts is the day before rc0 ;)
> 
> Haha. I bet you are right. :)
> 
> > 
> > >> "We have resolved the issues from last time (commit 3fd2457d reverted by
> > >> commit a4f90923):
> > >> - issue 1 ...
> > >> - issue 2 ...
> > >> So now we are ready to enable advertisement of the feature by default"
> > >> 
> > >> with better descriptions of the issues that you fixed (I can think of at
> > >> least the fixes adding thread-safety to the current monitor, and fixing
> > >> early use of the monitor before qmp_capabilities completes; there may also
> > >> be other issues that you want to call out).
> > >
> > > Some of the monitor patches are not really related to previous OOB
> > > breakage, the only one that really matters should be the ARM+Libvirt
> > > one, which I will definitely mention in my next post.  The rest
> > > (including per-thread cur_mon, monitor thread-safety, etc.) should
> > > mostly for future new commands of Out-Of-Band but not for now.  For
> > > example, current OOB commands are rare, now they don't use the
> > > get_fd()/set_fd() interface, then the mon_fdsets won't need to be
> > > protected at all.
> > 
> > A lock works only when all its critical sections are guaranteed to
> > execute quickly!  Remember, for OOB to work, all OOB commands must
> > execute quickly, always.  Rule of thumb: treat OOB command code like
> > realtime code.
> 
> Yes indeed I suspect the realtime work will have similar requirements.
> It's a funny experience that we do realtime-alike considerations in
> QEMU's control path. :)
> 
> I believe that's why Dave pointed out that when taking the mon_fdsets
> lock (in the other patchset) we can't call close().  That's a good
> lesson.

I don't worry too much about the length of time; what I worry about 
is things that might actually hang (OK, a TCP timeout is too long
a length of time!); so things like close and other things that work
on sockets are just asking for trouble.
I wish there was a way we could annotate code called from OOB
context and get some tools to automatically spot problems!

Dave

> (and I just noticed I forgot to CC Dave for this...; adding in)
> 
> > 
> > >                    But we can't guarantee that new OOB commands won't
> > > use them too, so we still need to protect them with locks.
> > 
> > When creating or modifying an OOB command, you need to be extra careful
> > about shared state: you may have to add suitable synchronization.
> > 
> > Please add suitable warnings about use of allow-oob to
> > docs/qapi-code-gen.txt.  Perhaps even a separate document on OOB,
> > pointed to by qapi-code-gen.txt.
> 
> Maybe append another bullet to existing?
> 
> ==================
> diff --git a/docs/devel/qapi-code-gen.txt b/docs/devel/qapi-code-gen.txt
> index b9b6eabd08..13f86095d1 100644
> --- a/docs/devel/qapi-code-gen.txt
> +++ b/docs/devel/qapi-code-gen.txt
> @@ -680,6 +680,9 @@ OOB command handlers must satisfy the following conditions:
>  - It does not invoke system calls that may block,
>  - It does not access guest RAM that may block when userfaultfd is
>    enabled for postcopy live migration.
> +- It needs to protect possible shared states, since as long as a
> +  command supports Out-Of-Band it means the handler can be run in
> +  parallel with the same handler running in the other thread.
>  
>  If in doubt, do not implement OOB execution support.
> ==================
> 
> > 
> > >> > 
> > >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > >> > --
> > >> > Now OOB should be okay with all known tests (except iotest qcow2, since
> > >> > it is still broken on master),
> > >> 
> > >> Which tests are still failing for you?  Ideally, you can still demonstrate
> > >> that the tests not failing without this patch continue to pass with this
> > >> patch, even if you call out the tests that have known issues to still be
> > >> resolved.
> > >
> > > I didn't remember.  We can first settle down on whether we'd like to
> > > turn on this default value, then I can perform this test for my next
> > > post to make sure good tests won't break.
> > >
> > >> 
> > >> > and AFAIK now we should also be okay with
> > >> > ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
> > >> > the release).  So I think it's now safe to turn OOB on again.  Please
> > >> > feel free to test this against any of existing testsuites to see whether
> > >> > it'll still break any stuff.  Thanks,
> > >> > 
> > >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> > >> > ---
> > >> >   monitor.c        | 13 +++----------
> > >> >   tests/qmp-test.c |  2 +-
> > >> >   vl.c             |  9 ++++-----
> > >> >   3 files changed, 8 insertions(+), 16 deletions(-)
> > >> > 
> > >> > diff --git a/monitor.c b/monitor.c
> > >> > index 46814af533..ce5cc5e34e 100644
> > >> > --- a/monitor.c
> > >> > +++ b/monitor.c
> > >> > @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
> > >> >       bool use_readline = flags & MONITOR_USE_READLINE;
> > >> >       bool use_oob = flags & MONITOR_USE_OOB;
> > >> > -    if (use_oob) {
> > >> > -        if (CHARDEV_IS_MUX(chr)) {
> > >> > -            error_report("Monitor Out-Of-Band is not supported with "
> > >> > -                         "MUX typed chardev backend");
> > >> > -            exit(1);
> > >> > -        }
> > >> > -        if (use_readline) {
> > >> > -            error_report("Monitor Out-Of-band is only supported by QMP");
> > >> > -            exit(1);
> > >> > -        }
> > >> > +    if (CHARDEV_IS_MUX(chr)) {
> > >> > +        /* MUX is still not supported for Out-Of-Band */
> > >> > +        use_oob = false;
> > >> 
> > >> This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB when
> > >> using readline (which presumably is a synonym for using HMP).
> > 
> > It effectively is a synonm now, but exploiting it would be unclean.  The
> > clean test for HMP is !(flags & MONITOR_USE_CONTROL).
> > 
> > >>                                                                Is that
> > >> intentional?  If so, the commit message should mention it.
> > >
> > > At [1] below I directly moved the chunk into "mode=control" path, so
> > > the QMP check is already there.  Here I turn OOB off explicitly for
> > > MUX no matter HMP/QMP.  It should have the same affect as 3fd2457d.
> > 
> > Switching off OOB silently for mux'ed chardev is ugly.  But I don't have
> > better ideas.
> > 
> > What's keeping us from supporting OOB there?
> 
> Monitor is only a single chardev frontend.  The whole OOB work allows
> monitor frontend to run isolatedly, but it never applies to any other
> frontend.  Considering that MUX can support arbitrary frontends to
> share the single backend chardev, I suppose "supporting OOB for MUX"
> basically means to support OOB for every existing chardev frontend...
> 
> > 
> > Is anyone using QMP over a mux'ed chardev in anger?  I guess we don't
> > know.
> 
> IMHO it will be fine even for anyone who likes MUXed - he/she just
> needs an extra QMP channel only for OOB.
> 
> (Or would the anger be a good driving force to push someone to add
>  complete support for Out-Of-Band?  I confess, I am not angry
>  enough... :)
> 
> > 
> > >> >       }
> > >> >       monitor_data_init(mon, false, use_oob);
> > >> > diff --git a/tests/qmp-test.c b/tests/qmp-test.c
> > >> > index 88f867f8c0..c85a3964d9 100644
> > >> > --- a/tests/qmp-test.c
> > >> > +++ b/tests/qmp-test.c
> > >> > @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
> > >> >       g_assert(q);
> > >> >       test_version(qdict_get(q, "version"));
> > >> >       capabilities = qdict_get_qlist(q, "capabilities");
> > >> > -    g_assert(capabilities && qlist_empty(capabilities));
> > >> > +    g_assert(capabilities);
> > >> >       qobject_unref(resp);
> > >> >       /* Test valid command before handshake */
> > 
> > Let's check contents of @capabilities matches expectations.
> 
> It'll be checked in the other OOB specified test (test_qmp_oob).
> 
> > 
> > >> > diff --git a/vl.c b/vl.c
> > >> > index 3b39bbd7a8..b71fb8eb25 100644
> > >> > --- a/vl.c
> > >> > +++ b/vl.c
> > >> > @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts *opts, Error **errp)
> > >> >           flags = MONITOR_USE_READLINE;
> > >> >       } else if (strcmp(mode, "control") == 0) {
> > >> >           flags = MONITOR_USE_CONTROL;
> > >> > +        /* Out-Of-Band is on by default */
> > >> > +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
> > >> > +            flags |= MONITOR_USE_OOB;
> > >> > +        }
> > >
> > > [1]
> > >
> > >> 
> > >> Do we really still need the x-oob property, vs. outright deletion of this
> > >> bandaid?  Then again, I guess keeping it for one more release makes it
> > >> easier to forcefully turn things off for temporary testing when isolating
> > >> whether OOB is a culprit in something breaking.
> > >
> > > Yes, since we already have it, I didn't have strong opinion to remove
> > > it, so I kept it.  Actually now we can't turn OOB off completely
> > > already - it's fully embeded in QMP internal logic (e.g., now we'll
> > > always have completely isolated parser, dispatcher, responder; we
> > > can't go back to the old func-call ways any more).  So maybe I can
> > > remove that parameter directly next time.
> > 
> > I'm leaning towards removing it, because:
> > 
> > * There's no test coverage for x-oob=off
> > 
> > * We already have a way to disable OOB: the QMP client can decline the
> >   capability
> > 
> > * MONITOR_USE_OOB makes monitor flags even uglier: of the four flag
> >   bits' sixteen combinations, only a few actually occur:
> > 
> >     MONITOR_USE_READLINE  on   off
> >     MONITOR_USE_CONTROL   off  on
> >     MONITOR_USE_PRETTY    N/A  either
> >     MONITOR_USE_OOB       N/A  on unless mux'ed
> > 
> >   That's just three genuine configurations (HMP, QMP, pretty QMP), if we
> >   ignore the "mux'ed can't do OOB, yet" bit.
> > 
> >   Without x-oob, we can drop MONITOR_USE_OOB, I think.
> 
> Yes, let me remove it.
> 
> Thanks,
> 
> -- 
> Peter Xu
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Stefan Hajnoczi 5 years, 10 months ago
On Thu, May 24, 2018 at 09:08:58AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Mon, May 21, 2018 at 09:13:06AM -0500, Eric Blake wrote:
> >> On 05/21/2018 03:42 AM, Peter Xu wrote:
> >> > We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> >> > try to turn that on again.
> >> 
> >> "try to turn" sounds weak, like you aren't sure of this patch.  If you
> >> aren't sure, then why should we feel safe in applying it?  This text is
> >> going in the permanent git history, so sound bold, rather than hesitant!
> >
> > Yeah I am not really strong at turn that on by default, that's why I
> > marked the patch as RFC.  I wanted to hear from your opinions.  For
> > now IMHO even with x-oob, postcopy can start to work with network
> > recovery, then the requirement from my part is done.  However I'm
> > thinking maybe we should still turn that on for all the people.  One
> > reason is that we already have the QMP capability negotiation so it
> > seems redundant (as you mentioned before), meanwhile exposing it to
> > broader users can let broader users to leverage this new bits directly
> > (meanwhile easier to expose potential issues for OOB too).
> >
> > Meanwhile I'm not confident too on that there can be other test cases
> > that has not yet been run with Out-Of-Band, so even if we solved all
> > existing problems I can't be sure that no further test will broke.
> > However I don't see it a problem for merging, since AFAIU I can't
> > really know what will break again (if there is) unless we apply that
> > to master again... :)
> 
> The time to switch it on is now.  I don't think we can find remaining
> issues (if any) unless we do.
> 
> The time for doubts is the day before rc0 ;)

I agree with Markus, let's enable OOB by default now.

The path to OOB has been long.  It's an invasive monitor change that
solves the postcopy recovery problem.  There have been lots of
complications but Peter has resolved them.

I personally don't think OOB is suitable for much else beyond postcopy
recovery (real-time code is a pain to write!), but we should switch to
this new code path and make sure it's solid for QEMU 3.0.

Stefan
Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by John Snow 5 years, 11 months ago

On 05/21/2018 10:13 AM, Eric Blake wrote:
> On 05/21/2018 03:42 AM, Peter Xu wrote:
>> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
>> try to turn that on again.
> 
> "try to turn" sounds weak, like you aren't sure of this patch.  If you
> aren't sure, then why should we feel safe in applying it?  This text is
> going in the permanent git history, so sound bold, rather than hesitant!
> 
> "We have resolved the issues from last time (commit 3fd2457d reverted by
> commit a4f90923):
> - issue 1 ...
> - issue 2 ...
> So now we are ready to enable advertisement of the feature by default"
> 
> with better descriptions of the issues that you fixed (I can think of at
> least the fixes adding thread-safety to the current monitor, and fixing
> early use of the monitor before qmp_capabilities completes; there may
> also be other issues that you want to call out).
> 
>>
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> -- 
>> Now OOB should be okay with all known tests (except iotest qcow2, since
>> it is still broken on master),
> 
> Which tests are still failing for you?  Ideally, you can still
> demonstrate that the tests not failing without this patch continue to
> pass with this patch, even if you call out the tests that have known
> issues to still be resolved.
> 

Probably 91 and 169. If any others fail that's news to me.

>> and AFAIK now we should also be okay with
>> ARM+Libvirt (not testsed, but Eric Auger helped to verify that before
>> the release).  So I think it's now safe to turn OOB on again.  Please
>> feel free to test this against any of existing testsuites to see whether
>> it'll still break any stuff.  Thanks,
>>
>> Signed-off-by: Peter Xu <peterx@redhat.com>
>> ---
>>   monitor.c        | 13 +++----------
>>   tests/qmp-test.c |  2 +-
>>   vl.c             |  9 ++++-----
>>   3 files changed, 8 insertions(+), 16 deletions(-)
>>
>> diff --git a/monitor.c b/monitor.c
>> index 46814af533..ce5cc5e34e 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -4560,16 +4560,9 @@ void monitor_init(Chardev *chr, int flags)
>>       bool use_readline = flags & MONITOR_USE_READLINE;
>>       bool use_oob = flags & MONITOR_USE_OOB;
>>   -    if (use_oob) {
>> -        if (CHARDEV_IS_MUX(chr)) {
>> -            error_report("Monitor Out-Of-Band is not supported with "
>> -                         "MUX typed chardev backend");
>> -            exit(1);
>> -        }
>> -        if (use_readline) {
>> -            error_report("Monitor Out-Of-band is only supported by
>> QMP");
>> -            exit(1);
>> -        }
>> +    if (CHARDEV_IS_MUX(chr)) {
>> +        /* MUX is still not supported for Out-Of-Band */
>> +        use_oob = false;
> 
> This isn't a mere reinstatement of 3fd2457d, but is now advertising OOB
> when using readline (which presumably is a synonym for using HMP).  Is
> that intentional?  If so, the commit message should mention it.
> 
>>       }
>>         monitor_data_init(mon, false, use_oob);
>> diff --git a/tests/qmp-test.c b/tests/qmp-test.c
>> index 88f867f8c0..c85a3964d9 100644
>> --- a/tests/qmp-test.c
>> +++ b/tests/qmp-test.c
>> @@ -89,7 +89,7 @@ static void test_qmp_protocol(void)
>>       g_assert(q);
>>       test_version(qdict_get(q, "version"));
>>       capabilities = qdict_get_qlist(q, "capabilities");
>> -    g_assert(capabilities && qlist_empty(capabilities));
>> +    g_assert(capabilities);
>>       qobject_unref(resp);
>>         /* Test valid command before handshake */
>> diff --git a/vl.c b/vl.c
>> index 3b39bbd7a8..b71fb8eb25 100644
>> --- a/vl.c
>> +++ b/vl.c
>> @@ -2394,6 +2394,10 @@ static int mon_init_func(void *opaque, QemuOpts
>> *opts, Error **errp)
>>           flags = MONITOR_USE_READLINE;
>>       } else if (strcmp(mode, "control") == 0) {
>>           flags = MONITOR_USE_CONTROL;
>> +        /* Out-Of-Band is on by default */
>> +        if (qemu_opt_get_bool(opts, "x-oob", 1)) {
>> +            flags |= MONITOR_USE_OOB;
>> +        }
> 
> Do we really still need the x-oob property, vs. outright deletion of
> this bandaid?  Then again, I guess keeping it for one more release makes
> it easier to forcefully turn things off for temporary testing when
> isolating whether OOB is a culprit in something breaking.
> 
>>       } else {
>>           error_report("unknown monitor mode \"%s\"", mode);
>>           exit(1);
>> @@ -2402,11 +2406,6 @@ static int mon_init_func(void *opaque, QemuOpts
>> *opts, Error **errp)
>>       if (qemu_opt_get_bool(opts, "pretty", 0))
>>           flags |= MONITOR_USE_PRETTY;
>>   -    /* OOB is off by default */
>> -    if (qemu_opt_get_bool(opts, "x-oob", 0)) {
>> -        flags |= MONITOR_USE_OOB;
>> -    }
>> -
>>       chardev = qemu_opt_get(opts, "chardev");
>>       chr = qemu_chr_find(chardev);
>>       if (chr == NULL) {
>>
> 

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
On Tue, May 22, 2018 at 02:40:26PM -0400, John Snow wrote:
> 
> 
> On 05/21/2018 10:13 AM, Eric Blake wrote:
> > On 05/21/2018 03:42 AM, Peter Xu wrote:
> >> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> >> try to turn that on again.
> > 
> > "try to turn" sounds weak, like you aren't sure of this patch.  If you
> > aren't sure, then why should we feel safe in applying it?  This text is
> > going in the permanent git history, so sound bold, rather than hesitant!
> > 
> > "We have resolved the issues from last time (commit 3fd2457d reverted by
> > commit a4f90923):
> > - issue 1 ...
> > - issue 2 ...
> > So now we are ready to enable advertisement of the feature by default"
> > 
> > with better descriptions of the issues that you fixed (I can think of at
> > least the fixes adding thread-safety to the current monitor, and fixing
> > early use of the monitor before qmp_capabilities completes; there may
> > also be other issues that you want to call out).
> > 
> >>
> >> Signed-off-by: Peter Xu <peterx@redhat.com>
> >> -- 
> >> Now OOB should be okay with all known tests (except iotest qcow2, since
> >> it is still broken on master),
> > 
> > Which tests are still failing for you?  Ideally, you can still
> > demonstrate that the tests not failing without this patch continue to
> > pass with this patch, even if you call out the tests that have known
> > issues to still be resolved.
> > 
> 
> Probably 91 and 169. If any others fail that's news to me.

I just gave it a shot on my workstation too (./check -qcow2):

Not run: 045 059 064 070 075 076 077 078 081 083 084 088 092 093 094 101 106 109 113 116 119 123 128 131 135 136 146 148 149 160 162 171 173 175 199 207 210 3
Failures: 087 188 189 198 206
Failed 5 of 167 tests

I'm testing against master, e609fa7.

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [RFC] monitor: turn on Out-Of-Band by default again
Posted by Peter Xu 5 years, 11 months ago
On Wed, May 30, 2018 at 04:04:58PM +0800, Peter Xu wrote:
> On Tue, May 22, 2018 at 02:40:26PM -0400, John Snow wrote:
> > 
> > 
> > On 05/21/2018 10:13 AM, Eric Blake wrote:
> > > On 05/21/2018 03:42 AM, Peter Xu wrote:
> > >> We turned Out-Of-Band feature of monitors off for 2.12 release.  Now we
> > >> try to turn that on again.
> > > 
> > > "try to turn" sounds weak, like you aren't sure of this patch.  If you
> > > aren't sure, then why should we feel safe in applying it?  This text is
> > > going in the permanent git history, so sound bold, rather than hesitant!
> > > 
> > > "We have resolved the issues from last time (commit 3fd2457d reverted by
> > > commit a4f90923):
> > > - issue 1 ...
> > > - issue 2 ...
> > > So now we are ready to enable advertisement of the feature by default"
> > > 
> > > with better descriptions of the issues that you fixed (I can think of at
> > > least the fixes adding thread-safety to the current monitor, and fixing
> > > early use of the monitor before qmp_capabilities completes; there may
> > > also be other issues that you want to call out).
> > > 
> > >>
> > >> Signed-off-by: Peter Xu <peterx@redhat.com>
> > >> -- 
> > >> Now OOB should be okay with all known tests (except iotest qcow2, since
> > >> it is still broken on master),
> > > 
> > > Which tests are still failing for you?  Ideally, you can still
> > > demonstrate that the tests not failing without this patch continue to
> > > pass with this patch, even if you call out the tests that have known
> > > issues to still be resolved.
> > > 
> > 
> > Probably 91 and 169. If any others fail that's news to me.
> 
> I just gave it a shot on my workstation too (./check -qcow2):
> 
> Not run: 045 059 064 070 075 076 077 078 081 083 084 088 092 093 094 101 106 109 113 116 119 123 128 131 135 136 146 148 149 160 162 171 173 175 199 207 210 3
> Failures: 087 188 189 198 206
> Failed 5 of 167 tests
> 
> I'm testing against master, e609fa7.

Hmm... I ran again the same test and the same master commit but this
time it passed all 167 tests on my laptop.  So I assume the previous 5
failures are unreproducable at least every time (or there might be
something wrong with my testbed).  I'll rerun some more times, and
when I post the OOB patch I'll cover all correct qcow2 tests.

Regards,

-- 
Peter Xu