[Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread

Peter Xu posted 1 patch 5 years, 9 months ago
Patches applied successfully
git fetch https://github.com/patchew-project/qemu tags/patchew/20180620071040.28729-1-peterx@redhat.com
Test checkpatch passed
Test docker-mingw@fedora passed
Test docker-quick@centos7 passed
Test s390x failed
[Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Peter Xu 5 years, 9 months ago
After the Out-Of-Band work, the monitor iothread may be accessing the
cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
cur_mon variable to be a per-thread variable to make sure there won't be
a race between threads when accessing the variable.

Note that thread variables are not initialized to a valid value when a new
thread is created.  However, in our case we don't need to set it up,
since the cur_mon variable is only used in the following pattern:

  old_mon = cur_mon;
  cur_mon = xxx;
  (do something, read cur_mon if necessary in the stack)
  cur_mon = old_mon;

It plays the role of a stack variable, so it does not need to be
initialized at all.  We only need to make sure the variable won't be
changed unexpectedly by other threads.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[peterx: touch up commit message a bit]
Signed-off-by: Peter Xu <peterx@redhat.com>
---
v4:
- touch up the commit message since now we have OOB command already
v3:
- fix code style warning from patchew
v2:
- drop qemu-thread changes
---
 include/monitor/monitor.h | 2 +-
 monitor.c                 | 2 +-
 stubs/monitor.c           | 2 +-
 tests/test-util-sockets.c | 2 +-
 4 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index d6ab70cae2..2ef5e04b37 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -6,7 +6,7 @@
 #include "qapi/qapi-types-misc.h"
 #include "qemu/readline.h"
 
-extern Monitor *cur_mon;
+extern __thread Monitor *cur_mon;
 
 /* flags for monitor_init */
 /* 0x01 unused */
diff --git a/monitor.c b/monitor.c
index 885e000f9b..94365a8ae6 100644
--- a/monitor.c
+++ b/monitor.c
@@ -282,7 +282,7 @@ static mon_cmd_t info_cmds[];
 
 QmpCommandList qmp_commands, qmp_cap_negotiation_commands;
 
-Monitor *cur_mon;
+__thread Monitor *cur_mon;
 
 static void monitor_command_cb(void *opaque, const char *cmdline,
                                void *readline_opaque);
diff --git a/stubs/monitor.c b/stubs/monitor.c
index e018c8f594..3890771bb5 100644
--- a/stubs/monitor.c
+++ b/stubs/monitor.c
@@ -3,7 +3,7 @@
 #include "qemu-common.h"
 #include "monitor/monitor.h"
 
-Monitor *cur_mon = NULL;
+__thread Monitor *cur_mon;
 
 int monitor_get_fd(Monitor *mon, const char *name, Error **errp)
 {
diff --git a/tests/test-util-sockets.c b/tests/test-util-sockets.c
index acadd85e8f..6195a3ac36 100644
--- a/tests/test-util-sockets.c
+++ b/tests/test-util-sockets.c
@@ -69,7 +69,7 @@ int monitor_get_fd(Monitor *mon, const char *fdname, Error **errp)
  * stubs/monitor.c is defined, to make sure monitor.o is discarded
  * otherwise we get duplicate syms at link time.
  */
-Monitor *cur_mon;
+__thread Monitor *cur_mon;
 void monitor_init(Chardev *chr, int flags) {}
 
 
-- 
2.17.1


Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by no-reply@patchew.org 5 years, 9 months ago
Hi,

This series failed build test on s390x host. Please find the details below.

N/A. Internal error while reading log file



---
Email generated automatically by Patchew [http://patchew.org/].
Please send your feedback to patchew-devel@redhat.com
Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Markus Armbruster 5 years, 8 months ago
Peter Xu <peterx@redhat.com> writes:

> After the Out-Of-Band work, the monitor iothread may be accessing the
> cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
> cur_mon variable to be a per-thread variable to make sure there won't be
> a race between threads when accessing the variable.

Hmm... why hasn't the OOB work created such a race already?

A monitor reads, parses, dispatches and executes commands, formats and
sends replies.

Before OOB, all of that ran in the main thread.  Any access of cur_mon
should therefore be from the main thread.  No races.

OOB moves read, parse, format and send to an I/O thread.  Dispatch and
execute remain in the main thread.  *Except* for commands executed OOB,
dispatch and execute move to the I/O thread, too.

Why is this not racy?  I guess it relies on careful non-use of cur_mon
in any part that may now execute in the I/O thread.  Scary...

Should this go into 3.0 to reduce the risk of bugs?

> Note that thread variables are not initialized to a valid value when new
> thread is created.  However for our case we don't need to set it up,
> since the cur_mon variable is only used in such a pattern:
>
>   old_mon = cur_mon;
>   cur_mon = xxx;
>   (do something, read cur_mon if necessary in the stack)
>   cur_mon = old_mon;
>
> It plays a role as stack variable, so no need to be initialized at all.
> We only need to make sure the variable won't be changed unexpectedly by
> other threads.
>
> Reviewed-by: Eric Blake <eblake@redhat.com>
> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> [peterx: touch up commit message a bit]
> Signed-off-by: Peter Xu <peterx@redhat.com>

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Peter Xu 5 years, 8 months ago
On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > After the Out-Of-Band work, the monitor iothread may be accessing the
> > cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
> > cur_mon variable to be a per-thread variable to make sure there won't be
> > a race between threads when accessing the variable.
> 
> Hmm... why hasn't the OOB work created such a race already?
> 
> A monitor reads, parses, dispatches and executes commands, formats and
> sends replies.
> 
> Before OOB, all of that ran in the main thread.  Any access of cur_mon
> should therefore be from the main thread.  No races.
> 
> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
> execute remain in the main thread.  *Except* for commands executed OOB,
> dispatch and execute move to the I/O thread, too.
> 
> Why is this not racy?  I guess it relies on careful non-use of cur_mon
> in any part that may now execute in the I/O thread.  Scary...

I think it's because cur_mon is not really used when executing
out-of-band commands: we only have a few out-of-band enabled commands
now, and IIUC none of them uses cur_mon (for example,
qmp_migrate_recover() doesn't even call error_report, and the code
path is straightforward enough to verify that).  So IIUC the cur_mon
variable is still only touched by the main thread for now, hence we
should be safe.  However, that condition might change in the future
when we add more out-of-band capable commands.

(not to mention that I don't even know whether there are real users of
 out-of-band if we haven't yet started to support that for libvirt...)

> 
> Should this go into 3.0 to reduce the risk of bugs?

Yes, I think it would be good to have that even for 3.0, since it can
still be seen as a bug fix for existing code.

Regards,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Markus Armbruster 5 years, 8 months ago
Peter Xu <peterx@redhat.com> writes:

> On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > After the Out-Of-Band work, the monitor iothread may be accessing the
>> > cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
>> > cur_mon variable to be a per-thread variable to make sure there won't be
>> > a race between threads when accessing the variable.
>> 
>> Hmm... why hasn't the OOB work created such a race already?
>> 
>> A monitor reads, parses, dispatches and executes commands, formats and
>> sends replies.
>> 
>> Before OOB, all of that ran in the main thread.  Any access of cur_mon
>> should therefore be from the main thread.  No races.
>> 
>> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
>> execute remain in the main thread.  *Except* for commands executed OOB,
>> dispatch and execute move to the I/O thread, too.
>> 
>> Why is this not racy?  I guess it relies on careful non-use of cur_mon
>> in any part that may now execute in the I/O thread.  Scary...
>
> I think it's because cur_mon is not really used in out-of-band command
> executions - now we only have a few out-of-band enabled commands, and
> IIUC none of them is using cur_mon (for example, in
> qmp_migrate_recover() we don't even call error_report, and the code
> path is quite straight forward to make sure of that).  So IIUC cur_mon
> variable is still only touched by main thread for now hence we should
> be safe.  However that condition might change in the future when we
> add more out-of-band capable commands.
>
> (not to mention that I don't even know whether there are real users of
>  out-of-band if we haven't yet started to support that for libvirt...)

It's not just the actual OOB commands (there are just two), it's also
the monitor code to read, parse, format and send.

>> Should this go into 3.0 to reduce the risk of bugs?
>
> Yes I think it would be good to have that even for 3.0, since it still
> can be seen as a bug fix of existing code.

Agreed.

> Regards,
>
>> > Note that thread variables are not initialized to a valid value when new
>> > thread is created.

Confusing.  It sounds like @cur_mon's initial value would be
indeterminate, like an automatic variable's.  Not true.  Variables with
thread storage duration are initialized when the thread is created.
Since @cur_mon's declaration lacks an initializer, it'll be initialized
to a null pointer.  Your sentence is correct when you consider that null
pointer not a valid value.
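
To make the initialization rule concrete, here is a minimal sketch,
assuming GCC/Clang's __thread and POSIX threads.  The names tls_ptr,
tls_count and fresh_thread_sees_initial_values() are made up for
illustration and are not QEMU code.  It shows that a new thread starts
from the declared initial values (a null pointer when there is no
initializer), regardless of what the creating thread stored in its own
copies:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical thread-local variables, not taken from QEMU. */
static __thread void *tls_ptr;      /* no initializer: starts as a null pointer */
static __thread int tls_count = 7;  /* explicit initializer: starts as 7 */

struct seen {
    void *ptr;
    int count;
};

/* Runs in the new thread and records the values it starts with. */
static void *record_initial_values(void *opaque)
{
    struct seen *s = opaque;

    s->ptr = tls_ptr;
    s->count = tls_count;
    return NULL;
}

/*
 * Returns 1 if a freshly created thread sees the initial values
 * (NULL and 7) even though the calling thread changed its own copies,
 * and 0 otherwise.
 */
int fresh_thread_sees_initial_values(void)
{
    static int dummy;
    struct seen s = { .ptr = &dummy, .count = -1 };
    pthread_t tid;
    int rc;

    /* Change the calling thread's copies before creating the thread. */
    tls_ptr = &dummy;
    tls_count = 42;

    rc = pthread_create(&tid, NULL, record_initial_values, &s);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    return s.ptr == NULL && s.count == 7;
}
```

Calling fresh_thread_sees_initial_values() from a main() built with
-pthread is enough to try it out.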

>> >                     However for our case we don't need to set it up,
>> > since the cur_mon variable is only used in such a pattern:
>> > 
>> >   old_mon = cur_mon;
>> >   cur_mon = xxx;
>> >   (do something, read cur_mon if necessary in the stack)
>> >   cur_mon = old_mon;
>> > 
>> > It plays a role as stack variable, so no need to be initialized at all.
>> > We only need to make sure the variable won't be changed unexpectedly by
>> > other threads.

Do we need this paragraph?  The commit doesn't mess with @cur_mon's
initial value at all...

>> > Reviewed-by: Eric Blake <eblake@redhat.com>
>> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> > [peterx: touch up commit message a bit]
>> > Signed-off-by: Peter Xu <peterx@redhat.com>

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Peter Xu 5 years, 8 months ago
On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
> >> > cur_mon as well (via monitor_qmp_dispatch_one()).  Let's convert the
> >> > cur_mon variable to be a per-thread variable to make sure there won't be
> >> > a race between threads when accessing the variable.
> >> 
> >> Hmm... why hasn't the OOB work created such a race already?
> >> 
> >> A monitor reads, parses, dispatches and executes commands, formats and
> >> sends replies.
> >> 
> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
> >> should therefore be from the main thread.  No races.
> >> 
> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
> >> execute remain in the main thread.  *Except* for commands executed OOB,
> >> dispatch and execute move to the I/O thread, too.
> >> 
> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
> >> in any part that may now execute in the I/O thread.  Scary...
> >
> > I think it's because cur_mon is not really used in out-of-band command
> > executions - now we only have a few out-of-band enabled commands, and
> > IIUC none of them is using cur_mon (for example, in
> > qmp_migrate_recover() we don't even call error_report, and the code
> > path is quite straight forward to make sure of that).  So IIUC cur_mon
> > variable is still only touched by main thread for now hence we should
> > be safe.  However that condition might change in the future when we
> > add more out-of-band capable commands.
> >
> > (not to mention that I don't even know whether there are real users of
> >  out-of-band if we haven't yet started to support that for libvirt...)
> 
> It's not just the actual OOB commands (there are just two), it's also
> the monitor code to read, parse, format and send.

My understanding is that read, parse, format and send do not touch
cur_mon (they did before, but some patches in the out-of-band
series should have removed the last users in the parser).  So IIUC only
the dispatcher touches it now.  I'm only considering the monitor code
here, not callers like net_init_socket() (and those callers should
only run in the main thread anyway).

> 
> >> Should this go into 3.0 to reduce the risk of bugs?
> >
> > Yes I think it would be good to have that even for 3.0, since it still
> > can be seen as a bug fix of existing code.
> 
> Agreed.
> 
> > Regards,
> >
> >> > Note that thread variables are not initialized to a valid value when new
> >> > thread is created.
> 
> Confusing.  It sounds like @cur_mon's initial value would be
> indeterminate, like an automatic variable's.  Not true.  Variables with
> thread storage duration are initialized when the thread is created.
> Since @cur_mon's declaration lacks an initializer, it'll be initialized
> to a null pointer.  Your sentence is correct when you consider that null
> pointer not a valid value.

Yes that's what I meant.  So how about this?

  Note that the per-thread @cur_mon variable is not initialized to
  point to a valid Monitor struct when a new thread is created (the
  default value will be NULL).

Please feel free to tune it up.

> 
> >> >                     However for our case we don't need to set it up,
> >> > since the cur_mon variable is only used in such a pattern:
> >> > 
> >> >   old_mon = cur_mon;
> >> >   cur_mon = xxx;
> >> >   (do something, read cur_mon if necessary in the stack)

[1]

> >> >   cur_mon = old_mon;
> >> > 
> >> > It plays a role as stack variable, so no need to be initialized at all.
> >> > We only need to make sure the variable won't be changed unexpectedly by
> >> > other threads.
> 
> Do we need this paragraph?  The commit doesn't mess with @cur_mon's
> initial value at all...

I was trying to explain why we don't need to initialize that variable
for each thread.  A common idea (at least that's what I had in
mind) is that when we create a new thread, it should inherit
the @cur_mon value in a copy-on-write fashion.
But that's not really necessary for a use case like the above (as long
as we don't create a thread during [1], and we don't).

If you think the patch explains itself better without these lines,
please feel free to drop it.

> 
> >> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> > [peterx: touch up commit message a bit]
> >> > Signed-off-by: Peter Xu <peterx@redhat.com>

Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Markus Armbruster 5 years, 8 months ago
Peter Xu <peterx@redhat.com> writes:

> On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
>> >> > cur_mon as well (via monitor_qmp_dispatch_one()).

Since renamed to monitor_qmp_dispatch().

Further down, we concluded that cur_mon isn't actually used from the I/O
thread, didn't we?

>> >> >                                                    Let's convert the
>> >> > cur_mon variable to be a per-thread variable to make sure there won't be
>> >> > a race between threads when accessing the variable.
>> >> 
>> >> Hmm... why hasn't the OOB work created such a race already?
>> >> 
>> >> A monitor reads, parses, dispatches and executes commands, formats and
>> >> sends replies.
>> >> 
>> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
>> >> should therefore be from the main thread.  No races.
>> >> 
>> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
>> >> execute remain in the main thread.  *Except* for commands executed OOB,
>> >> dispatch and execute move to the I/O thread, too.
>> >> 
>> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
>> >> in any part that may now execute in the I/O thread.  Scary...
>> >
>> > I think it's because cur_mon is not really used in out-of-band command
>> > executions - now we only have a few out-of-band enabled commands, and
>> > IIUC none of them is using cur_mon (for example, in
>> > qmp_migrate_recover() we don't even call error_report, and the code
>> > path is quite straight forward to make sure of that).  So IIUC cur_mon
>> > variable is still only touched by main thread for now hence we should
>> > be safe.  However that condition might change in the future when we
>> > add more out-of-band capable commands.
>> >
>> > (not to mention that I don't even know whether there are real users of
>> >  out-of-band if we haven't yet started to support that for libvirt...)
>> 
>> It's not just the actual OOB commands (there are just two), it's also
>> the monitor code to read, parse, format and send.
>
> My understanding is that read, parse, format, send will not touch
> cur_mon (it was touched before but some patches in the out-of-band
> series should have removed the last users when parsing).  So IIUC only
> the dispatcher would touch that now.  I didn't consider the callers
> like net_init_socket() and I'm only considering the monitor code (and
> those callers should be only in the main thread too after all).

There *is* cur_mon use outside dispatch & execute, e.g.

    void error_vprintf(const char *fmt, va_list ap)
    {
        if (cur_mon && !monitor_cur_is_qmp()) {
            monitor_vprintf(cur_mon, fmt, ap);
        } else {
            vfprintf(stderr, fmt, ap);
        }
    }

Obviously unsafe to use outside the main thread.  Consider:

    bool monitor_cur_is_qmp(void)
    {
        return cur_mon && monitor_is_qmp(cur_mon);
    }

    static inline bool monitor_is_qmp(const Monitor *mon)
    {
        return (mon->flags & MONITOR_USE_CONTROL);
    }

If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
do), this crashes when the main thread sets cur_mon back to null in
between.
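
The interleaving can be replayed deterministically in a single thread.
The following is only a sketch: Monitor is a stand-in struct, the flag
value is illustrative, and the between_reads hook is a hypothetical
device that plays the part of the main thread running between the two
reads.  Instead of faulting, the second read reports the problem
through a flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal stand-ins for the QEMU types; the flag value is illustrative. */
#define MONITOR_USE_CONTROL 0x04

typedef struct Monitor {
    int flags;
} Monitor;

/* A plain global shared by all threads, as before the patch. */
static Monitor *shared_cur_mon;

/* Hypothetical hook standing in for "another thread runs here". */
static void (*between_reads)(void);

static void main_thread_restores_null(void)
{
    shared_cur_mon = NULL;
}

/*
 * monitor_cur_is_qmp() with its two reads of cur_mon spelled out.
 * Instead of dereferencing a null pointer, the second read reports the
 * problem through *would_crash.
 */
bool cur_is_qmp_two_reads(bool *would_crash)
{
    Monitor *first = shared_cur_mon;    /* read 1: the null check */

    *would_crash = false;
    if (!first) {
        return false;
    }
    if (between_reads) {
        between_reads();                /* the other thread gets to run */
    }

    Monitor *second = shared_cur_mon;   /* read 2: the dereference */

    if (!second) {
        *would_crash = true;            /* real code would fault here */
        return false;
    }
    return second->flags & MONITOR_USE_CONTROL;
}

/* Replays the bad interleaving; returns true if it would have crashed. */
bool replay_bad_interleaving(void)
{
    Monitor mon = { .flags = MONITOR_USE_CONTROL };
    bool would_crash;

    assert(shared_cur_mon == NULL);     /* start from the idle state */
    shared_cur_mon = &mon;
    between_reads = main_thread_restores_null;
    cur_is_qmp_two_reads(&would_crash);

    between_reads = NULL;
    shared_cur_mon = NULL;
    return would_crash;
}
```

With a per-thread variable there is no other writer, so the two reads
cannot disagree.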

Did the OOB work make things any worse?  Let's see.

@cur_mon is null unless the main thread is running monitor code, either
HMP within monitor_read():

    cur_mon = opaque;

    if (cur_mon->rs) {
        for (i = 0; i < size; i++)
            readline_handle_byte(cur_mon->rs, buf[i]);
    } else {
        if (size == 0 || buf[size - 1] != 0)
            monitor_printf(cur_mon, "corrupted command\n");
        else
            handle_hmp_command(cur_mon, (char *)buf);
    }

    cur_mon = old_mon;

or QMP within monitor_qmp_dispatch():

    old_mon = cur_mon;
    cur_mon = mon;

    rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));

    cur_mon = old_mon;

In both cases, old_mon is always null.

Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
deeper for QMP", we ran more code for QMP with cur_mon set, namely the
JSON parser, but that doesn't matter here.

More fine print: there's also qmp_human_monitor_command(), which stacks
an HMP monitor on top of the QMP monitor.  Also doesn't matter here.
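
The save/restore pattern and the stacking case can be sketched as
follows.  This is an illustration only: Monitor is a stand-in struct,
and with_cur_mon(), qmp_command() and hmp_command() are hypothetical
helpers, not functions from monitor.c.  The point is that each level
restores whatever it found, so nesting unwinds back to null:

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for QEMU's Monitor; only identity matters here. */
typedef struct Monitor {
    const char *name;
} Monitor;

static __thread Monitor *cur_mon;

typedef void (*MonitorFn)(void *opaque);

/* Hypothetical helper: run fn() with cur_mon set to mon, then restore. */
void with_cur_mon(Monitor *mon, MonitorFn fn, void *opaque)
{
    Monitor *old_mon = cur_mon;

    cur_mon = mon;
    fn(opaque);
    cur_mon = old_mon;
}

/* Records what cur_mon looked like at each step. */
struct trace {
    Monitor *hmp;           /* monitor to stack on top */
    Monitor *outer_before;
    Monitor *inner;
    Monitor *outer_after;
};

static void hmp_command(void *opaque)
{
    struct trace *t = opaque;

    t->inner = cur_mon;
}

/* A QMP command that stacks an HMP monitor, like human-monitor-command. */
static void qmp_command(void *opaque)
{
    struct trace *t = opaque;

    t->outer_before = cur_mon;
    with_cur_mon(t->hmp, hmp_command, t);
    t->outer_after = cur_mon;
}

/* Returns 1 if every level saw its own monitor and null is restored. */
int nesting_restores_cur_mon(void)
{
    Monitor qmp = { "qmp" }, hmp = { "hmp" };
    struct trace t = { .hmp = &hmp };

    assert(cur_mon == NULL);    /* idle: no monitor code is running */
    with_cur_mon(&qmp, qmp_command, &t);

    return t.outer_before == &qmp && t.inner == &hmp &&
           t.outer_after == &qmp && cur_mon == NULL;
}
```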

The OOB work doesn't add any new races as long as

* it doesn't add assignments to @cur_mon, and

* none of the code it moves out of the main thread accesses @cur_mon.

The first condition obviously holds.  The second one isn't obvious, but
I figure it holds, too.

Okay, I think I've convinced myself the OOB work didn't add
cur_mon-related races.

>> >> Should this go into 3.0 to reduce the risk of bugs?
>> >
>> > Yes I think it would be good to have that even for 3.0, since it still
>> > can be seen as a bug fix of existing code.
>> 
>> Agreed.
>> 
>> > Regards,
>> >
>> >> > Note that thread variables are not initialized to a valid value when new
>> >> > thread is created.
>> 
>> Confusing.  It sounds like @cur_mon's initial value would be
>> indeterminate, like an automatic variable's.  Not true.  Variables with
>> thread storage duration are initialized when the thread is created.
>> Since @cur_mon's declaration lacks an initializer, it'll be initialized
>> to a null pointer.  Your sentence is correct when you consider that null
>> pointer not a valid value.
>
> Yes that's what I meant.  So how about this?
>
>   Note that the per-thread @cur_mon variable is not initialized to
>   point to a valid Monitor struct when a new thread is created (the
>   default value will be NULL).
>
> Please feel free to tune it up.

I think what the patch really changes is the value of @cur_mon outside
the main thread: it remains null there now.  Before, it depended on what
the main thread was doing, and therefore could not be used safely.

In other words, the patch makes uses of @cur_mon like the one in
error_vprintf() shown above safe.
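
A rough sketch of that effect, assuming POSIX threads and GCC/Clang's
__thread.  Monitor is a stand-in struct, the flag value is
illustrative, and error_sink() is a hypothetical function that makes
the same decision as error_vprintf() without producing output.  While
the main thread has cur_mon set to an HMP monitor, a second thread
still sees null and falls back to stderr:

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MONITOR_USE_CONTROL 0x04    /* illustrative value */

/* Stand-in for QEMU's Monitor. */
typedef struct Monitor {
    int flags;
} Monitor;

static __thread Monitor *cur_mon;

enum sink { SINK_STDERR, SINK_MONITOR };

/* Same decision error_vprintf() makes, without doing any output. */
enum sink error_sink(void)
{
    Monitor *mon = cur_mon;     /* only this thread writes its copy */

    if (mon && !(mon->flags & MONITOR_USE_CONTROL)) {
        return SINK_MONITOR;
    }
    return SINK_STDERR;
}

/* Runs in the second thread and records where its errors would go. */
static void *worker(void *opaque)
{
    enum sink *result = opaque;

    *result = error_sink();
    return NULL;
}

/*
 * Returns 1 if the calling thread routes errors to its monitor while
 * a concurrently created thread routes them to stderr.
 */
int worker_falls_back_to_stderr(void)
{
    Monitor hmp = { .flags = 0 };   /* HMP: control flag not set */
    Monitor *old_mon = cur_mon;
    enum sink in_main, in_worker = SINK_MONITOR;
    pthread_t tid;
    int rc;

    cur_mon = &hmp;
    in_main = error_sink();

    rc = pthread_create(&tid, NULL, worker, &in_worker);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    cur_mon = old_mon;
    return in_main == SINK_MONITOR && in_worker == SINK_STDERR;
}
```

With a plain global instead of __thread, the worker would have seen
the main thread's monitor here.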

I think that's what we should explain in the commit message.  I can try
rewriting it, but right now I got to run.

>> 
>> >> >                     However for our case we don't need to set it up,
>> >> > since the cur_mon variable is only used in such a pattern:
>> >> > 
>> >> >   old_mon = cur_mon;
>> >> >   cur_mon = xxx;
>> >> >   (do something, read cur_mon if necessary in the stack)
>
> [1]
>
>> >> >   cur_mon = old_mon;
>> >> > 
>> >> > It plays a role as stack variable, so no need to be initialized at all.
>> >> > We only need to make sure the variable won't be changed unexpectedly by
>> >> > other threads.
>> 
>> Do we need this paragraph?  The commit doesn't mess with @cur_mon's
>> initial value at all...
>
> I was trying to explain why we don't need to initialize that variable
> for each thread.  A common idea (at least that's what I have had in
> mind) is that when we create a new thread we should possibly inherit
> that @cur_mon variable in a copy-on-write fashion for that new thread.
> But that's not really necessary for the use case like above (as long
> as we don't create thread during [1], and that's what we do).
>
> If you think the patch explains itself better without these lines,
> please feel free to drop it.
>
>> 
>> >> > Reviewed-by: Eric Blake <eblake@redhat.com>
>> >> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
>> >> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
>> >> > [peterx: touch up commit message a bit]
>> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
>
> Thanks,

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Peter Xu 5 years, 8 months ago
On Thu, Jul 19, 2018 at 11:05:34AM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> 
> >> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
> >> >> > cur_mon as well (via monitor_qmp_dispatch_one()).
> 
> Since renamed to monitor_qmp_dispatch().

True.

> 
> Further down, we concluded that cur_mon isn't actually used from the I/O
> thread, didn't we?

I think so; if not we should either fix it or apply this patch. :)

> 
> >> >> >                                                    Let's convert the
> >> >> > cur_mon variable to be a per-thread variable to make sure there won't be
> >> >> > a race between threads when accessing the variable.
> >> >> 
> >> >> Hmm... why hasn't the OOB work created such a race already?
> >> >> 
> >> >> A monitor reads, parses, dispatches and executes commands, formats and
> >> >> sends replies.
> >> >> 
> >> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
> >> >> should therefore be from the main thread.  No races.
> >> >> 
> >> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
> >> >> execute remain in the main thread.  *Except* for commands executed OOB,
> >> >> dispatch and execute move to the I/O thread, too.
> >> >> 
> >> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
> >> >> in any part that may now execute in the I/O thread.  Scary...
> >> >
> >> > I think it's because cur_mon is not really used in out-of-band command
> >> > executions - now we only have a few out-of-band enabled commands, and
> >> > IIUC none of them is using cur_mon (for example, in
> >> > qmp_migrate_recover() we don't even call error_report, and the code
> >> > path is quite straight forward to make sure of that).  So IIUC cur_mon
> >> > variable is still only touched by main thread for now hence we should
> >> > be safe.  However that condition might change in the future when we
> >> > add more out-of-band capable commands.
> >> >
> >> > (not to mention that I don't even know whether there are real users of
> >> >  out-of-band if we haven't yet started to support that for libvirt...)
> >> 
> >> It's not just the actual OOB commands (there are just two), it's also
> >> the monitor code to read, parse, format and send.
> >
> > My understanding is that read, parse, format, send will not touch
> > cur_mon (it was touched before but some patches in the out-of-band
> > series should have removed the last users when parsing).  So IIUC only
> > the dispatcher would touch that now.  I didn't consider the callers
> > like net_init_socket() and I'm only considering the monitor code (and
> > those callers should be only in the main thread too after all).
> 
> There *is* cur_mon use outside dispatch & execute, e.g.
> 
>     void error_vprintf(const char *fmt, va_list ap)
>     {
>         if (cur_mon && !monitor_cur_is_qmp()) {
>             monitor_vprintf(cur_mon, fmt, ap);
>         } else {
>             vfprintf(stderr, fmt, ap);
>         }
>     }
> 
> Obviously unsafe to use outside the main thread.  Consider:
> 
>     bool monitor_cur_is_qmp(void)
>     {
>         return cur_mon && monitor_is_qmp(cur_mon);
>     }
> 
>     static inline bool monitor_is_qmp(const Monitor *mon)
>     {
>         return (mon->flags & MONITOR_USE_CONTROL);
>     }
> 
> If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
> do), this crashes when the main thread sets cur_mon back to null in
> between.

Yes, but I thought we should not even use error_vprintf() or its
sister functions outside the QMP handlers.  For example, parsers
should always use error_setg() or something similar, never
error_report().

> 
> Did the OOB work make things any worse?  Let's see.
> 
> @cur_mon is null unless the main thread is running monitor code, either
> HMP within monitor_read():
> 
>     cur_mon = opaque;
> 
>     if (cur_mon->rs) {
>         for (i = 0; i < size; i++)
>             readline_handle_byte(cur_mon->rs, buf[i]);
>     } else {
>         if (size == 0 || buf[size - 1] != 0)
>             monitor_printf(cur_mon, "corrupted command\n");
>         else
>             handle_hmp_command(cur_mon, (char *)buf);
>     }
> 
>     cur_mon = old_mon;
> 
> or QMP within monitor_qmp_dispatch():
> 
>     old_mon = cur_mon;
>     cur_mon = mon;
> 
>     rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));
> 
>     cur_mon = old_mon;
> 
> In both cases, old_mon is always null.
> 
> Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
> deeper for QMP", we ran more code for QMP with cur_mon set, namely the
> JSON parser, but that doesn't matter here.
> 
> More fine print: there's also qmp_human_monitor_command(), which stacks
> an HMP monitor on top of the QMP monitor.  Also doesn't matter here.
> 
> The OOB work doesn't add any new races as long as
> 
> * it doesn't add assignments to @cur_mon, and
> 
> * none of the code it moves out of the main thread accesses @cur_mon.
> 
> The first condition obviously holds.  The second one isn't obvious, but
> I figure it holds, too.
> 
> Okay, I think I've convinced myself the OOB work didn't add
> cur_mon-related races.

Hopefully, yes.  Thanks for the double check.

> 
> >> >> Should this go into 3.0 to reduce the risk of bugs?
> >> >
> >> > Yes I think it would be good to have that even for 3.0, since it still
> >> > can be seen as a bug fix of existing code.
> >> 
> >> Agreed.
> >> 
> >> > Regards,
> >> >
> >> >> > Note that thread variables are not initialized to a valid value when new
> >> >> > thread is created.
> >> 
> >> Confusing.  It sounds like @cur_mon's initial value would be
> >> indeterminate, like an automatic variable's.  Not true.  Variables with
> >> thread storage duration are initialized when the thread is created.
> >> Since @cur_mon's declaration lacks an initializer, it'll be initialized
> >> to a null pointer.  Your sentence is correct when you consider that null
> >> pointer not a valid value.
> >
> > Yes that's what I meant.  So how about this?
> >
> >   Note that the per-thread @cur_mon variable is not initialized to
> >   point to a valid Monitor struct when a new thread is created (the
> >   default value will be NULL).
> >
> > Please feel free to tune it up.
> 
> I think what the patch really changes is the value of @cur_mon outside
> the main thread: it remains null there now.  Before, it depended on what
> the main thread was doing, and therefore could not be used safely.
> 
> In other words, the patch makes uses of @cur_mon like the one in
> error_vprintf() shown above safe.
> 
> I think that's what we should explain in the commit message.  I can try
> rewriting it,

I'd appreciate that.

> but right now I got to run.

Must be lunch time! :)

Regards,

> 
> >> 
> >> >> >                     However for our case we don't need to set it up,
> >> >> > since the cur_mon variable is only used in such a pattern:
> >> >> > 
> >> >> >   old_mon = cur_mon;
> >> >> >   cur_mon = xxx;
> >> >> >   (do something, read cur_mon if necessary in the stack)
> >
> > [1]
> >
> >> >> >   cur_mon = old_mon;
> >> >> > 
> >> >> > It plays a role as stack variable, so no need to be initialized at all.
> >> >> > We only need to make sure the variable won't be changed unexpectedly by
> >> >> > other threads.
> >> 
> >> Do we need this paragraph?  The commit doesn't mess with @cur_mon's
> >> initial value at all...
> >
> > I was trying to explain why we don't need to initialize that variable
> > for each thread.  A common idea (at least that's what I have had in
> > mind) is that when we create a new thread we should possibly inherit
> > that @cur_mon variable in a copy-on-write fashion for that new thread.
> > But that's not really necessary for the use case like above (as long
> > as we don't create thread during [1], and that's what we do).
> >
> > If you think the patch explains itself better without these lines,
> > please feel free to drop it.
> >
> >> 
> >> >> > Reviewed-by: Eric Blake <eblake@redhat.com>
> >> >> > Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
> >> >> > Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
> >> >> > [peterx: touch up commit message a bit]
> >> >> > Signed-off-by: Peter Xu <peterx@redhat.com>
> >
> > Thanks,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Markus Armbruster 5 years, 8 months ago
Peter Xu <peterx@redhat.com> writes:

> On Thu, Jul 19, 2018 at 11:05:34AM +0200, Markus Armbruster wrote:
>> Peter Xu <peterx@redhat.com> writes:
>> 
>> > On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
>> >> Peter Xu <peterx@redhat.com> writes:
>> >> 
>> >> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
>> >> >> Peter Xu <peterx@redhat.com> writes:
>> >> >> 
>> >> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
>> >> >> > cur_mon as well (via monitor_qmp_dispatch_one()).
>> 
>> Since renamed to monitor_qmp_dispatch().
>
> True.
>
>> 
>> Further down, we concluded that cur_mon isn't actually used from the I/O
>> thread, didn't we?
>
> I think so; if not we should either fix it or apply this patch. :)
>
>> 
>> >> >> >                                                    Let's convert the
>> >> >> > cur_mon variable to be a per-thread variable to make sure there won't be
>> >> >> > a race between threads when accessing the variable.
>> >> >> 
>> >> >> Hmm... why hasn't the OOB work created such a race already?
>> >> >> 
>> >> >> A monitor reads, parses, dispatches and executes commands, formats and
>> >> >> sends replies.
>> >> >> 
>> >> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
>> >> >> should therefore be from the main thread.  No races.
>> >> >> 
>> >> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
>> >> >> execute remain in the main thread.  *Except* for commands executed OOB,
>> >> >> dispatch and execute move to the I/O thread, too.
>> >> >> 
>> >> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
>> >> >> in any part that may now execute in the I/O thread.  Scary...
>> >> >
>> >> > I think it's because cur_mon is not really used in out-of-band command
>> >> > executions - now we only have a few out-of-band enabled commands, and
>> >> > IIUC none of them is using cur_mon (for example, in
>> >> > qmp_migrate_recover() we don't even call error_report, and the code
>> >> > path is quite straightforward to make sure of that).  So IIUC cur_mon
>> >> > variable is still only touched by main thread for now hence we should
>> >> > be safe.  However that condition might change in the future when we
>> >> > add more out-of-band capable commands.
>> >> >
>> >> > (not to mention that I don't even know whether there are real users of
>> >> >  out-of-band if we haven't yet started to support that for libvirt...)
>> >> 
>> >> It's not just the actual OOB commands (there are just two), it's also
>> >> the monitor code to read, parse, format and send.
>> >
>> > My understanding is that read, parse, format, send will not touch
>> > cur_mon (it was touched before but some patches in the out-of-band
>> > series should have removed the last users when parsing).  So IIUC only
>> > the dispatcher would touch that now.  I didn't consider the callers
>> > like net_init_socket() and I'm only considering the monitor code (and
>> > those callers should be only in the main thread too after all).
>> 
>> There *is* cur_mon use outside dispatch & execute, e.g.
>> 
>>     void error_vprintf(const char *fmt, va_list ap)
>>     {
>>         if (cur_mon && !monitor_cur_is_qmp()) {
>>             monitor_vprintf(cur_mon, fmt, ap);
>>         } else {
>>             vfprintf(stderr, fmt, ap);
>>         }
>>     }
>> 
>> Obviously unsafe to use outside the main thread.  Consider:
>> 
>>     bool monitor_cur_is_qmp(void)
>>     {
>>         return cur_mon && monitor_is_qmp(cur_mon);
>>     }
>> 
>>     static inline bool monitor_is_qmp(const Monitor *mon)
>>     {
>>         return (mon->flags & MONITOR_USE_CONTROL);
>>     }
>> 
>> If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
>> do), this crashes when the main thread sets cur_mon back to null in
>> between.
>
> Yes, but I thought we should not even use these error_vprintf() or
> sister functions outside the QMP handlers, or at least that's what I
> thought.  For example, in parsers, we should always use error_setg()
> or something similar but never error_report().

error_report() & friends are for general use, not just for monitor
handlers.  They are for reporting errors to a human user.  error_setg()
is for passing errors to code.

>> Did the OOB work make things any worse?  Let's see.
>> 
>> @cur_mon is null unless the main thread is running monitor code, either
>> HMP within monitor_read():
>> 
>>     cur_mon = opaque;
>> 
>>     if (cur_mon->rs) {
>>         for (i = 0; i < size; i++)
>>             readline_handle_byte(cur_mon->rs, buf[i]);
>>     } else {
>>         if (size == 0 || buf[size - 1] != 0)
>>             monitor_printf(cur_mon, "corrupted command\n");
>>         else
>>             handle_hmp_command(cur_mon, (char *)buf);
>>     }
>> 
>>     cur_mon = old_mon;
>> 
>> or QMP within monitor_qmp_dispatch():
>> 
>>     old_mon = cur_mon;
>>     cur_mon = mon;
>> 
>>     rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));
>> 
>>     cur_mon = old_mon;
>> 
>> In both cases, old_mon is always null.
>> 
>> Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
>> deeper for QMP", we ran more code for QMP with cur_mon set, namely the
>> JSON parser, but that doesn't matter here.
>> 
>> More fine print: there's also qmp_human_monitor_command(), which stacks
>> an HMP monitor on top of the QMP monitor.  Also doesn't matter here.
>> 
>> The OOB work doesn't add any new races as long as
>> 
>> * it doesn't add assignments to @cur_mon, and
>> 
>> * none of the code it moves out of the main thread accesses @cur_mon.
>> 
>> The first condition obviously holds.  The second one isn't obvious, but
>> I figure it holds, too.
>> 
>> Okay, I think I've convinced myself the OOB work didn't add
>> cur_mon-related races.
>
> Hopefully, yes.  Thanks for the double check.

Wait!  I think I found one.

Say we have an HMP monitor running in the main thread (they always do),
and a QMP monitor running in @mon_iothread.

Now both threads write to @cur_mon, without synchronization!  The main
thread writes in monitor_read() when reading HMP input, and in
monitor_qmp_dispatch() when dispatching in-band QMP commands.
@mon_iothread writes in monitor_qmp_dispatch() when dispatching
out-of-band QMP commands.

Example:

    in main thread:

        chardev reads input, calls
            monitor_read()
                old_mon = cur_mon; // null
                cur_mon = opaque;  // HMP monitor
                ...

    in mon_iothread:

        chardev reads input, calls
            monitor_qmp_read(), calls
                handle_qmp_command() via JSON parser
                    calls monitor_qmp_dispatch() to execute OOB command
                        old_mon = cur_mon; // HMP monitor!
                        cur_mon = mon;     // QMP monitor
                        ...

    in main thread:
                ... HMP code runs with @cur_mon pointing to QMP monitor
                cur_mon = old_mon; // null

    in mon_iothread:
                        cur_mon = old_mon; // HMP monitor!

    all threads:
        non-monitor code runs with @cur_mon pointing to HMP monitor
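
To double-check the interleaving, here is a rough standalone sketch that
replays the steps one after the other in a single thread, with a shared
@cur_mon as before the patch.  This is not QEMU code: Monitor is a
hypothetical stand-in, and hmp, qmp and replay_interleaving() are names
made up for illustration.  A real race depends on scheduling; this only
replays one possible ordering.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Monitor; only identity matters here. */
typedef struct Monitor {
    const char *name;
} Monitor;

static Monitor hmp = { "hmp" };
static Monitor qmp = { "qmp" };

/* Shared between "threads", as before the patch. */
static Monitor *cur_mon;

/*
 * Replay the steps of the example in order.  Stores what the HMP code
 * observes in *seen_by_hmp_code and returns the final value of cur_mon.
 */
static Monitor *replay_interleaving(Monitor **seen_by_hmp_code)
{
    Monitor *main_old_mon, *io_old_mon;

    cur_mon = NULL;

    /* main thread: monitor_read() saves and sets */
    main_old_mon = cur_mon;         /* null */
    assert(main_old_mon == NULL);
    cur_mon = &hmp;                 /* HMP monitor */

    /* mon_iothread: monitor_qmp_dispatch() saves and sets */
    io_old_mon = cur_mon;           /* HMP monitor! */
    cur_mon = &qmp;                 /* QMP monitor */

    /* main thread: HMP code runs, then restores */
    *seen_by_hmp_code = cur_mon;    /* QMP monitor */
    cur_mon = main_old_mon;         /* null */

    /* mon_iothread: restores the stale value it saved */
    cur_mon = io_old_mon;           /* HMP monitor */

    return cur_mon;
}
```

With this ordering, the HMP code observes the QMP monitor, and @cur_mon
is left pointing to the HMP monitor after both "threads" are done.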

>> >> >> Should this go into 3.0 to reduce the risk of bugs?
>> >> >
>> >> > Yes I think it would be good to have that even for 3.0, since it still
>> >> > can be seen as a bug fix of existing code.
>> >> 
>> >> Agreed.
>> >> 
>> >> > Regards,
>> >> >
>> >> >> > Note that thread variables are not initialized to a valid value when new
>> >> >> > thread is created.
>> >> 
>> >> Confusing.  It sounds like @cur_mon's initial value would be
>> >> indeterminate, like an automatic variable's.  Not true.  Variables with
>> >> thread storage duration are initialized when the thread is created.
>> >> Since @cur_mon's declaration lacks an initializer, it'll be initialized
>> >> to a null pointer.  Your sentence is correct when you consider that null
>> >> pointer not a valid value.
>> >
>> > Yes that's what I meant.  So how about this?
>> >
>> >   Note that the per-thread @cur_mon variable is not initialized to
>> >   point to a valid Monitor struct when a new thread is created (the
>> >   default value will be NULL).
>> >
>> > Please feel free to tune it up.
>> 
>> I think what the patch really changes is the value of @cur_mon outside
>> the main thread: it remains null there now.  Before, it depended on what
>> the main thread was doing, and therefore could not be used safely.
>> 
>> In other words, the patch makes uses of @cur_mon like the one in
>> error_vprintf() shown above safe.
>> 
>> I think that's what we should explain in the commit message.  I can try
>> rewriting it,
>
> I'll appreciate that if so.

Here's my try:

    monitor: Fix unsafe sharing of @cur_mon among threads

    @cur_mon is null unless the main thread is running monitor code, either
    HMP code within monitor_read(), or QMP code within
    monitor_qmp_dispatch().

    Use of @cur_mon outside the main thread is therefore unsafe.

    Most of its uses are in monitor command handlers.  These run in the main
    thread.

    However, there are also uses hiding elsewhere, such as in
    error_vprintf(), and thus error_report(), making these functions unsafe
    outside the main thread.  No such unsafe uses are known at this time.
    Regardless, this is an unnecessary trap.  It's an ancient trap, though.

    More recently, commit cf869d53172 "qmp: support out-of-band (oob)
    execution" spiced things up: the monitor I/O thread assigns to @cur_mon
    when executing commands out-of-band.  Having two threads save, set and
    restore @cur_mon without synchronization is definitely unsafe.  We can
    end up with @cur_mon null while the main thread runs monitor code, or
    non-null while it runs non-monitor code.

    We could fix this by making the I/O thread not mess with @cur_mon, but
    that would leave the trap armed and ready.

    Instead, make @cur_mon thread-local.  It's now reliably null unless the
    thread is running monitor code.
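
The last paragraph can be illustrated with a small standalone sketch.
It is not QEMU code: Monitor is a hypothetical stand-in,
cur_mon_seen_by_new_thread() is a helper made up for illustration, and
it assumes GCC/Clang's __thread plus POSIX threads (C11 _Thread_local
should behave the same way).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Monitor. */
typedef struct Monitor {
    int id;
} Monitor;

/* Thread-local: no initializer, so each thread starts with null. */
static __thread Monitor *cur_mon;

/* Runs in the new thread and records the cur_mon that thread sees. */
static void *worker(void *arg)
{
    Monitor **out = arg;

    *out = cur_mon;
    return NULL;
}

/*
 * Set cur_mon in the calling thread using the usual save/set/restore
 * pattern, start a thread while it is set, and return the value of
 * cur_mon observed by that thread.
 */
static Monitor *cur_mon_seen_by_new_thread(Monitor *mon)
{
    Monitor *old_mon = cur_mon;
    Monitor *seen = mon;            /* overwritten by worker() */
    pthread_t tid;
    int rc;

    cur_mon = mon;

    rc = pthread_create(&tid, NULL, worker, &seen);
    assert(rc == 0);
    rc = pthread_join(tid, NULL);
    assert(rc == 0);

    cur_mon = old_mon;
    return seen;
}
```

Called with a non-null monitor, the helper should return null, because
the new thread gets its own zero-initialized copy of @cur_mon no matter
what the calling thread has stored in its copy.  Depending on the
toolchain, this may need to be built with -pthread.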
    
>> but right now I got to run.
>
> Must be lunch time! :)

Time to prepare lunch for the family, actually :)

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Peter Xu 5 years, 8 months ago
On Thu, Jul 19, 2018 at 02:14:56PM +0200, Markus Armbruster wrote:
> Peter Xu <peterx@redhat.com> writes:
> 
> > On Thu, Jul 19, 2018 at 11:05:34AM +0200, Markus Armbruster wrote:
> >> Peter Xu <peterx@redhat.com> writes:
> >> 
> >> > On Thu, Jul 19, 2018 at 09:20:34AM +0200, Markus Armbruster wrote:
> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> 
> >> >> > On Wed, Jul 18, 2018 at 05:38:11PM +0200, Markus Armbruster wrote:
> >> >> >> Peter Xu <peterx@redhat.com> writes:
> >> >> >> 
> >> >> >> > After the Out-Of-Band work, the monitor iothread may be accessing the
> >> >> >> > cur_mon as well (via monitor_qmp_dispatch_one()).
> >> 
> >> Since renamed to monitor_qmp_dispatch().
> >
> > True.
> >
> >> 
> >> Further down, we concluded that cur_mon isn't actually used from the I/O
> >> thread, didn't we?
> >
> > I think so; if not we should either fix it or apply this patch. :)
> >
> >> 
> >> >> >> >                                                    Let's convert the
> >> >> >> > cur_mon variable to be a per-thread variable to make sure there won't be
> >> >> >> > a race between threads when accessing the variable.
> >> >> >> 
> >> >> >> Hmm... why hasn't the OOB work created such a race already?
> >> >> >> 
> >> >> >> A monitor reads, parses, dispatches and executes commands, formats and
> >> >> >> sends replies.
> >> >> >> 
> >> >> >> Before OOB, all of that ran in the main thread.  Any access of cur_mon
> >> >> >> should therefore be from the main thread.  No races.
> >> >> >> 
> >> >> >> OOB moves read, parse, format and send to an I/O thread.  Dispatch and
> >> >> >> execute remain in the main thread.  *Except* for commands executed OOB,
> >> >> >> dispatch and execute move to the I/O thread, too.
> >> >> >> 
> >> >> >> Why is this not racy?  I guess it relies on careful non-use of cur_mon
> >> >> >> in any part that may now execute in the I/O thread.  Scary...
> >> >> >
> >> >> > I think it's because cur_mon is not really used in out-of-band command
> >> >> > executions - now we only have a few out-of-band enabled commands, and
> >> >> > IIUC none of them is using cur_mon (for example, in
> >> >> > qmp_migrate_recover() we don't even call error_report, and the code
> >> >> > path is quite straightforward to make sure of that).  So IIUC cur_mon
> >> >> > variable is still only touched by main thread for now hence we should
> >> >> > be safe.  However that condition might change in the future when we
> >> >> > add more out-of-band capable commands.
> >> >> >
> >> >> > (not to mention that I don't even know whether there are real users of
> >> >> >  out-of-band if we haven't yet started to support that for libvirt...)
> >> >> 
> >> >> It's not just the actual OOB commands (there are just two), it's also
> >> >> the monitor code to read, parse, format and send.
> >> >
> >> > My understanding is that read, parse, format, send will not touch
> >> > cur_mon (it was touched before but some patches in the out-of-band
> >> > series should have removed the last users when parsing).  So IIUC only
> >> > the dispatcher would touch that now.  I didn't consider the callers
> >> > like net_init_socket() and I'm only considering the monitor code (and
> >> > those callers should be only in the main thread too after all).
> >> 
> >> There *is* cur_mon use outside dispatch & execute, e.g.
> >> 
> >>     void error_vprintf(const char *fmt, va_list ap)
> >>     {
> >>         if (cur_mon && !monitor_cur_is_qmp()) {
> >>             monitor_vprintf(cur_mon, fmt, ap);
> >>         } else {
> >>             vfprintf(stderr, fmt, ap);
> >>         }
> >>     }
> >> 
> >> Obviously unsafe to use outside the main thread.  Consider:
> >> 
> >>     bool monitor_cur_is_qmp(void)
> >>     {
> >>         return cur_mon && monitor_is_qmp(cur_mon);
> >>     }
> >> 
> >>     static inline bool monitor_is_qmp(const Monitor *mon)
> >>     {
> >>         return (mon->flags & MONITOR_USE_CONTROL);
> >>     }
> >> 
> >> If monitor_cur_is_qmp() reads cur_mon twice (which it is entitled to
> >> do), this crashes when the main thread sets cur_mon back to null in
> >> between.
> >
> > Yes, but I thought we should not even use these error_vprintf() or
> > sister functions outside the QMP handlers, or at least that's what I
> > thought.  For example, in parsers, we should always use error_setg()
> > or something similar but never error_report().
> 
> error_report() & friends are for general use, not just for monitor
> handlers.  They are for reporting errors to a human user.  error_setg()
> is for passing errors to code.

Ah yes, error_report() will dump to stderr when @cur_mon is not set,
which I obviously forgot.
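
To spell out the routing as I understand it from error_vprintf()
quoted above, here is a rough sketch.  It is not the QEMU code: Monitor
is a hypothetical stand-in with a single flag, and ErrorSink and
error_sink_for() are names made up for illustration.  The caller passes
in one snapshot of @cur_mon, so the pointer is only read once.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QEMU's Monitor. */
typedef struct Monitor {
    bool is_qmp;
} Monitor;

typedef enum {
    SINK_STDERR,    /* no monitor, or a QMP monitor */
    SINK_MONITOR,   /* an HMP monitor */
} ErrorSink;

/*
 * Decide where a human-readable error message should go, given a
 * snapshot of the current monitor (may be null).
 */
static ErrorSink error_sink_for(const Monitor *mon)
{
    if (mon && !mon->is_qmp) {
        return SINK_MONITOR;
    }
    return SINK_STDERR;
}
```

With a thread-local @cur_mon, a thread that is not running monitor code
would always take the null case here and end up at stderr.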

> 
> >> Did the OOB work make things any worse?  Let's see.
> >> 
> >> @cur_mon is null unless the main thread is running monitor code, either
> >> HMP within monitor_read():
> >> 
> >>     cur_mon = opaque;
> >> 
> >>     if (cur_mon->rs) {
> >>         for (i = 0; i < size; i++)
> >>             readline_handle_byte(cur_mon->rs, buf[i]);
> >>     } else {
> >>         if (size == 0 || buf[size - 1] != 0)
> >>             monitor_printf(cur_mon, "corrupted command\n");
> >>         else
> >>             handle_hmp_command(cur_mon, (char *)buf);
> >>     }
> >> 
> >>     cur_mon = old_mon;
> >> 
> >> or QMP within monitor_qmp_dispatch():
> >> 
> >>     old_mon = cur_mon;
> >>     cur_mon = mon;
> >> 
> >>     rsp = qmp_dispatch(mon->qmp.commands, req, qmp_oob_enabled(mon));
> >> 
> >>     cur_mon = old_mon;
> >> 
> >> In both cases, old_mon is always null.
> >> 
> >> Fine print: before commit 227a07552f3 "monitor: move the cur_mon hack
> >> deeper for QMP", we ran more code for QMP with cur_mon set, namely the
> >> JSON parser, but that doesn't matter here.
> >> 
> >> More fine print: there's also qmp_human_monitor_command(), which stacks
> >> an HMP monitor on top of the QMP monitor.  Also doesn't matter here.
> >> 
> >> The OOB work doesn't add any new races as long as
> >> 
> >> * it doesn't add assignments to @cur_mon, and
> >> 
> >> * none of the code it moves out of the main thread accesses @cur_mon.
> >> 
> >> The first condition obviously holds.  The second one isn't obvious, but
> >> I figure it holds, too.
> >> 
> >> Okay, I think I've convinced myself the OOB work didn't add
> >> cur_mon-related races.
> >
> > Hopefully, yes.  Thanks for the double check.
> 
> Wait!  I think I found one.
> 
> Say we have an HMP monitor running in the main thread (they always do),
> and a QMP monitor running in @mon_iothread.
> 
> Now both threads write to @cur_mon, without synchronization!  The main
> thread writes in monitor_read() when reading HMP input, and in
> monitor_qmp_dispatch() when dispatching in-band QMP commands.
> @mon_iothread writes in monitor_qmp_dispatch() when dispatching
> out-of-band QMP commands.
> 
> Example:
> 
>     in main thread:
> 
>         chardev reads input, calls
>             monitor_read()
>                 old_mon = cur_mon; // null
>                 cur_mon = opaque;  // HMP monitor
>                 ...
> 
>     in mon_iothread:
> 
>         chardev reads input, calls
>             monitor_qmp_read(), calls
>                 handle_qmp_command() via JSON parser
>                     calls monitor_qmp_dispatch() to execute OOB command
>                         old_mon = cur_mon; // HMP monitor!
>                         cur_mon = mon;     // QMP monitor
>                         ...
> 
>     in main thread:
>                 ... HMP code runs with @cur_mon pointing to QMP monitor
>                 cur_mon = old_mon; // null
> 
>     in mon_iothread:
>                         cur_mon = old_mon; // HMP monitor!
> 
>     all threads:
>         non-monitor code runs with @cur_mon pointing to HMP monitor

Hmm, seems so; then we need this patch even more urgently...

> 
> >> >> >> Should this go into 3.0 to reduce the risk of bugs?
> >> >> >
> >> >> > Yes I think it would be good to have that even for 3.0, since it still
> >> >> > can be seen as a bug fix of existing code.
> >> >> 
> >> >> Agreed.
> >> >> 
> >> >> > Regards,
> >> >> >
> >> >> >> > Note that thread variables are not initialized to a valid value when new
> >> >> >> > thread is created.
> >> >> 
> >> >> Confusing.  It sounds like @cur_mon's initial value would be
> >> >> indeterminate, like an automatic variable's.  Not true.  Variables with
> >> >> thread storage duration are initialized when the thread is created.
> >> >> Since @cur_mon's declaration lacks an initializer, it'll be initialized
> >> >> to a null pointer.  Your sentence is correct when you consider that null
> >> >> pointer not a valid value.
> >> >
> >> > Yes that's what I meant.  So how about this?
> >> >
> >> >   Note that the per-thread @cur_mon variable is not initialized to
> >> >   point to a valid Monitor struct when a new thread is created (the
> >> >   default value will be NULL).
> >> >
> >> > Please feel free to tune it up.
> >> 
> >> I think what the patch really changes is the value of @cur_mon outside
> >> the main thread: it remains null there now.  Before, it depended on what
> >> the main thread was doing, and therefore could not be used safely.
> >> 
> >> In other words, the patch makes uses of @cur_mon like the one in
> >> error_vprintf() shown above safe.
> >> 
> >> I think that's what we should explain in the commit message.  I can try
> >> rewriting it,
> >
> > I'll appreciate that if so.
> 
> Here's my try:
> 
>     monitor: Fix unsafe sharing of @cur_mon among threads
> 
>     @cur_mon is null unless the main thread is running monitor code, either
>     HMP code within monitor_read(), or QMP code within
>     monitor_qmp_dispatch().
> 
>     Use of @cur_mon outside the main thread is therefore unsafe.
> 
>     Most of its uses are in monitor command handlers.  These run in the main
>     thread.
> 
>     However, there are also uses hiding elsewhere, such as in
>     error_vprintf(), and thus error_report(), making these functions unsafe
>     outside the main thread.  No such unsafe uses are known at this time.
>     Regardless, this is an unnecessary trap.  It's an ancient trap, though.
> 
>     More recently, commit cf869d53172 "qmp: support out-of-band (oob)
>     execution" spiced things up: the monitor I/O thread assigns to @cur_mon
>     when executing commands out-of-band.  Having two threads save, set and
>     restore @cur_mon without synchronization is definitely unsafe.  We can
>     end up with @cur_mon null while the main thread runs monitor code, or
>     non-null while it runs non-monitor code.
> 
>     We could fix this by making the I/O thread not mess with @cur_mon, but
>     that would leave the trap armed and ready.
> 
>     Instead, make @cur_mon thread-local.  It's now reliably null unless the
>     thread is running monitor code.

Looks good to me.

>     
> >> but right now I got to run.
> >
> > Must be lunch time! :)
> 
> Time to prepare lunch for the family, actually :)

Yeah I even knew that you cooked! :)

Regards,

-- 
Peter Xu

Re: [Qemu-devel] [PATCH v4] monitor: let cur_mon be per-thread
Posted by Markus Armbruster 5 years, 8 months ago
Peter Xu <peterx@redhat.com> writes:

> On Thu, Jul 19, 2018 at 02:14:56PM +0200, Markus Armbruster wrote:
[...]
>> Here's my try:
>> 
>>     monitor: Fix unsafe sharing of @cur_mon among threads
>> 
>>     @cur_mon is null unless the main thread is running monitor code, either
>>     HMP code within monitor_read(), or QMP code within
>>     monitor_qmp_dispatch().
>> 
>>     Use of @cur_mon outside the main thread is therefore unsafe.
>> 
>>     Most of its uses are in monitor command handlers.  These run in the main
>>     thread.
>> 
>>     However, there are also uses hiding elsewhere, such as in
>>     error_vprintf(), and thus error_report(), making these functions unsafe
>>     outside the main thread.  No such unsafe uses are known at this time.
>>     Regardless, this is an unnecessary trap.  It's an ancient trap, though.
>> 
>>     More recently, commit cf869d53172 "qmp: support out-of-band (oob)
>>     execution" spiced things up: the monitor I/O thread assigns to @cur_mon
>>     when executing commands out-of-band.  Having two threads save, set and
>>     restore @cur_mon without synchronization is definitely unsafe.  We can
>>     end up with @cur_mon null while the main thread runs monitor code, or
>>     non-null while it runs non-monitor code.
>> 
>>     We could fix this by making the I/O thread not mess with @cur_mon, but
>>     that would leave the trap armed and ready.
>> 
>>     Instead, make @cur_mon thread-local.  It's now reliably null unless the
>>     thread is running monitor code.
>
> Looks good to me.

With that commit message:
Reviewed-by: Markus Armbruster <armbru@redhat.com>