[v6] monitor: Optionally run handlers in coroutines

[PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Kevin Wolf 5 years, 8 months ago

This way, a monitor command handler will still be able to access the
current monitor, but when it yields, all other code will correctly
get NULL from monitor_cur().

Outside of coroutine context, qemu_coroutine_self() returns the leader
coroutine of the current thread.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
---
 include/monitor/monitor.h |  2 +-
 monitor/hmp.c             |  4 ++--
 monitor/monitor.c         | 27 +++++++++++++++++++++------
 qapi/qmp-dispatch.c       |  4 ++--
 stubs/monitor-core.c      |  2 +-
 5 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
index 43cc746078..16072e325c 100644
--- a/include/monitor/monitor.h
+++ b/include/monitor/monitor.h
@@ -13,7 +13,7 @@ typedef struct MonitorOptions MonitorOptions;
 extern QemuOptsList qemu_mon_opts;
 
 Monitor *monitor_cur(void);
-void monitor_set_cur(Monitor *mon);
+void monitor_set_cur(Coroutine *co, Monitor *mon);
 bool monitor_cur_is_qmp(void);
 
 void monitor_init_globals(void);
diff --git a/monitor/hmp.c b/monitor/hmp.c
index 79be6f26de..3e73a4c3ce 100644
--- a/monitor/hmp.c
+++ b/monitor/hmp.c
@@ -1082,9 +1082,9 @@ void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
 
     /* old_mon is non-NULL when called from qmp_human_monitor_command() */
     old_mon = monitor_cur();
-    monitor_set_cur(&mon->common);
+    monitor_set_cur(qemu_coroutine_self(), &mon->common);
     cmd->cmd(&mon->common, qdict);
-    monitor_set_cur(old_mon);
+    monitor_set_cur(qemu_coroutine_self(), old_mon);
 
     qobject_unref(qdict);
 }
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 182ba136b4..35003bb486 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -58,24 +58,38 @@ IOThread *mon_iothread;
 /* Bottom half to dispatch the requests received from I/O thread */
 QEMUBH *qmp_dispatcher_bh;
 
-/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
+/*
+ * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
+ * monitor_destroyed.
+ */
 QemuMutex monitor_lock;
 static GHashTable *monitor_qapi_event_state;
+static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
 
 MonitorList mon_list;
 int mon_refcount;
 static bool monitor_destroyed;
 
-static __thread Monitor *cur_monitor;
-
 Monitor *monitor_cur(void)
 {
-    return cur_monitor;
+    Monitor *mon;
+
+    qemu_mutex_lock(&monitor_lock);
+    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
+    qemu_mutex_unlock(&monitor_lock);
+
+    return mon;
 }
 
-void monitor_set_cur(Monitor *mon)
+void monitor_set_cur(Coroutine *co, Monitor *mon)
 {
-    cur_monitor = mon;
+    qemu_mutex_lock(&monitor_lock);
+    if (mon) {
+        g_hash_table_replace(coroutine_mon, co, mon);
+    } else {
+        g_hash_table_remove(coroutine_mon, co);
+    }
+    qemu_mutex_unlock(&monitor_lock);
 }
 
 /**
@@ -613,6 +627,7 @@ void monitor_init_globals_core(void)
 {
     monitor_qapi_event_init();
     qemu_mutex_init(&monitor_lock);
+    coroutine_mon = g_hash_table_new(NULL, NULL);
 
     /*
      * The dispatcher BH must run in the main loop thread, since we
diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
index 2fdbc0fba4..5677ba92ca 100644
--- a/qapi/qmp-dispatch.c
+++ b/qapi/qmp-dispatch.c
@@ -154,11 +154,11 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
     }
 
     assert(monitor_cur() == NULL);
-    monitor_set_cur(cur_mon);
+    monitor_set_cur(qemu_coroutine_self(), cur_mon);
 
     cmd->fn(args, &ret, &err);
 
-    monitor_set_cur(NULL);
+    monitor_set_cur(qemu_coroutine_self(), NULL);
     qobject_unref(args);
     if (err) {
         /* or assert(!ret) after reviewing all handlers: */
diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
index e493df1027..635e37a6ba 100644
--- a/stubs/monitor-core.c
+++ b/stubs/monitor-core.c
@@ -8,7 +8,7 @@ Monitor *monitor_cur(void)
     return NULL;
 }
 
-void monitor_set_cur(Monitor *mon)
+void monitor_set_cur(Coroutine *co, Monitor *mon)
 {
 }
 
-- 
2.25.4

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Eric Blake 5 years, 8 months ago

On 5/28/20 10:37 AM, Kevin Wolf wrote:
> This way, a monitor command handler will still be able to access the
> current monitor, but when it yields, all other code will correctly
> get NULL from monitor_cur().
> 
> Outside of coroutine context, qemu_coroutine_self() returns the leader
> coroutine of the current thread.
> 
> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---

Reviewed-by: Eric Blake <eblake@redhat.com>

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Markus Armbruster 5 years, 6 months ago

Kevin Wolf <kwolf@redhat.com> writes:

> This way, a monitor command handler will still be able to access the
> current monitor, but when it yields, all other code will correctly
> get NULL from monitor_cur().
>
> Outside of coroutine context, qemu_coroutine_self() returns the leader
> coroutine of the current thread.

Unsaid: you use it as a hash table key to map from coroutine to monitor,
and for that you need it to return a value unique to the coroutine in
coroutine context, and a value unique to the thread outside coroutine
context.  Which qemu_coroutine_self() does.  Correct?

The hash table works, but I hate it just as much as I hate
pthread_getspecific() / pthread_setspecific().

What we have here is a need for coroutine-local data.  Feels like a
perfectly natural concept to me.

Are we going to create another hash table whenever we need another piece
of coroutine-local data?  Or shall we reuse the hash table, suitably
renamed and moved to another file?

Why not simply associate an opaque pointer with each coroutine?  All it
takes is one more member of struct Coroutine.  Whatever creates the
coroutine decides what to use it for.  The monitor coroutine would use
it to point to the monitor.

At least, discuss the design alternatives in the commit message.

> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> ---
>  include/monitor/monitor.h |  2 +-
>  monitor/hmp.c             |  4 ++--
>  monitor/monitor.c         | 27 +++++++++++++++++++++------
>  qapi/qmp-dispatch.c       |  4 ++--
>  stubs/monitor-core.c      |  2 +-
>  5 files changed, 27 insertions(+), 12 deletions(-)
>
> diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> index 43cc746078..16072e325c 100644
> --- a/include/monitor/monitor.h
> +++ b/include/monitor/monitor.h
> @@ -13,7 +13,7 @@ typedef struct MonitorOptions MonitorOptions;
>  extern QemuOptsList qemu_mon_opts;
>  
>  Monitor *monitor_cur(void);
> -void monitor_set_cur(Monitor *mon);
> +void monitor_set_cur(Coroutine *co, Monitor *mon);
>  bool monitor_cur_is_qmp(void);
>  
>  void monitor_init_globals(void);
> diff --git a/monitor/hmp.c b/monitor/hmp.c
> index 79be6f26de..3e73a4c3ce 100644
> --- a/monitor/hmp.c
> +++ b/monitor/hmp.c
> @@ -1082,9 +1082,9 @@ void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
>  
>      /* old_mon is non-NULL when called from qmp_human_monitor_command() */
>      old_mon = monitor_cur();
> -    monitor_set_cur(&mon->common);
> +    monitor_set_cur(qemu_coroutine_self(), &mon->common);
>      cmd->cmd(&mon->common, qdict);
> -    monitor_set_cur(old_mon);
> +    monitor_set_cur(qemu_coroutine_self(), old_mon);
>  
>      qobject_unref(qdict);
>  }
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 182ba136b4..35003bb486 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -58,24 +58,38 @@ IOThread *mon_iothread;
>  /* Bottom half to dispatch the requests received from I/O thread */
>  QEMUBH *qmp_dispatcher_bh;
>  
> -/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
> +/*
> + * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> + * monitor_destroyed.
> + */
>  QemuMutex monitor_lock;
>  static GHashTable *monitor_qapi_event_state;
> +static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>  
>  MonitorList mon_list;
>  int mon_refcount;
>  static bool monitor_destroyed;
>  
> -static __thread Monitor *cur_monitor;
> -
>  Monitor *monitor_cur(void)
>  {
> -    return cur_monitor;
> +    Monitor *mon;
> +
> +    qemu_mutex_lock(&monitor_lock);
> +    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> +    qemu_mutex_unlock(&monitor_lock);
> +
> +    return mon;
>  }
>  
> -void monitor_set_cur(Monitor *mon)
> +void monitor_set_cur(Coroutine *co, Monitor *mon)
>  {
> -    cur_monitor = mon;
> +    qemu_mutex_lock(&monitor_lock);
> +    if (mon) {
> +        g_hash_table_replace(coroutine_mon, co, mon);
> +    } else {
> +        g_hash_table_remove(coroutine_mon, co);
> +    }
> +    qemu_mutex_unlock(&monitor_lock);
>  }

You really need a contract now: any call to monitor_set_cur() with a
non-null @mon must be followed by a call with a null @mon.

>  
>  /**
> @@ -613,6 +627,7 @@ void monitor_init_globals_core(void)
>  {
>      monitor_qapi_event_init();
>      qemu_mutex_init(&monitor_lock);
> +    coroutine_mon = g_hash_table_new(NULL, NULL);
>  
>      /*
>       * The dispatcher BH must run in the main loop thread, since we
> diff --git a/qapi/qmp-dispatch.c b/qapi/qmp-dispatch.c
> index 2fdbc0fba4..5677ba92ca 100644
> --- a/qapi/qmp-dispatch.c
> +++ b/qapi/qmp-dispatch.c
> @@ -154,11 +154,11 @@ QDict *qmp_dispatch(const QmpCommandList *cmds, QObject *request,
>      }
>  
>      assert(monitor_cur() == NULL);
> -    monitor_set_cur(cur_mon);
> +    monitor_set_cur(qemu_coroutine_self(), cur_mon);
>  
>      cmd->fn(args, &ret, &err);
>  
> -    monitor_set_cur(NULL);
> +    monitor_set_cur(qemu_coroutine_self(), NULL);
>      qobject_unref(args);
>      if (err) {
>          /* or assert(!ret) after reviewing all handlers: */
> diff --git a/stubs/monitor-core.c b/stubs/monitor-core.c
> index e493df1027..635e37a6ba 100644
> --- a/stubs/monitor-core.c
> +++ b/stubs/monitor-core.c
> @@ -8,7 +8,7 @@ Monitor *monitor_cur(void)
>      return NULL;
>  }
>  
> -void monitor_set_cur(Monitor *mon)
> +void monitor_set_cur(Coroutine *co, Monitor *mon)
>  {
>  }

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Kevin Wolf 5 years, 6 months ago

Am 04.08.2020 um 15:50 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > This way, a monitor command handler will still be able to access the
> > current monitor, but when it yields, all other code will correctly
> > get NULL from monitor_cur().
> >
> > Outside of coroutine context, qemu_coroutine_self() returns the leader
> > coroutine of the current thread.
> 
> Unsaid: you use it as a hash table key to map from coroutine to monitor,
> and for that you need it to return a value unique to the coroutine in
> coroutine context, and a value unique to the thread outside coroutine
> context.  Which qemu_coroutine_self() does.  Correct?

Correct.

> The hash table works, but I hate it just as much as I hate
> pthread_getspecific() / pthread_setspecific().
> 
> What we have here is a need for coroutine-local data.  Feels like a
> perfectly natural concept to me.

If you have a good concept how to implement this in a generic way that
doesn't impact the I/O fast path, feel free to implement it and I'll
happily use it.

But the hash table is simple and works for this use case, so I see
little reason to invest a lot of time in something that we haven't ever
had another user for.

> Are we going to create another hash table whenever we need another piece
> of coroutine-local data?  Or shall we reuse the hash table, suitably
> renamed and moved to another file?

I think I would vote for separate hash tables rather than having a hash
table containing a struct that mixes values from all subsystems, but
this can be discussed when (if) the need arises.

> Why not simply associate an opaque pointer with each coroutine?  All it
> takes is one more member of struct Coroutine.  Whatever creates the
> coroutine decides what to use it for.  The monitor coroutine would use
> it to point to the monitor.

This doesn't work. error_report() is called from all kinds of
coroutines, not just from coroutines created from the monitor, and it
wants to know the current monitor.

> At least, discuss the design alternatives in the commit message.

*sigh* Fine. Tell me which set of alternatives to discuss.

> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> > ---
> >  include/monitor/monitor.h |  2 +-
> >  monitor/hmp.c             |  4 ++--
> >  monitor/monitor.c         | 27 +++++++++++++++++++++------
> >  qapi/qmp-dispatch.c       |  4 ++--
> >  stubs/monitor-core.c      |  2 +-
> >  5 files changed, 27 insertions(+), 12 deletions(-)
> >
> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> > index 43cc746078..16072e325c 100644
> > --- a/include/monitor/monitor.h
> > +++ b/include/monitor/monitor.h
> > @@ -13,7 +13,7 @@ typedef struct MonitorOptions MonitorOptions;
> >  extern QemuOptsList qemu_mon_opts;
> >  
> >  Monitor *monitor_cur(void);
> > -void monitor_set_cur(Monitor *mon);
> > +void monitor_set_cur(Coroutine *co, Monitor *mon);
> >  bool monitor_cur_is_qmp(void);
> >  
> >  void monitor_init_globals(void);
> > diff --git a/monitor/hmp.c b/monitor/hmp.c
> > index 79be6f26de..3e73a4c3ce 100644
> > --- a/monitor/hmp.c
> > +++ b/monitor/hmp.c
> > @@ -1082,9 +1082,9 @@ void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
> >  
> >      /* old_mon is non-NULL when called from qmp_human_monitor_command() */
> >      old_mon = monitor_cur();
> > -    monitor_set_cur(&mon->common);
> > +    monitor_set_cur(qemu_coroutine_self(), &mon->common);
> >      cmd->cmd(&mon->common, qdict);
> > -    monitor_set_cur(old_mon);
> > +    monitor_set_cur(qemu_coroutine_self(), old_mon);
> >  
> >      qobject_unref(qdict);
> >  }
> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> > index 182ba136b4..35003bb486 100644
> > --- a/monitor/monitor.c
> > +++ b/monitor/monitor.c
> > @@ -58,24 +58,38 @@ IOThread *mon_iothread;
> >  /* Bottom half to dispatch the requests received from I/O thread */
> >  QEMUBH *qmp_dispatcher_bh;
> >  
> > -/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
> > +/*
> > + * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> > + * monitor_destroyed.
> > + */
> >  QemuMutex monitor_lock;
> >  static GHashTable *monitor_qapi_event_state;
> > +static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
> >  
> >  MonitorList mon_list;
> >  int mon_refcount;
> >  static bool monitor_destroyed;
> >  
> > -static __thread Monitor *cur_monitor;
> > -
> >  Monitor *monitor_cur(void)
> >  {
> > -    return cur_monitor;
> > +    Monitor *mon;
> > +
> > +    qemu_mutex_lock(&monitor_lock);
> > +    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> > +    qemu_mutex_unlock(&monitor_lock);
> > +
> > +    return mon;
> >  }
> >  
> > -void monitor_set_cur(Monitor *mon)
> > +void monitor_set_cur(Coroutine *co, Monitor *mon)
> >  {
> > -    cur_monitor = mon;
> > +    qemu_mutex_lock(&monitor_lock);
> > +    if (mon) {
> > +        g_hash_table_replace(coroutine_mon, co, mon);
> > +    } else {
> > +        g_hash_table_remove(coroutine_mon, co);
> > +    }
> > +    qemu_mutex_unlock(&monitor_lock);
> >  }
> 
> You really need a contract now: any call to monitor_set_cur() with a
> non-null @mon must be followed by a call with a null @mon.

Why? g_hash_table_replace() removes the old value and replaces it with
the new one.

Kevin

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Markus Armbruster 5 years, 6 months ago

Kevin Wolf <kwolf@redhat.com> writes:

> Am 04.08.2020 um 15:50 hat Markus Armbruster geschrieben:
>> Kevin Wolf <kwolf@redhat.com> writes:
>> 
>> > This way, a monitor command handler will still be able to access the
>> > current monitor, but when it yields, all other code will correctly
>> > get NULL from monitor_cur().
>> >
>> > Outside of coroutine context, qemu_coroutine_self() returns the leader
>> > coroutine of the current thread.
>> 
>> Unsaid: you use it as a hash table key to map from coroutine to monitor,
>> and for that you need it to return a value unique to the coroutine in
>> coroutine context, and a value unique to the thread outside coroutine
>> context.  Which qemu_coroutine_self() does.  Correct?
>
> Correct.
>
>> The hash table works, but I hate it just as much as I hate
>> pthread_getspecific() / pthread_setspecific().
>> 
>> What we have here is a need for coroutine-local data.  Feels like a
>> perfectly natural concept to me.
>
> If you have a good concept how to implement this in a generic way that
> doesn't impact the I/O fast path, feel free to implement it and I'll
> happily use it.

Fair enough; I'll give it a shot.

> But the hash table is simple and works for this use case, so I see
> little reason to invest a lot of time in something that we haven't ever
> had another user for.
>
>> Are we going to create another hash table whenever we need another piece
>> of coroutine-local data?  Or shall we reuse the hash table, suitably
>> renamed and moved to another file?
>
> I think I would vote for separate hash tables rather than having a hash
> table containing a struct that mixes values from all subsystems, but
> this can be discussed when (if) the need arises.
>
>> Why not simply associate an opaque pointer with each coroutine?  All it
>> takes is one more member of struct Coroutine.  Whatever creates the
>> coroutine decides what to use it for.  The monitor coroutine would use
>> it to point to the monitor.
>
> This doesn't work. error_report() is called from all kinds of
> coroutines, not just from coroutines created from the monitor, and it
> wants to know the current monitor.

Yup, monitor_cur() and monitor_set_cur() need to work both in coroutine
context and outside coroutine context.

>> At least, discuss the design alternatives in the commit message.
>
> *sigh* Fine. Tell me which set of alternatives to discuss.

Let me first play with the alternative I suggested.

>> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
>> > ---
>> >  include/monitor/monitor.h |  2 +-
>> >  monitor/hmp.c             |  4 ++--
>> >  monitor/monitor.c         | 27 +++++++++++++++++++++------
>> >  qapi/qmp-dispatch.c       |  4 ++--
>> >  stubs/monitor-core.c      |  2 +-
>> >  5 files changed, 27 insertions(+), 12 deletions(-)
>> >
>> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
>> > index 43cc746078..16072e325c 100644
>> > --- a/include/monitor/monitor.h
>> > +++ b/include/monitor/monitor.h
>> > @@ -13,7 +13,7 @@ typedef struct MonitorOptions MonitorOptions;
>> >  extern QemuOptsList qemu_mon_opts;
>> >  
>> >  Monitor *monitor_cur(void);
>> > -void monitor_set_cur(Monitor *mon);
>> > +void monitor_set_cur(Coroutine *co, Monitor *mon);
>> >  bool monitor_cur_is_qmp(void);
>> >  
>> >  void monitor_init_globals(void);
>> > diff --git a/monitor/hmp.c b/monitor/hmp.c
>> > index 79be6f26de..3e73a4c3ce 100644
>> > --- a/monitor/hmp.c
>> > +++ b/monitor/hmp.c
>> > @@ -1082,9 +1082,9 @@ void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
>> >  
>> >      /* old_mon is non-NULL when called from qmp_human_monitor_command() */
>> >      old_mon = monitor_cur();
>> > -    monitor_set_cur(&mon->common);
>> > +    monitor_set_cur(qemu_coroutine_self(), &mon->common);
>> >      cmd->cmd(&mon->common, qdict);
>> > -    monitor_set_cur(old_mon);
>> > +    monitor_set_cur(qemu_coroutine_self(), old_mon);
>> >  
>> >      qobject_unref(qdict);
>> >  }
>> > diff --git a/monitor/monitor.c b/monitor/monitor.c
>> > index 182ba136b4..35003bb486 100644
>> > --- a/monitor/monitor.c
>> > +++ b/monitor/monitor.c
>> > @@ -58,24 +58,38 @@ IOThread *mon_iothread;
>> >  /* Bottom half to dispatch the requests received from I/O thread */
>> >  QEMUBH *qmp_dispatcher_bh;
>> >  
>> > -/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
>> > +/*
>> > + * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
>> > + * monitor_destroyed.
>> > + */
>> >  QemuMutex monitor_lock;
>> >  static GHashTable *monitor_qapi_event_state;
>> > +static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>> >  
>> >  MonitorList mon_list;
>> >  int mon_refcount;
>> >  static bool monitor_destroyed;
>> >  
>> > -static __thread Monitor *cur_monitor;
>> > -
>> >  Monitor *monitor_cur(void)
>> >  {
>> > -    return cur_monitor;
>> > +    Monitor *mon;
>> > +
>> > +    qemu_mutex_lock(&monitor_lock);
>> > +    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
>> > +    qemu_mutex_unlock(&monitor_lock);
>> > +
>> > +    return mon;
>> >  }
>> >  
>> > -void monitor_set_cur(Monitor *mon)
>> > +void monitor_set_cur(Coroutine *co, Monitor *mon)
>> >  {
>> > -    cur_monitor = mon;
>> > +    qemu_mutex_lock(&monitor_lock);
>> > +    if (mon) {
>> > +        g_hash_table_replace(coroutine_mon, co, mon);
>> > +    } else {
>> > +        g_hash_table_remove(coroutine_mon, co);
>> > +    }
>> > +    qemu_mutex_unlock(&monitor_lock);
>> >  }
>> 
>> You really need a contract now: any call to monitor_set_cur() with a
>> non-null @mon must be followed by a call with a null @mon.
>
> Why? g_hash_table_replace() removes the old value and replaces it with
> the new one.

If monitor_set_cur(NULL) is forgotten or bypassed somehow, the hash
table entry stays even when the coroutine dies.  Minor memory leak.  If
another coroutine gets created at the same address, it "inherits" the
current monitor.  Not good.  If the monitor has died meanwhile, dangling
pointer.  Fortunately, monitors die only during shutdown, except for the
dummy in qmp_human_monitor_command().
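
To make the hazard concrete, here is a minimal sketch.  It is not QEMU
code: the toy_* names, the stand-in Coroutine and Monitor types, and the
tiny fixed-size table replacing the GHashTable are all assumptions made
for illustration.  It only shows that a pointer-keyed map keeps answering
for an address after the object that lived there is gone.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; not QEMU's real types. */
typedef struct Coroutine { int unused; } Coroutine;
typedef struct Monitor { const char *name; } Monitor;

#define TOY_MAX_ENTRIES 8

/* A tiny pointer-keyed map standing in for the coroutine_mon hash table. */
static struct {
    const Coroutine *co;
    Monitor *mon;
} toy_table[TOY_MAX_ENTRIES];

/* Replace the entry for @co, or remove it when @mon is NULL. */
static void toy_set_cur(const Coroutine *co, Monitor *mon)
{
    int free_slot = -1;

    assert(co != NULL);
    for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (toy_table[i].co == co) {
            toy_table[i].mon = mon;
            if (!mon) {
                toy_table[i].co = NULL;     /* entry removed */
            }
            return;
        }
        if (!toy_table[i].co && free_slot < 0) {
            free_slot = i;
        }
    }
    /* No entry yet: insert one (silently dropped if the toy table is full). */
    if (mon && free_slot >= 0) {
        toy_table[free_slot].co = co;
        toy_table[free_slot].mon = mon;
    }
}

/* Look up the monitor by coroutine address; NULL when there is no entry. */
static Monitor *toy_cur(const Coroutine *co)
{
    assert(co != NULL);
    for (int i = 0; i < TOY_MAX_ENTRIES; i++) {
        if (toy_table[i].co == co) {
            return toy_table[i].mon;
        }
    }
    return NULL;
}
```

If a coroutine sets a monitor and then goes away without the matching
toy_set_cur(co, NULL), a later coroutine that happens to be created at
the same address gets the old monitor back from toy_cur().  Only the
reset removes the entry.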

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Kevin Wolf 5 years, 6 months ago

Am 05.08.2020 um 09:28 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 04.08.2020 um 15:50 hat Markus Armbruster geschrieben:
> >> Kevin Wolf <kwolf@redhat.com> writes:
> >> 
> >> > This way, a monitor command handler will still be able to access the
> >> > current monitor, but when it yields, all other code will correctly
> >> > get NULL from monitor_cur().
> >> >
> >> > Outside of coroutine context, qemu_coroutine_self() returns the leader
> >> > coroutine of the current thread.
> >> 
> >> Unsaid: you use it as a hash table key to map from coroutine to monitor,
> >> and for that you need it to return a value unique to the coroutine in
> >> coroutine context, and a value unique to the thread outside coroutine
> >> context.  Which qemu_coroutine_self() does.  Correct?
> >
> > Correct.
> >
> >> The hash table works, but I hate it just as much as I hate
> >> pthread_getspecific() / pthread_setspecific().
> >> 
> >> What we have here is a need for coroutine-local data.  Feels like a
> >> perfectly natural concept to me.
> >
> > If you have a good concept how to implement this in a generic way that
> > doesn't impact the I/O fast path, feel free to implement it and I'll
> > happily use it.
> 
> Fair enough; I'll give it a shot.
> 
> > But the hash table is simple and works for this use case, so I see
> > little reason to invest a lot of time in something that we haven't ever
> > had another user for.
> >
> >> Are we going to create another hash table whenever we need another piece
> >> of coroutine-local data?  Or shall we reuse the hash table, suitably
> >> renamed and moved to another file?
> >
> > I think I would vote for separate hash tables rather than having a hash
> > table containing a struct that mixes values from all subsystems, but
> > this can be discussed when (if) the need arises.
> >
> >> Why not simply associate an opaque pointer with each coroutine?  All it
> >> takes is one more member of struct Coroutine.  Whatever creates the
> >> coroutine decides what to use it for.  The monitor coroutine would use
> >> it to point to the monitor.
> >
> > This doesn't work. error_report() is called from all kinds of
> > coroutines, not just from coroutines created from the monitor, and it
> > wants to know the current monitor.
> 
> Yup, monitor_cur() and monitor_set_cur() need to work both in coroutine
> context and outside coroutine context.

And in coroutine context, but in coroutines created by someone other
than the monitor.

> >> At least, discuss the design alternatives in the commit message.
> >
> > *sigh* Fine. Tell me which set of alternatives to discuss.
> 
> Let me first play with the alternative I suggested.
> 
> >> > Signed-off-by: Kevin Wolf <kwolf@redhat.com>
> >> > ---
> >> >  include/monitor/monitor.h |  2 +-
> >> >  monitor/hmp.c             |  4 ++--
> >> >  monitor/monitor.c         | 27 +++++++++++++++++++++------
> >> >  qapi/qmp-dispatch.c       |  4 ++--
> >> >  stubs/monitor-core.c      |  2 +-
> >> >  5 files changed, 27 insertions(+), 12 deletions(-)
> >> >
> >> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h
> >> > index 43cc746078..16072e325c 100644
> >> > --- a/include/monitor/monitor.h
> >> > +++ b/include/monitor/monitor.h
> >> > @@ -13,7 +13,7 @@ typedef struct MonitorOptions MonitorOptions;
> >> >  extern QemuOptsList qemu_mon_opts;
> >> >  
> >> >  Monitor *monitor_cur(void);
> >> > -void monitor_set_cur(Monitor *mon);
> >> > +void monitor_set_cur(Coroutine *co, Monitor *mon);
> >> >  bool monitor_cur_is_qmp(void);
> >> >  
> >> >  void monitor_init_globals(void);
> >> > diff --git a/monitor/hmp.c b/monitor/hmp.c
> >> > index 79be6f26de..3e73a4c3ce 100644
> >> > --- a/monitor/hmp.c
> >> > +++ b/monitor/hmp.c
> >> > @@ -1082,9 +1082,9 @@ void handle_hmp_command(MonitorHMP *mon, const char *cmdline)
> >> >  
> >> >      /* old_mon is non-NULL when called from qmp_human_monitor_command() */
> >> >      old_mon = monitor_cur();
> >> > -    monitor_set_cur(&mon->common);
> >> > +    monitor_set_cur(qemu_coroutine_self(), &mon->common);
> >> >      cmd->cmd(&mon->common, qdict);
> >> > -    monitor_set_cur(old_mon);
> >> > +    monitor_set_cur(qemu_coroutine_self(), old_mon);
> >> >  
> >> >      qobject_unref(qdict);
> >> >  }
> >> > diff --git a/monitor/monitor.c b/monitor/monitor.c
> >> > index 182ba136b4..35003bb486 100644
> >> > --- a/monitor/monitor.c
> >> > +++ b/monitor/monitor.c
> >> > @@ -58,24 +58,38 @@ IOThread *mon_iothread;
> >> >  /* Bottom half to dispatch the requests received from I/O thread */
> >> >  QEMUBH *qmp_dispatcher_bh;
> >> >  
> >> > -/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed.  */
> >> > +/*
> >> > + * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> >> > + * monitor_destroyed.
> >> > + */
> >> >  QemuMutex monitor_lock;
> >> >  static GHashTable *monitor_qapi_event_state;
> >> > +static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
> >> >  
> >> >  MonitorList mon_list;
> >> >  int mon_refcount;
> >> >  static bool monitor_destroyed;
> >> >  
> >> > -static __thread Monitor *cur_monitor;
> >> > -
> >> >  Monitor *monitor_cur(void)
> >> >  {
> >> > -    return cur_monitor;
> >> > +    Monitor *mon;
> >> > +
> >> > +    qemu_mutex_lock(&monitor_lock);
> >> > +    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> >> > +    qemu_mutex_unlock(&monitor_lock);
> >> > +
> >> > +    return mon;
> >> >  }
> >> >  
> >> > -void monitor_set_cur(Monitor *mon)
> >> > +void monitor_set_cur(Coroutine *co, Monitor *mon)
> >> >  {
> >> > -    cur_monitor = mon;
> >> > +    qemu_mutex_lock(&monitor_lock);
> >> > +    if (mon) {
> >> > +        g_hash_table_replace(coroutine_mon, co, mon);
> >> > +    } else {
> >> > +        g_hash_table_remove(coroutine_mon, co);
> >> > +    }
> >> > +    qemu_mutex_unlock(&monitor_lock);
> >> >  }
> >> 
> >> You really need a contract now: any call to monitor_set_cur() with a
> >> non-null @mon must be followed by a call with a null @mon.
> >
> > Why? g_hash_table_replace() removes the old value and replaces it with
> > the new one.
> 
> If monitor_set_cur(NULL) is forgotten or bypassed somehow, the hash
> table entry stays even when the coroutine dies.  Minor memory leak.  If
> another coroutine gets created at the same address, it "inherits" the
> current monitor.  Not good.  If the monitor has died meanwhile, dangling
> pointer.  Fortunately, monitors die only during shutdown, except for the
> dummy in qmp_human_monitor_command().

Ah, yes, fair. I can document this.

In practice not a problem because the QMP dispatcher coroutine and HMP
command handler coroutines are the only places that set (and reset) it.

In fact, HMP needs to be fixed to reset to NULL before the coroutine
terminates.

Kevin

Ways to do per-coroutine properties (was: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property)

Posted by Markus Armbruster 5 years, 6 months ago

I called for a discussion of design alternatives, because I dislike the
one I got.  Here we go.

= Context: the "current monitor" =

Output of HMP commands needs to go to the HMP monitor executing the
command.  Trivial in HMP command handlers: the handler function takes a
monitor argument.  Not so trivial in code used both by HMP command
handlers and other users, such as CLI.  In particular, it would be
impractical to pass the monitor through multiple layers that don't want
to know anything about monitors, down to the point that reports an
error, just so the error report goes where it needs to go.  We made
error_report() & friends do the right thing without such help.

To let them do that, we maintain a "current monitor".

    Invariant: while executing a monitor command, thread-local variable
    @cur_mon points to the monitor executing the command.  When the
    thread is not executing a monitor command, @cur_mon is null.

Now error_report() can do the right thing easily: print to @cur_mon if
non-null, else to stderr.
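
A minimal sketch of that routing, assuming a toy Monitor that merely
captures the last message in a buffer.  The names toy_cur_mon and
toy_error_report() are made up for illustration; the real
error_report() does considerably more.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Toy monitor that just captures the last message; illustration only. */
typedef struct Monitor {
    char buf[128];
} Monitor;

/* Thread-local "current monitor"; null when no command is executing. */
static _Thread_local Monitor *toy_cur_mon;

/*
 * Report an error to the current monitor if there is one, else to
 * stderr.  Returns 1 if the message went to a monitor, 0 otherwise.
 */
static int toy_error_report(const char *fmt, ...)
{
    Monitor *mon = toy_cur_mon;
    va_list ap;

    assert(fmt != NULL);
    va_start(ap, fmt);
    if (mon) {
        /* Truncates silently if the message does not fit. */
        vsnprintf(mon->buf, sizeof(mon->buf), fmt, ap);
    } else {
        vfprintf(stderr, fmt, ap);
        fputc('\n', stderr);
    }
    va_end(ap);
    return mon != NULL;
}
```

Callers several layers down never see a Monitor pointer; whoever
dispatches the command sets toy_cur_mon before the handler runs and
clears it afterwards.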

We also use @cur_mon for getting file descriptors stored in the monitor.
Could perhaps do without @cur_mon, but since it's there anyway...

= Problem at hand: "current monitor" for coroutine-enabled commands =

We want to be able to run monitor commands in a coroutine, so they can
yield instead of blocking the main loop.

Simply yielding in a monitor command violates the invariant: we're no
longer executing a monitor command[*], but @cur_mon is still non-null.

This is because the current monitor is no longer a property of the
thread, but a property of the coroutine.  Thread-local variable @cur_mon
doesn't fit the bill anymore.

= Solution 1: A separate map coroutine -> current monitor =

Kevin implemented this, using a hash table.

PRO

* Stays off the coroutine switch hot path (by staying off coroutine code
  entirely).

CON

* It's a one-off (but at least it's confined to monitor.c)

* It's slow, and uses locks (but that's probably okay for this use; see
  also one-off).

* We get to worry about consistency between coroutines and the hash
  table.

While this looks serviceable, I wonder whether we can come up with
something a bit more elegant.

= Solution 2: Put the map into struct Coroutine =

The hash table can be replaced by putting a @cur_mon member right into
struct Coroutine, together with a setter and a getter function.

PRO

* Stays off the coroutine switch hot path.

CON

* It's a one-off.

* HMP bleeds into the coroutine subsystem, which really doesn't want to
  know anything about monitors.

Thanks, but no thanks.

= Solution 3: Put abstract maps into struct Coroutine =

Daniel's proposal: instead of putting a Monitor * member into struct
Coroutine, put an array of void * there, indexed by well-known data
keys.  Initially, there is just one data key, for the current monitor.

This is basically pthread_setspecific(), pthread_getspecific() for
coroutines, with pthread_key_create() dumbed down to a static set of
well-known keys.

PRO

* Stays off the coroutine switch hot path.

* Similar to how thread-local storage works with traditional pthreads.

CON

* Similar to how thread-local storage works with traditional pthreads.
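
A minimal sketch of what solution 3 could look like, assuming a static enum of keys and a pointer array in struct Coroutine. Nothing here is taken from an actual patch: the enum, the cut-down struct Coroutine, and the exact signatures of coroutine_setspecific() and coroutine_getspecific() are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Well-known keys: a static set instead of pthread_key_create(). */
typedef enum CoroutineKey {
    COROUTINE_KEY_CUR_MON,      /* the current monitor */
    COROUTINE_KEY__MAX,
} CoroutineKey;

/* Hypothetical cut-down struct Coroutine holding only the new member. */
typedef struct Coroutine {
    void *specific[COROUTINE_KEY__MAX];
} Coroutine;

/* Counterpart of pthread_setspecific() for a coroutine. */
static void coroutine_setspecific(Coroutine *co, CoroutineKey key,
                                  void *value)
{
    assert(key < COROUTINE_KEY__MAX);
    co->specific[key] = value;
}

/* Counterpart of pthread_getspecific(); NULL if the key was never set. */
static void *coroutine_getspecific(Coroutine *co, CoroutineKey key)
{
    assert(key < COROUTINE_KEY__MAX);
    return co->specific[key];
}
```

With zero-initialized coroutines, an unset key reads back as NULL, which for the monitor key happens to be exactly the "no current monitor" answer.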

= Solution 4: Fixed coroutine-local storage =

Whereas solution 3 is like traditional pthreads, this solution works
more like __thread does under the hood: we allocate memory for
coroutine-local storage on coroutine creation, maintain a global pointer
on thread switch, and free the memory on destruction.

We can keep the global pointer in struct Coroutine, and have a getter
return it.

If accessing coroutine-local storage ever becomes a performance
bottleneck, we can either open-code the getter, or store the pointer in
thread-local storage (but then we need to update it in the coroutine
switch hot path).  No need to worry about all that now.

Since we don't have compiler and linker support, we have to collect the
coroutine-local variables in a struct manually.

PRO

* Stays off the coroutine switch hot path.

* Access could be made quite fast if need be.

CON

* The struct of coroutine-local variables crosses subsystem boundaries.
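
As a rough illustration of solution 4, assuming every coroutine gets the same fixed block of coroutine-local variables: the names CoroutineLocals, coroutine_locals(), sketch_coroutine_create() and sketch_coroutine_delete() are hypothetical, and the real struct Coroutine has many more members than shown.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Monitor Monitor;

/* All coroutine-local variables, collected by hand in one struct. */
typedef struct CoroutineLocals {
    Monitor *cur_mon;
    /* other subsystems would have to add their members here */
} CoroutineLocals;

/* Hypothetical cut-down struct Coroutine. */
typedef struct Coroutine {
    CoroutineLocals *locals;
} Coroutine;

/* Allocate the coroutine together with its zeroed local storage. */
static Coroutine *sketch_coroutine_create(void)
{
    Coroutine *co = calloc(1, sizeof(*co));

    if (!co) {
        return NULL;
    }
    co->locals = calloc(1, sizeof(*co->locals));
    if (!co->locals) {
        free(co);
        return NULL;
    }
    return co;
}

/* Getter for the coroutine-local storage. */
static CoroutineLocals *coroutine_locals(Coroutine *co)
{
    assert(co && co->locals);
    return co->locals;
}

/* Free the local storage along with the coroutine. */
static void sketch_coroutine_delete(Coroutine *co)
{
    free(co->locals);
    free(co);
}
```

The comment inside CoroutineLocals is where the CON bites: every subsystem wanting a coroutine-local variable has to edit this one struct.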

= Solution 5: Optional coroutine-specific storage =

When creating a coroutine, you can optionally ask for a certain amount
of coroutine-specific memory.  It's malloced, stored in struct
Coroutine, and freed on deletion.

A getter returns the coroutine-specific memory.  To actually use it, you
have to know the coroutine's coroutine-specific memory layout.

PRO

* Stays off the coroutine switch hot path.

* Access could be made quite fast if need be.

CON

* Having to know the coroutine's coroutine-specific memory layout could
  turn out to be impractical for some applications of "property of a
  coroutine".

This is the solution I had in mind from the start.  I have prototype
code that passes basic testing.

= Solution 6: Exploit that there are just two coroutines involved =

A simpler solution is possible, but to understand it, you first have to
understand how the threads and coroutines work together.  Let me
recapitulate.

In old QEMU, all monitors run in the main thread's main loop, and
together execute one command after the other.  @cur_mon was a global
variable, to be accessed only by the main thread.

Commit 62aa1d887f "monitor: Fix unsafe sharing of @cur_mon among
threads" (v3.0.0) made @cur_mon thread-local.  "Fix" was a bit of an
overstatement; no unsafe access was known.

The OOB work moved a part of the QMP monitor work from the main loop
into @mon_iothread.  @mon_iothread sends commands to the main thread for
execution, except for commands executed "out-of-band".

This series moves the main thread's QMP command dispatch into coroutine
@qmp_dispatcher_co.  Commands that aren't coroutine-capable get
dispatched to a one-shot bottom half, also in the main thread.

The series modifies the main thread's HMP command dispatch to wrap
execution of each coroutine-capable command in a newly created
coroutine.

We have:

* OOB commands running in @mon_iothread, outside coroutine context

* Coroutine-incapable QMP commands running in the main thread, outside
  coroutine context (detail: in a bottom half)

* Coroutine-incapable HMP commands running in the main thread, outside
  coroutine context

* Coroutine-capable QMP commands running in the main thread, in
  coroutine @qmp_dispatcher_co

* Coroutine-capable HMP commands running in the main thread, in a
  coroutine created just for the command

* At most one non-OOB command is executing at any time.

Let's ignore HMP for now.  Observe:

* As long as there is just one @qmp_dispatcher_co, there is just one
  current monitor for coroutine-capable QMP commands at any time.  It
  can therefore be stored in a simple global variable
  @qmp_dispatcher_co_mon.

* For the coroutine-incapable commands, thread-local variable @cur_mon
  suffices.

* If qemu_coroutine_self() == qmp_dispatcher_co, the current monitor is
  @qmp_dispatcher_co_mon.  Else it's @cur_mon.

To extend this to HMP, we have to make the handle_hmp_command()'s local
variable @co a global one.

PRO:

* Stays off the coroutine switch hot path (by staying off coroutine code
  entirely).

* Simple code.

CON

* It's a one-off (but at least it's confined to monitor.c).

* The argument behind the code is less than simple (see above).

* Should our monitor coroutines multiply, say because we pull off
  executing (some) in-band commands in monitor I/O thread(s), the
  solution falls apart.

I have prototype code that passes basic testing.

Opinions?

I'll post my two prototypes shortly.


[*] In theory, we could yield to a coroutine that is executing another
monitor's monitor command.  In practice, we haven't implemented that.

[PATCH] Simple & stupid coroutine-aware monitor_cur()

Posted by Markus Armbruster 5 years, 6 months ago

This is just a sketch.  It's incomplete, needs comments and a real
commit message.

Support for "[PATCH v6 09/12] hmp: Add support for coroutine command
handlers" is missing.  Marked FIXME.

As is, it goes on top of Kevin's series.  It is meant to be squashed
into PATCH 06, except for the FIXME, which needs to be resolved in PATCH
09 instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 monitor/monitor.c | 35 +++++++++++++++--------------------
 1 file changed, 15 insertions(+), 20 deletions(-)

diff --git a/monitor/monitor.c b/monitor/monitor.c
index 50fb5b20d3..8601340285 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -82,38 +82,34 @@ bool qmp_dispatcher_co_shutdown;
  */
 bool qmp_dispatcher_co_busy;
 
-/*
- * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
- * monitor_destroyed.
- */
+/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
 QemuMutex monitor_lock;
 static GHashTable *monitor_qapi_event_state;
-static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
 
 MonitorList mon_list;
 int mon_refcount;
 static bool monitor_destroyed;
 
+static Monitor **monitor_curp(Coroutine *co)
+{
+    static __thread Monitor *thread_local_mon;
+    static Monitor *qmp_dispatcher_co_mon;
+
+    if (qemu_coroutine_self() == qmp_dispatcher_co) {
+        return &qmp_dispatcher_co_mon;
+    }
+    /* FIXME the coroutine hidden in handle_hmp_command() */
+    return &thread_local_mon;
+}
+
 Monitor *monitor_cur(void)
 {
-    Monitor *mon;
-
-    qemu_mutex_lock(&monitor_lock);
-    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
-    qemu_mutex_unlock(&monitor_lock);
-
-    return mon;
+    return *monitor_curp(qemu_coroutine_self());
 }
 
 void monitor_set_cur(Coroutine *co, Monitor *mon)
 {
-    qemu_mutex_lock(&monitor_lock);
-    if (mon) {
-        g_hash_table_replace(coroutine_mon, co, mon);
-    } else {
-        g_hash_table_remove(coroutine_mon, co);
-    }
-    qemu_mutex_unlock(&monitor_lock);
+    *monitor_curp(co) = mon;
 }
 
 /**
@@ -666,7 +662,6 @@ void monitor_init_globals_core(void)
 {
     monitor_qapi_event_init();
     qemu_mutex_init(&monitor_lock);
-    coroutine_mon = g_hash_table_new(NULL, NULL);
 
     /*
      * The dispatcher BH must run in the main loop thread, since we
-- 
2.26.2

Re: [PATCH] Simple & stupid coroutine-aware monitor_cur()

Posted by Kevin Wolf 5 years, 6 months ago

On 07.08.2020 at 15:27, Markus Armbruster wrote:
> This is just a sketch.  It's incomplete, needs comments and a real
> commit message.
> 
> Support for "[PATCH v6 09/12] hmp: Add support for coroutine command
> handlers" is missing.  Marked FIXME.
> 
> As is, it goes on top of Kevin's series.  It is meant to be squashed
> into PATCH 06, except for the FIXME, which needs to be resolved in PATCH
> 09 instead.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  monitor/monitor.c | 35 +++++++++++++++--------------------
>  1 file changed, 15 insertions(+), 20 deletions(-)
> 
> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 50fb5b20d3..8601340285 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -82,38 +82,34 @@ bool qmp_dispatcher_co_shutdown;
>   */
>  bool qmp_dispatcher_co_busy;
>  
> -/*
> - * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> - * monitor_destroyed.
> - */
> +/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
>  QemuMutex monitor_lock;
>  static GHashTable *monitor_qapi_event_state;
> -static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>  
>  MonitorList mon_list;
>  int mon_refcount;
>  static bool monitor_destroyed;
>  
> +static Monitor **monitor_curp(Coroutine *co)
> +{
> +    static __thread Monitor *thread_local_mon;
> +    static Monitor *qmp_dispatcher_co_mon;
> +
> +    if (qemu_coroutine_self() == qmp_dispatcher_co) {
> +        return &qmp_dispatcher_co_mon;
> +    }
> +    /* FIXME the coroutine hidden in handle_hmp_command() */
> +    return &thread_local_mon;
> +}

Is thread_local_mon supposed to ever be set? The only callers of
monitor_set_cur() are the HMP and QMP dispatchers, which will return
something different.

So should we return NULL instead of thread_local_mon...

>  Monitor *monitor_cur(void)
>  {
> -    Monitor *mon;
> -
> -    qemu_mutex_lock(&monitor_lock);
> -    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> -    qemu_mutex_unlock(&monitor_lock);
> -
> -    return mon;
> +    return *monitor_curp(qemu_coroutine_self());
>  }

...and return NULL here if monitor_curp() returned NULL...

>  void monitor_set_cur(Coroutine *co, Monitor *mon)
>  {
> -    qemu_mutex_lock(&monitor_lock);
> -    if (mon) {
> -        g_hash_table_replace(coroutine_mon, co, mon);
> -    } else {
> -        g_hash_table_remove(coroutine_mon, co);
> -    }
> -    qemu_mutex_unlock(&monitor_lock);
> +    *monitor_curp(co) = mon;

...and assert(monitor_curp(co) != NULL) here?

This approach looks workable, though the implementation of
monitor_curp() feels a bit brittle. The code is not significantly
simpler than the hash table based approach, but the assumptions it makes
are a bit more hidden.

Saving the locks is more of a theoretical improvement because all callers
are slow paths anyway.

Kevin

Re: [PATCH] Simple & stupid coroutine-aware monitor_cur()

Posted by Markus Armbruster 5 years, 5 months ago

Kevin Wolf <kwolf@redhat.com> writes:

> On 07.08.2020 at 15:27, Markus Armbruster wrote:
>> This is just a sketch.  It's incomplete, needs comments and a real
>> commit message.
>> 
>> Support for "[PATCH v6 09/12] hmp: Add support for coroutine command
>> handlers" is missing.  Marked FIXME.
>> 
>> As is, it goes on top of Kevin's series.  It is meant to be squashed
>> into PATCH 06, except for the FIXME, which needs to be resolved in PATCH
>> 09 instead.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  monitor/monitor.c | 35 +++++++++++++++--------------------
>>  1 file changed, 15 insertions(+), 20 deletions(-)
>> 
>> diff --git a/monitor/monitor.c b/monitor/monitor.c
>> index 50fb5b20d3..8601340285 100644
>> --- a/monitor/monitor.c
>> +++ b/monitor/monitor.c
>> @@ -82,38 +82,34 @@ bool qmp_dispatcher_co_shutdown;
>>   */
>>  bool qmp_dispatcher_co_busy;
>>  
>> -/*
>> - * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
>> - * monitor_destroyed.
>> - */
>> +/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
>>  QemuMutex monitor_lock;
>>  static GHashTable *monitor_qapi_event_state;
>> -static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>>  
>>  MonitorList mon_list;
>>  int mon_refcount;
>>  static bool monitor_destroyed;
>>  
>> +static Monitor **monitor_curp(Coroutine *co)
>> +{
>> +    static __thread Monitor *thread_local_mon;
>> +    static Monitor *qmp_dispatcher_co_mon;
>> +
>> +    if (qemu_coroutine_self() == qmp_dispatcher_co) {
>> +        return &qmp_dispatcher_co_mon;
>> +    }
>> +    /* FIXME the coroutine hidden in handle_hmp_command() */
>> +    return &thread_local_mon;
>> +}
>
> Is thread_local_mon supposed to ever be set? The only callers of
> monitor_set_cur() are the HMP and QMP dispatchers, which will return
> something different.

OOB commands are executed in @mon_iothread, outside coroutine context.
qmp_dispatch() calls monitor_set_cur(), which sets thread_local_mon
then.

Since there is just one @mon_iothread, a @global_mon without __thread
would do, but I don't see a need to exploit that here.

> So should we return NULL instead of thread_local_mon...
>
>>  Monitor *monitor_cur(void)
>>  {
>> -    Monitor *mon;
>> -
>> -    qemu_mutex_lock(&monitor_lock);
>> -    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
>> -    qemu_mutex_unlock(&monitor_lock);
>> -
>> -    return mon;
>> +    return *monitor_curp(qemu_coroutine_self());
>>  }
>
> ...and return NULL here if monitor_curp() returned NULL...
>
>>  void monitor_set_cur(Coroutine *co, Monitor *mon)
>>  {
>> -    qemu_mutex_lock(&monitor_lock);
>> -    if (mon) {
>> -        g_hash_table_replace(coroutine_mon, co, mon);
>> -    } else {
>> -        g_hash_table_remove(coroutine_mon, co);
>> -    }
>> -    qemu_mutex_unlock(&monitor_lock);
>> +    *monitor_curp(co) = mon;
>
> ...and assert(monitor_curp(co) != NULL) here?
>
> This approach looks workable, though the implementation of
> monitor_curp() feels a bit brittle. The code is not significantly
> simpler than the hash table based approach, but the assumptions it makes
> are a bit more hidden.
>
> Saving the locks is more a theoretical improvement because all callers
> are slow paths anyway.

The hash table only ever has three keys: qmp_dispatcher_co, the
coroutine hidden in handle_hmp_command(), and mon_iothread's leader (not
in coroutine context).

My version replaces the hash table by three pointer variables (two in
the sketch above, because I didn't implement the third).

You point out my code relies on an argument about which coroutines can
execute commands.  True.  But I have to make that argument anyway to
understand how the coroutine-enabled monitor works.

On the other hand, it doesn't rely on an argument about the consistency
of the hash table with the coroutines.
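
For illustration, the three-pointer version with the HMP case filled in might look like the sketch below. It is not the posted patch. The names hmp_command_co and hmp_command_co_mon are hypothetical and stand for handle_hmp_command()'s local @co made global plus its monitor slot. monitor_cur_for() is also made up: it takes the coroutine as an argument so the sketch needs no qemu_coroutine_self(). The coroutine is assumed never to be null, since qemu_coroutine_self() returns the thread's leader outside coroutine context.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Coroutine Coroutine;
typedef struct Monitor Monitor;

static Coroutine *qmp_dispatcher_co;    /* set once at startup */
static Coroutine *hmp_command_co;       /* hypothetical global @co */

/* Map a coroutine to the variable holding its current monitor. */
static Monitor **monitor_curp(Coroutine *co)
{
    static __thread Monitor *thread_local_mon;
    static Monitor *qmp_dispatcher_co_mon;
    static Monitor *hmp_command_co_mon;

    assert(co);
    if (co == qmp_dispatcher_co) {
        return &qmp_dispatcher_co_mon;
    }
    if (co == hmp_command_co) {
        return &hmp_command_co_mon;
    }
    /* Anything else runs outside the two monitor coroutines. */
    return &thread_local_mon;
}

/* Like monitor_cur(), but with the coroutine passed in explicitly. */
static Monitor *monitor_cur_for(Coroutine *co)
{
    return *monitor_curp(co);
}

static void monitor_set_cur(Coroutine *co, Monitor *mon)
{
    *monitor_curp(co) = mon;
}
```

The argument about which coroutines can execute commands is encoded in the two comparisons; a third kind of monitor coroutine would silently fall through to the thread-local slot.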

[PATCH] Coroutine-aware monitor_cur() with coroutine-specific data

Posted by Markus Armbruster 5 years, 6 months ago

This is just a sketch.  It needs comments and a real commit message.

As is, it goes on top of Kevin's series.  It is meant to be squashed
into PATCH 06.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
---
 include/qemu/coroutine.h     |  4 ++++
 include/qemu/coroutine_int.h |  2 ++
 monitor/monitor.c            | 36 +++++++++++++++---------------------
 util/qemu-coroutine.c        | 20 ++++++++++++++++++++
 4 files changed, 41 insertions(+), 21 deletions(-)

diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
index dfd261c5b1..11da47092c 100644
--- a/include/qemu/coroutine.h
+++ b/include/qemu/coroutine.h
@@ -65,6 +65,10 @@ typedef void coroutine_fn CoroutineEntry(void *opaque);
  */
 Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque);
 
+Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
+                                              void *opaque, size_t storage);
+void *qemu_coroutine_local_storage(Coroutine *co);
+
 /**
  * Transfer control to a coroutine
  */
diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
index bd6b0468e1..7d7865a02f 100644
--- a/include/qemu/coroutine_int.h
+++ b/include/qemu/coroutine_int.h
@@ -41,6 +41,8 @@ struct Coroutine {
     void *entry_arg;
     Coroutine *caller;
 
+    void *coroutine_local_storage;
+
     /* Only used when the coroutine has terminated.  */
     QSLIST_ENTRY(Coroutine) pool_next;
 
diff --git a/monitor/monitor.c b/monitor/monitor.c
index 50fb5b20d3..047a8fb380 100644
--- a/monitor/monitor.c
+++ b/monitor/monitor.c
@@ -82,38 +82,32 @@ bool qmp_dispatcher_co_shutdown;
  */
 bool qmp_dispatcher_co_busy;
 
-/*
- * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
- * monitor_destroyed.
- */
+/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
 QemuMutex monitor_lock;
 static GHashTable *monitor_qapi_event_state;
-static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
 
 MonitorList mon_list;
 int mon_refcount;
 static bool monitor_destroyed;
 
+static Monitor **monitor_curp(Coroutine *co)
+{
+    static __thread Monitor *global_cur_mon;
+
+    if (co == qmp_dispatcher_co) {
+        return qemu_coroutine_local_storage(co);
+    }
+    return &global_cur_mon;
+}
+
 Monitor *monitor_cur(void)
 {
-    Monitor *mon;
-
-    qemu_mutex_lock(&monitor_lock);
-    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
-    qemu_mutex_unlock(&monitor_lock);
-
-    return mon;
+    return *monitor_curp(qemu_coroutine_self());
 }
 
 void monitor_set_cur(Coroutine *co, Monitor *mon)
 {
-    qemu_mutex_lock(&monitor_lock);
-    if (mon) {
-        g_hash_table_replace(coroutine_mon, co, mon);
-    } else {
-        g_hash_table_remove(coroutine_mon, co);
-    }
-    qemu_mutex_unlock(&monitor_lock);
+    *monitor_curp(co) = mon;
 }
 
 /**
@@ -666,14 +660,14 @@ void monitor_init_globals_core(void)
 {
     monitor_qapi_event_init();
     qemu_mutex_init(&monitor_lock);
-    coroutine_mon = g_hash_table_new(NULL, NULL);
 
     /*
      * The dispatcher BH must run in the main loop thread, since we
      * have commands assuming that context.  It would be nice to get
      * rid of those assumptions.
      */
-    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
+    qmp_dispatcher_co = qemu_coroutine_create_with_storage(
+        monitor_qmp_dispatcher_co, NULL, sizeof(Monitor **));
     atomic_mb_set(&qmp_dispatcher_co_busy, true);
     aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
 }
diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
index c3caa6c770..87bf7f0fc0 100644
--- a/util/qemu-coroutine.c
+++ b/util/qemu-coroutine.c
@@ -81,8 +81,28 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
     return co;
 }
 
+Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
+                                              void *opaque, size_t storage)
+{
+    Coroutine *co = qemu_coroutine_create(entry, opaque);
+
+    if (!co) {
+        return NULL;
+    }
+
+    co->coroutine_local_storage = g_malloc0(storage);
+    return co;
+}
+
+void *qemu_coroutine_local_storage(Coroutine *co)
+{
+    return co->coroutine_local_storage;
+}
+
 static void coroutine_delete(Coroutine *co)
 {
+    g_free(co->coroutine_local_storage);
+    co->coroutine_local_storage = NULL;
     co->caller = NULL;
 
     if (CONFIG_COROUTINE_POOL) {
-- 
2.26.2

Re: [PATCH] Coroutine-aware monitor_cur() with coroutine-specific data

Posted by Kevin Wolf 5 years, 6 months ago

On 07.08.2020 at 15:29, Markus Armbruster wrote:
> This is just a sketch.  It needs comments and a real commit message.
> 
> As is, it goes on top of Kevin's series.  It is meant to be squashed
> into PATCH 06.
> 
> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> ---
>  include/qemu/coroutine.h     |  4 ++++
>  include/qemu/coroutine_int.h |  2 ++
>  monitor/monitor.c            | 36 +++++++++++++++---------------------
>  util/qemu-coroutine.c        | 20 ++++++++++++++++++++
>  4 files changed, 41 insertions(+), 21 deletions(-)
> 
> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> index dfd261c5b1..11da47092c 100644
> --- a/include/qemu/coroutine.h
> +++ b/include/qemu/coroutine.h
> @@ -65,6 +65,10 @@ typedef void coroutine_fn CoroutineEntry(void *opaque);
>   */
>  Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque);
>  
> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
> +                                              void *opaque, size_t storage);
> +void *qemu_coroutine_local_storage(Coroutine *co);
> +
>  /**
>   * Transfer control to a coroutine
>   */
> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> index bd6b0468e1..7d7865a02f 100644
> --- a/include/qemu/coroutine_int.h
> +++ b/include/qemu/coroutine_int.h
> @@ -41,6 +41,8 @@ struct Coroutine {
>      void *entry_arg;
>      Coroutine *caller;
>  
> +    void *coroutine_local_storage;
> +
>      /* Only used when the coroutine has terminated.  */
>      QSLIST_ENTRY(Coroutine) pool_next;

This increases the size of Coroutine objects typically by 8 bytes and
shifts the following fields by the same amount. On my x86_64 build, we
have exactly those 8 bytes left in CoroutineUContext until a new
cacheline would start. With different CONFIG_* settings, it could be the
change that increases the size to a new cacheline. No idea what this
looks like on other architectures.

Does this or the shifting of fields matter for performance? I don't
know. It might even be unlikely. But cache effects are hard to predict
and not wanting to do the work of proving that it's indeed harmless is
one of the reasons why for the slow paths in question I preferred a
solution that doesn't touch the coroutine core at all.

> diff --git a/monitor/monitor.c b/monitor/monitor.c
> index 50fb5b20d3..047a8fb380 100644
> --- a/monitor/monitor.c
> +++ b/monitor/monitor.c
> @@ -82,38 +82,32 @@ bool qmp_dispatcher_co_shutdown;
>   */
>  bool qmp_dispatcher_co_busy;
>  
> -/*
> - * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> - * monitor_destroyed.
> - */
> +/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
>  QemuMutex monitor_lock;
>  static GHashTable *monitor_qapi_event_state;
> -static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>  
>  MonitorList mon_list;
>  int mon_refcount;
>  static bool monitor_destroyed;
>  
> +static Monitor **monitor_curp(Coroutine *co)
> +{
> +    static __thread Monitor *global_cur_mon;
> +
> +    if (co == qmp_dispatcher_co) {
> +        return qemu_coroutine_local_storage(co);
> +    }
> +    return &global_cur_mon;
> +}

Like the other patch, this needs to be extended for HMP. global_cur_mon
is never meant to be set.

The solution fails as soon as we have more than a single monitor
coroutine running at the same time because it relies on
qmp_dispatcher_co. In this respect, it makes the same assumptions as the
simple hack.

Only knowing that qmp_dispatcher_co is always created with storage
containing a Monitor** makes this safe.

>  Monitor *monitor_cur(void)
>  {
> -    Monitor *mon;
> -
> -    qemu_mutex_lock(&monitor_lock);
> -    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> -    qemu_mutex_unlock(&monitor_lock);
> -
> -    return mon;
> +    return *monitor_curp(qemu_coroutine_self());
>  }
>  
>  void monitor_set_cur(Coroutine *co, Monitor *mon)
>  {
> -    qemu_mutex_lock(&monitor_lock);
> -    if (mon) {
> -        g_hash_table_replace(coroutine_mon, co, mon);
> -    } else {
> -        g_hash_table_remove(coroutine_mon, co);
> -    }
> -    qemu_mutex_unlock(&monitor_lock);
> +    *monitor_curp(co) = mon;
>  }
>  
>  /**
> @@ -666,14 +660,14 @@ void monitor_init_globals_core(void)
>  {
>      monitor_qapi_event_init();
>      qemu_mutex_init(&monitor_lock);
> -    coroutine_mon = g_hash_table_new(NULL, NULL);
>  
>      /*
>       * The dispatcher BH must run in the main loop thread, since we
>       * have commands assuming that context.  It would be nice to get
>       * rid of those assumptions.
>       */
> -    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
> +    qmp_dispatcher_co = qemu_coroutine_create_with_storage(
> +        monitor_qmp_dispatcher_co, NULL, sizeof(Monitor **));
>      atomic_mb_set(&qmp_dispatcher_co_busy, true);
>      aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>  }
> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> index c3caa6c770..87bf7f0fc0 100644
> --- a/util/qemu-coroutine.c
> +++ b/util/qemu-coroutine.c
> @@ -81,8 +81,28 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
>      return co;
>  }
>  
> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
> +                                              void *opaque, size_t storage)
> +{
> +    Coroutine *co = qemu_coroutine_create(entry, opaque);
> +
> +    if (!co) {
> +        return NULL;
> +    }
> +
> +    co->coroutine_local_storage = g_malloc0(storage);
> +    return co;
> +}

As the code above shows, this interface is only useful if you can
identify the coroutine. It cannot be used in code that didn't create the
current coroutine because then it can't know whether or not the
coroutine has coroutine local storage, and if it has, what its structure
is.

For a supposedly generic solution, I think this is a bit weak.
Effectively, this might be a one-off solution in disguise because
it's a big restriction on the possible use cases.

> +void *qemu_coroutine_local_storage(Coroutine *co)
> +{
> +    return co->coroutine_local_storage;
> +}
> +
>  static void coroutine_delete(Coroutine *co)
>  {
> +    g_free(co->coroutine_local_storage);
> +    co->coroutine_local_storage = NULL;
>      co->caller = NULL;
>  
>      if (CONFIG_COROUTINE_POOL) {

Your list of pros/cons didn't mention coroutine creation/deletion as a
hot path at all (which it is, we have one coroutine per request).

You leave qemu_coroutine_create() untouched (except indirectly by a
larger g_malloc0() in the non-pooled case, which is negligible) and I
assume that g_free(NULL) is cheap, so at least this is probably as good
as it gets for something integrated in the coroutine core. Maybe an
explicit if (co->coroutine_local_storage) would improve it slightly.
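
The explicit check could look roughly like this. It is a sketch on a cut-down, hypothetical struct Coroutine, using plain calloc()/free() in place of g_malloc0()/g_free(); attach_local_storage() and release_local_storage() are invented helper names, not functions from the patch.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical cut-down struct Coroutine with only the new member. */
typedef struct Coroutine {
    void *coroutine_local_storage;
} Coroutine;

/* Attach zeroed storage of @size bytes; 0 on success, -1 on failure. */
static int attach_local_storage(Coroutine *co, size_t size)
{
    assert(size > 0);
    co->coroutine_local_storage = calloc(1, size);
    return co->coroutine_local_storage ? 0 : -1;
}

/* The part of coroutine_delete() that deals with the storage. */
static void release_local_storage(Coroutine *co)
{
    if (co->coroutine_local_storage) {  /* common case: nothing to do */
        free(co->coroutine_local_storage);
        co->coroutine_local_storage = NULL;
    }
}
```

Whether the branch beats an unconditional free plus store would have to be measured; the sketch only shows where the check would go.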

Kevin

Re: [PATCH] Coroutine-aware monitor_cur() with coroutine-specific data

Posted by Markus Armbruster 5 years, 5 months ago

Kevin Wolf <kwolf@redhat.com> writes:

> On 07.08.2020 at 15:29, Markus Armbruster wrote:
>> This is just a sketch.  It needs comments and a real commit message.
>> 
>> As is, it goes on top of Kevin's series.  It is meant to be squashed
>> into PATCH 06.
>> 
>> Signed-off-by: Markus Armbruster <armbru@redhat.com>
>> ---
>>  include/qemu/coroutine.h     |  4 ++++
>>  include/qemu/coroutine_int.h |  2 ++
>>  monitor/monitor.c            | 36 +++++++++++++++---------------------
>>  util/qemu-coroutine.c        | 20 ++++++++++++++++++++
>>  4 files changed, 41 insertions(+), 21 deletions(-)
>> 
>> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
>> index dfd261c5b1..11da47092c 100644
>> --- a/include/qemu/coroutine.h
>> +++ b/include/qemu/coroutine.h
>> @@ -65,6 +65,10 @@ typedef void coroutine_fn CoroutineEntry(void *opaque);
>>   */
>>  Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque);
>>  
>> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
>> +                                              void *opaque, size_t storage);
>> +void *qemu_coroutine_local_storage(Coroutine *co);
>> +
>>  /**
>>   * Transfer control to a coroutine
>>   */
>> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
>> index bd6b0468e1..7d7865a02f 100644
>> --- a/include/qemu/coroutine_int.h
>> +++ b/include/qemu/coroutine_int.h
>> @@ -41,6 +41,8 @@ struct Coroutine {
>>      void *entry_arg;
>>      Coroutine *caller;
>>  
>> +    void *coroutine_local_storage;
>> +
>>      /* Only used when the coroutine has terminated.  */
>>      QSLIST_ENTRY(Coroutine) pool_next;
>
> This increases the size of Coroutine objects typically by 8 bytes and
> shifts the following fields by the same amount. On my x86_64 build, we
> have exactly those 8 bytes left in CoroutineUContext until a new
> cacheline would start. With different CONFIG_* settings, it could be the
> change that increases the size to a new cacheline. No idea what this
> looks like on other architectures.
>
> Does this or the shifting of fields matter for performance? I don't
> know. It might even be unlikely. But cache effects are hard to predict
> and not wanting to do the work of proving that it's indeed harmless is
> one of the reasons why for the slow paths in question I preferred a
> solution that doesn't touch the coroutine core at all.

Point taken.

Possible mitigation: add at the end rather than in the middle.
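
A small sketch of what "at the end" buys, using hypothetical three-member layouts rather than the real struct Coroutine (which has more members). The struct names are invented, and all members are plain pointers so that no padding enters the picture.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down layout before the change. */
struct CoBefore {
    void *entry_arg;
    void *caller;
    void *pool_next;
};

/* New member in the middle, as in the sketch patch. */
struct CoMiddle {
    void *entry_arg;
    void *caller;
    void *coroutine_local_storage;
    void *pool_next;
};

/* New member appended instead. */
struct CoEnd {
    void *entry_arg;
    void *caller;
    void *pool_next;
    void *coroutine_local_storage;
};

/* In the middle: every later member moves by one pointer. */
static_assert(offsetof(struct CoMiddle, pool_next) ==
              offsetof(struct CoBefore, pool_next) + sizeof(void *),
              "pool_next shifted");

/* At the end: the existing members keep their offsets... */
static_assert(offsetof(struct CoEnd, pool_next) ==
              offsetof(struct CoBefore, pool_next),
              "pool_next unchanged");

/* ...but the struct grows by one pointer either way. */
static_assert(sizeof(struct CoEnd) == sizeof(struct CoMiddle),
              "same total size");
```

So appending only addresses the shifting of existing fields. The size increase remains, and a structure that embeds struct Coroutine, such as the CoroutineUContext Kevin mentions, would presumably still see its own later fields move.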

>> diff --git a/monitor/monitor.c b/monitor/monitor.c
>> index 50fb5b20d3..047a8fb380 100644
>> --- a/monitor/monitor.c
>> +++ b/monitor/monitor.c
>> @@ -82,38 +82,32 @@ bool qmp_dispatcher_co_shutdown;
>>   */
>>  bool qmp_dispatcher_co_busy;
>>  
>> -/*
>> - * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
>> - * monitor_destroyed.
>> - */
>> +/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
>>  QemuMutex monitor_lock;
>>  static GHashTable *monitor_qapi_event_state;
>> -static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
>>  
>>  MonitorList mon_list;
>>  int mon_refcount;
>>  static bool monitor_destroyed;
>>  
>> +static Monitor **monitor_curp(Coroutine *co)
>> +{
>> +    static __thread Monitor *global_cur_mon;
>> +
>> +    if (co == qmp_dispatcher_co) {
>> +        return qemu_coroutine_local_storage(co);
>> +    }
>> +    return &global_cur_mon;
>> +}
>
> Like the other patch, this needs to be extended for HMP. global_cur_mon
> is never meant to be set.

It is, for OOB commands.

> The solution fails as soon as we have more than a single monitor
> coroutine running at the same time because it relies on
> qmp_dispatcher_co.

Yes, but pretty much everything below handle_qmp_command() falls apart
then.  Remembering to update monitor_curp() would be the least of my
worries :)

>                    In this respect, it makes the same assumptions as the
> simple hack.
>
> Only knowing that qmp_dispatcher_co is always created with storage
> containing a Monitor** makes this safe.

Correct.

>>  Monitor *monitor_cur(void)
>>  {
>> -    Monitor *mon;
>> -
>> -    qemu_mutex_lock(&monitor_lock);
>> -    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
>> -    qemu_mutex_unlock(&monitor_lock);
>> -
>> -    return mon;
>> +    return *monitor_curp(qemu_coroutine_self());
>>  }
>>  
>>  void monitor_set_cur(Coroutine *co, Monitor *mon)
>>  {
>> -    qemu_mutex_lock(&monitor_lock);
>> -    if (mon) {
>> -        g_hash_table_replace(coroutine_mon, co, mon);
>> -    } else {
>> -        g_hash_table_remove(coroutine_mon, co);
>> -    }
>> -    qemu_mutex_unlock(&monitor_lock);
>> +    *monitor_curp(co) = mon;
>>  }
>>  
>>  /**
>> @@ -666,14 +660,14 @@ void monitor_init_globals_core(void)
>>  {
>>      monitor_qapi_event_init();
>>      qemu_mutex_init(&monitor_lock);
>> -    coroutine_mon = g_hash_table_new(NULL, NULL);
>>  
>>      /*
>>       * The dispatcher BH must run in the main loop thread, since we
>>       * have commands assuming that context.  It would be nice to get
>>       * rid of those assumptions.
>>       */
>> -    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
>> +    qmp_dispatcher_co = qemu_coroutine_create_with_storage(
>> +        monitor_qmp_dispatcher_co, NULL, sizeof(Monitor **));
>>      atomic_mb_set(&qmp_dispatcher_co_busy, true);
>>      aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
>>  }
>> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
>> index c3caa6c770..87bf7f0fc0 100644
>> --- a/util/qemu-coroutine.c
>> +++ b/util/qemu-coroutine.c
>> @@ -81,8 +81,28 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
>>      return co;
>>  }
>>  
>> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
>> +                                              void *opaque, size_t storage)
>> +{
>> +    Coroutine *co = qemu_coroutine_create(entry, opaque);
>> +
>> +    if (!co) {
>> +        return NULL;
>> +    }
>> +
>> +    co->coroutine_local_storage = g_malloc0(storage);
>> +    return co;
>> +}
>
> As the code above shows, this interface is only useful if you can
> identify the coroutine. It cannot be used in code that didn't create the
> current coroutine because then it can't know whether or not the
> coroutine has coroutine local storage, and if it has, what its structure
> is.
>
> For a supposedly generic solution, I think this is a bit weak.

Yes, that's fair.

The solution Daniel proposed makes the weakness more explicit:
instead of relying on "coroutine was created with this coroutine-local
storage", we'd rely on "coroutine_getspecific(key) does not fail".  It
can fail only if coroutine_setspecific(key, ...) was not called.  Not
much better in practice.

> Effectively, this might be a one-off solution in disguise because
> it's a big restriction on the possible use cases.

Daniel's solution is basically pthread_getspecific() for coroutines,
with the keys dumbed down.

If pthread_getspecific() was good enough for pthreads...

Well, it wasn't, or rather it was only because something better could
not be had with just a library, without toolchain support.  And that's
where we are with coroutines.

>> +void *qemu_coroutine_local_storage(Coroutine *co)
>> +{
>> +    return co->coroutine_local_storage;
>> +}
>> +
>>  static void coroutine_delete(Coroutine *co)
>>  {
>> +    g_free(co->coroutine_local_storage);
>> +    co->coroutine_local_storage = NULL;
>>      co->caller = NULL;
>>  
>>      if (CONFIG_COROUTINE_POOL) {
>
> Your list of pros/cons didn't mention coroutine creation/deletion as a
> hot path at all (which it is, we have one coroutine per request).

I did not expect coroutine creation / deletion to be a hot path.

It is not a hot path for QMP, because QMP is not a hot path.

I'm ready to accept the proposition that it's a hot path elsewhere.

> You leave qemu_coroutine_create() untouched (except indirectly by a
> larger g_malloc0() in the non-pooled case, which is negligible) and I
> assume that g_free(NULL) is cheap, so at least this is probably as good
> as it gets for something integrated in the coroutine core. Maybe an
> explicit if (co->coroutine_local_storage) would improve it slightly.
>
> Kevin

Re: [PATCH] Coroutine-aware monitor_cur() with coroutine-specific data

Posted by Kevin Wolf 5 years, 5 months ago

Am 26.08.2020 um 14:40 hat Markus Armbruster geschrieben:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > Am 07.08.2020 um 15:29 hat Markus Armbruster geschrieben:
> >> This is just a sketch.  It needs comments and a real commit message.
> >> 
> >> As is, it goes on top of Kevin's series.  It is meant to be squashed
> >> into PATCH 06.
> >> 
> >> Signed-off-by: Markus Armbruster <armbru@redhat.com>
> >> ---
> >>  include/qemu/coroutine.h     |  4 ++++
> >>  include/qemu/coroutine_int.h |  2 ++
> >>  monitor/monitor.c            | 36 +++++++++++++++---------------------
> >>  util/qemu-coroutine.c        | 20 ++++++++++++++++++++
> >>  4 files changed, 41 insertions(+), 21 deletions(-)
> >> 
> >> diff --git a/include/qemu/coroutine.h b/include/qemu/coroutine.h
> >> index dfd261c5b1..11da47092c 100644
> >> --- a/include/qemu/coroutine.h
> >> +++ b/include/qemu/coroutine.h
> >> @@ -65,6 +65,10 @@ typedef void coroutine_fn CoroutineEntry(void *opaque);
> >>   */
> >>  Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque);
> >>  
> >> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
> >> +                                              void *opaque, size_t storage);
> >> +void *qemu_coroutine_local_storage(Coroutine *co);
> >> +
> >>  /**
> >>   * Transfer control to a coroutine
> >>   */
> >> diff --git a/include/qemu/coroutine_int.h b/include/qemu/coroutine_int.h
> >> index bd6b0468e1..7d7865a02f 100644
> >> --- a/include/qemu/coroutine_int.h
> >> +++ b/include/qemu/coroutine_int.h
> >> @@ -41,6 +41,8 @@ struct Coroutine {
> >>      void *entry_arg;
> >>      Coroutine *caller;
> >>  
> >> +    void *coroutine_local_storage;
> >> +
> >>      /* Only used when the coroutine has terminated.  */
> >>      QSLIST_ENTRY(Coroutine) pool_next;
> >
> > This increases the size of Coroutine objects typically by 8 bytes and
> > shifts the following fields by the same amount. On my x86_64 build, we
> > have exactly those 8 bytes left in CoroutineUContext until a new
> > cacheline would start. With different CONFIG_* settings, it could be the
> > change that increases the size to a new cacheline. No idea what this
> > looks like on other architectures.
> >
> > Does this or the shifting of fields matter for performance? I don't
> > know. It might even be unlikely. But cache effects are hard to predict
> > and not wanting to do the work of proving that it's indeed harmless is
> > one of the reasons why for the slow paths in question I preferred a
> > solution that doesn't touch the coroutine core at all.
> 
> Point taken.
> 
> Possible mitigation: add at the end rather than in the middle.

Doesn't work: This is a struct that is embedded at the start of
CoroutineUContext, so while you can move it down a bit, you'll never get
to the end of the actual struct used at runtime.
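
A minimal sketch of the layout effect, using made-up stand-in structs
(CoBase, CoBackend and their fields are hypothetical, not the real
Coroutine / CoroutineUContext definitions): appending a pointer to the
embedded struct still moves every field of the struct that embeds it.

```c
#include <stddef.h>

/* Hypothetical stand-in for the embedded base struct. */
typedef struct CoBase {
    void *entry_arg;
    void *caller;
} CoBase;

/* Same base struct with one pointer appended at its end. */
typedef struct CoBaseGrown {
    void *entry_arg;
    void *caller;
    void *local_storage;
} CoBaseGrown;

/* Hypothetical backend struct that embeds the base as first member. */
typedef struct CoBackend {
    CoBase base;
    void *stack;
} CoBackend;

typedef struct CoBackendGrown {
    CoBaseGrown base;
    void *stack;
} CoBackendGrown;

/* Number of bytes by which the backend's own field is shifted. */
static size_t backend_field_shift(void)
{
    return offsetof(CoBackendGrown, stack) - offsetof(CoBackend, stack);
}
```

The new member sits at the end of the base struct, yet the backend's
own field still moves by the size of one pointer.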

> >> diff --git a/monitor/monitor.c b/monitor/monitor.c
> >> index 50fb5b20d3..047a8fb380 100644
> >> --- a/monitor/monitor.c
> >> +++ b/monitor/monitor.c
> >> @@ -82,38 +82,32 @@ bool qmp_dispatcher_co_shutdown;
> >>   */
> >>  bool qmp_dispatcher_co_busy;
> >>  
> >> -/*
> >> - * Protects mon_list, monitor_qapi_event_state, coroutine_mon,
> >> - * monitor_destroyed.
> >> - */
> >> +/* Protects mon_list, monitor_qapi_event_state, monitor_destroyed. */
> >>  QemuMutex monitor_lock;
> >>  static GHashTable *monitor_qapi_event_state;
> >> -static GHashTable *coroutine_mon; /* Maps Coroutine* to Monitor* */
> >>  
> >>  MonitorList mon_list;
> >>  int mon_refcount;
> >>  static bool monitor_destroyed;
> >>  
> >> +static Monitor **monitor_curp(Coroutine *co)
> >> +{
> >> +    static __thread Monitor *global_cur_mon;
> >> +
> >> +    if (co == qmp_dispatcher_co) {
> >> +        return qemu_coroutine_local_storage(co);
> >> +    }
> >> +    return &global_cur_mon;
> >> +}
> >
> > Like the other patch, this needs to be extended for HMP. global_cur_mon
> > is never meant to be set.
> 
> It is, for OOB commands.

Right, I missed this.

> > The solution fails as soon as we have more than a single monitor
> > coroutine running at the same time because it relies on
> > qmp_dispatcher_co.
> 
> Yes, but pretty much everything below handle_qmp_command() falls apart
> then.  Remembering to update monitor_curp() would be the least of my
> worries :)

Fair enough.

> >                    In this respect, it makes the same assumptions as the
> > simple hack.
> >
> > Only knowing that qmp_dispatcher_co is always created with storage
> > containing a Monitor** makes this safe.
> 
> Correct.
> 
> >>  Monitor *monitor_cur(void)
> >>  {
> >> -    Monitor *mon;
> >> -
> >> -    qemu_mutex_lock(&monitor_lock);
> >> -    mon = g_hash_table_lookup(coroutine_mon, qemu_coroutine_self());
> >> -    qemu_mutex_unlock(&monitor_lock);
> >> -
> >> -    return mon;
> >> +    return *monitor_curp(qemu_coroutine_self());
> >>  }
> >>  
> >>  void monitor_set_cur(Coroutine *co, Monitor *mon)
> >>  {
> >> -    qemu_mutex_lock(&monitor_lock);
> >> -    if (mon) {
> >> -        g_hash_table_replace(coroutine_mon, co, mon);
> >> -    } else {
> >> -        g_hash_table_remove(coroutine_mon, co);
> >> -    }
> >> -    qemu_mutex_unlock(&monitor_lock);
> >> +    *monitor_curp(co) = mon;
> >>  }
> >>  
> >>  /**
> >> @@ -666,14 +660,14 @@ void monitor_init_globals_core(void)
> >>  {
> >>      monitor_qapi_event_init();
> >>      qemu_mutex_init(&monitor_lock);
> >> -    coroutine_mon = g_hash_table_new(NULL, NULL);
> >>  
> >>      /*
> >>       * The dispatcher BH must run in the main loop thread, since we
> >>       * have commands assuming that context.  It would be nice to get
> >>       * rid of those assumptions.
> >>       */
> >> -    qmp_dispatcher_co = qemu_coroutine_create(monitor_qmp_dispatcher_co, NULL);
> >> +    qmp_dispatcher_co = qemu_coroutine_create_with_storage(
> >> +        monitor_qmp_dispatcher_co, NULL, sizeof(Monitor **));
> >>      atomic_mb_set(&qmp_dispatcher_co_busy, true);
> >>      aio_co_schedule(iohandler_get_aio_context(), qmp_dispatcher_co);
> >>  }
> >> diff --git a/util/qemu-coroutine.c b/util/qemu-coroutine.c
> >> index c3caa6c770..87bf7f0fc0 100644
> >> --- a/util/qemu-coroutine.c
> >> +++ b/util/qemu-coroutine.c
> >> @@ -81,8 +81,28 @@ Coroutine *qemu_coroutine_create(CoroutineEntry *entry, void *opaque)
> >>      return co;
> >>  }
> >>  
> >> +Coroutine *qemu_coroutine_create_with_storage(CoroutineEntry *entry,
> >> +                                              void *opaque, size_t storage)
> >> +{
> >> +    Coroutine *co = qemu_coroutine_create(entry, opaque);
> >> +
> >> +    if (!co) {
> >> +        return NULL;
> >> +    }
> >> +
> >> +    co->coroutine_local_storage = g_malloc0(storage);
> >> +    return co;
> >> +}
> >
> > As the code above shows, this interface is only useful if you can
> > identify the coroutine. It cannot be used in code that didn't create the
> > current coroutine because then it can't know whether or not the
> > coroutine has coroutine local storage, and if it has, what its structure
> > is.
> >
> > For a supposedly generic solution, I think this is a bit weak.
> 
> Yes, that's fair.
> 
> The solution Daniel proposed makes the weakness more explicit:
> instead of relying on "coroutine was created with this coroutine-local
> storage", we'd rely on "coroutine_getspecific(key) does not fail".  It
> can fail only if coroutine_setspecific(key, ...) was not called.  Not
> much better in practice.

It would be a little more generic, I guess. So for a solution that wants
to look generic, it might be better.

But as long as we don't have a second user (not even in our
imagination), I'm not sure how important it is to have something that
looks generic. With the simple and stupid patch, it would be more
obvious that it doesn't hurt cases that are unrelated to the monitor.

> > Effectively, this might be a one-off solution in disguise because
> > it's a big restriction on the possible use cases.
> 
> Daniel's solution is basically pthread_getspecific() for coroutines,
> with the keys dumbed down.
> 
> If pthread_getspecific() was good enough for pthreads...
> 
> Well, it wasn't, or rather it was only because something better could
> not be had with just a library, without toolchain support.  And that's
> where we are with coroutines.
> 
> >> +void *qemu_coroutine_local_storage(Coroutine *co)
> >> +{
> >> +    return co->coroutine_local_storage;
> >> +}
> >> +
> >>  static void coroutine_delete(Coroutine *co)
> >>  {
> >> +    g_free(co->coroutine_local_storage);
> >> +    co->coroutine_local_storage = NULL;
> >>      co->caller = NULL;
> >>  
> >>      if (CONFIG_COROUTINE_POOL) {
> >
> > Your list of pros/cons didn't mention coroutine creation/deletion as a
> > hot path at all (which it is, we have one coroutine per request).
> 
> I did not expect coroutine creation / deletion to be a hot path.
> 
> It is not a hot path for QMP, because QMP is not a hot path.
> 
> I'm ready to accept the proposition that it's a hot path elsewhere.

It is a hot path for block device requests. To be more specific, the
part that is executed when taking a coroutine from the pool or putting
it back to the pool is. The code to create a coroutine from scratch or
to actually free it may be less relevant.
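
A rough sketch of what that pooled path could look like with the
explicit check suggested earlier in the thread. SketchCo, sketch_pool
and both functions are hypothetical and much simpler than the real pool
code; whether the branch actually beats an unconditional free(NULL) is
an assumption that would need measuring.

```c
#include <stdlib.h>

/* Hypothetical, stripped-down coroutine object. */
typedef struct SketchCo {
    void *local_storage;        /* NULL unless the creator asked for it */
    struct SketchCo *pool_next; /* link used while sitting in the pool */
} SketchCo;

static SketchCo *sketch_pool;

/* Put a terminated coroutine back into the pool. */
static void sketch_co_release(SketchCo *co)
{
    /* Common case: no storage, so only a single test is paid. */
    if (co->local_storage) {
        free(co->local_storage);
        co->local_storage = NULL;
    }
    co->pool_next = sketch_pool;
    sketch_pool = co;
}

/* Take a coroutine from the pool, or allocate a fresh one. */
static SketchCo *sketch_co_acquire(void)
{
    SketchCo *co = sketch_pool;

    if (co) {
        sketch_pool = co->pool_next;
        co->pool_next = NULL;
        return co;
    }
    return calloc(1, sizeof(*co));
}
```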

Kevin

> > You leave qemu_coroutine_create() untouched (except indirectly by a
> > larger g_malloc0() in the non-pooled case, which is negligible) and I
> > assume that g_free(NULL) is cheap, so at least this is probably as good
> > as it gets for something integrated in the coroutine core. Maybe an
> > explicit if (co->coroutine_local_storage) would improve it slightly.
> >
> > Kevin

Re: [PATCH v6 06/12] monitor: Make current monitor a per-coroutine property

Posted by Daniel P. Berrangé 5 years, 6 months ago

On Tue, Aug 04, 2020 at 03:50:54PM +0200, Markus Armbruster wrote:
> Kevin Wolf <kwolf@redhat.com> writes:
> 
> > This way, a monitor command handler will still be able to access the
> > current monitor, but when it yields, all other code will correctly
> > get NULL from monitor_cur().
> >
> > Outside of coroutine context, qemu_coroutine_self() returns the leader
> > coroutine of the current thread.
> 
> Unsaid: you use it as a hash table key to map from coroutine to monitor,
> and for that you need it to return a value unique to the coroutine in
> coroutine context, and a value unique to the thread outside coroutine
> context.  Which qemu_coroutine_self() does.  Correct?
> 
> The hash table works, but I hate it just as much as I hate
> pthread_getspecific() / pthread_setspecific().
> 
> What we have here is a need for coroutine-local data.  Feels like a
> perfectly natural concept to me.
> 
> Are we going to create another hash table whenever we need another piece
> of coroutine-local data?  Or shall we reuse the hash table, suitably
> renamed and moved to another file?
> 
> Why not simply associate an opaque pointer with each coroutine?  All it
> takes is one more member of struct Coroutine.  Whatever creates the
> coroutine decides what to use it for.  The monitor coroutine would use
> it to point to the monitor.

Possible benefit of having the coroutine-local data stored in the
coroutine stack is that we can probably make it lock-less. Using
the hash table in monitor.c results in serialization across
all coroutines & threads.

Also, by providing a GDestroyNotify against the coroutine-local
data we can easily guarantee cleanup when the coroutine is freed.

Since we'll have a limited number of data items, we could make do
with a simple array in the coroutine struct, instead of a hashtable.
eg

  enum CoroutineLocalKeys {
     CO_LOCAL_CUR_MONITOR = 0,

     CO_LOCAL_LAST,
  };

  struct Coroutine {
    ...
    gpointer localData[CO_LOCAL_LAST];
    GDestroyNotify localDataFree[CO_LOCAL_LAST];
  };


Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|