Changeset
src/qemu/THREADS.txt   |  62 ++++++++++++++++--
src/qemu/qemu_domain.c | 169 +++++++++++++++++++++++++++++++++++++++----------
src/qemu/qemu_domain.h |  29 +++++++++
src/qemu/qemu_driver.c |  91 ++++++++++++++++----------
4 files changed, 281 insertions(+), 70 deletions(-)
Git apply log
Switched to a new branch 'cover.1528465164.git.mprivozn@redhat.com'
Applying: qemu: Introduce qemuDomainAgentJob
Applying: qemu: Introduce APIs for manipulating qemuDomainAgentJob
Applying: qemuDomainAgentJob: Introduce query and modify jobs
Applying: qemu: Switch code to use new agent job APIs
To https://github.com/patchew-project/libvirt
 * [new tag]         patchew/cover.1528465164.git.mprivozn@redhat.com -> patchew/cover.1528465164.git.mprivozn@redhat.com
Test passed: syntax-check


[libvirt] [PATCH 0/4] qemu: Allow concurrent access to monitor and guest agent
Posted by Michal Privoznik, 2 weeks ago
It is very easy to see these patches in action. Just apply the following
diff:

diff --git i/src/qemu/qemu_agent.c w/src/qemu/qemu_agent.c
index 10c6ef09fa..dd48caecb7 100644
--- i/src/qemu/qemu_agent.c
+++ w/src/qemu/qemu_agent.c
@@ -1702,6 +1702,8 @@ qemuAgentGetTime(qemuAgentPtr mon,
                          VIR_DOMAIN_QEMU_AGENT_COMMAND_BLOCK) < 0)
         goto cleanup;
 
+    sleep(120);
+
     if (virJSONValueObjectGetNumberUlong(reply, "return", &json_time) < 0) {
         virReportError(VIR_ERR_INTERNAL_ERROR, "%s",
                        _("malformed return value"));


This is to simulate a slow guest agent. Then all you need to do is run:
'virsh domtime $dom' and then 'virsh domstats $dom' from another
terminal.
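
For example (assuming a running domain $dom with the guest agent
configured):

  # terminal 1: blocks for ~120 seconds inside the agent call
  $ virsh domtime $dom

  # terminal 2: with these patches this returns immediately instead
  # of waiting for the agent job above to finish
  $ virsh domstats $dom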

Enjoy.

Michal Privoznik (4):
  qemu: Introduce qemuDomainAgentJob
  qemu: Introduce APIs for manipulating qemuDomainAgentJob
  qemuDomainAgentJob: Introduce query and modify jobs
  qemu: Switch code to use new agent job APIs

 src/qemu/THREADS.txt   |  62 ++++++++++++++++--
 src/qemu/qemu_domain.c | 169 +++++++++++++++++++++++++++++++++++++++----------
 src/qemu/qemu_domain.h |  29 +++++++++
 src/qemu/qemu_driver.c |  91 ++++++++++++++++----------
 4 files changed, 281 insertions(+), 70 deletions(-)

-- 
2.16.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 1/4] qemu: Introduce qemuDomainAgentJob
Posted by Michal Privoznik, 2 weeks ago
This enum will list all possible jobs for guest agent. The idea
is when a thread needs to talk to guest agent only it will take
the QEMU_AGENT_JOB instead of QEMU_JOB helping better
concurrency.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/qemu/qemu_domain.c |  4 ++++
 src/qemu/qemu_domain.h | 15 +++++++++++++++
 2 files changed, 19 insertions(+)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 11c261db1a..b8e34c1c2c 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -93,6 +93,10 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
               "async nested",
 );
 
+VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
+              "none"
+);
+
 VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
               "none",
               "migration out",
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index f17157b951..709b42e6fd 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -82,6 +82,13 @@ typedef enum {
 } qemuDomainJob;
 VIR_ENUM_DECL(qemuDomainJob)
 
+typedef enum {
+    QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
+
+    QEMU_AGENT_JOB_LAST
+} qemuDomainAgentJob;
+VIR_ENUM_DECL(qemuDomainAgentJob)
+
 /* Async job consists of a series of jobs that may change state. Independent
  * jobs that do not change state (and possibly others if explicitly allowed by
  * current async job) are allowed to be run even if async job is active.
@@ -158,11 +165,19 @@ typedef struct _qemuDomainJobObj qemuDomainJobObj;
 typedef qemuDomainJobObj *qemuDomainJobObjPtr;
 struct _qemuDomainJobObj {
     virCond cond;                       /* Use to coordinate jobs */
+
+    /* The following members are for QEMU_JOB_* */
     qemuDomainJob active;               /* Currently running job */
     unsigned long long owner;           /* Thread id which set current job */
     const char *ownerAPI;               /* The API which owns the job */
     unsigned long long started;         /* When the current job started */
 
+    /* The following members are for QEMU_AGENT_JOB_* */
+    qemuDomainAgentJob agentActive;     /* Currently running agent job */
+    unsigned long long agentOwner;      /* Thread id which set current agent job */
+    const char *agentOwnerAPI;          /* The API which owns the agent job */
+    unsigned long long agentStarted;    /* When the current agent job started */
+
     virCond asyncCond;                  /* Use to coordinate with async jobs */
     qemuDomainAsyncJob asyncJob;        /* Currently active async job */
     unsigned long long asyncOwner;      /* Thread which set current async job */
-- 
2.16.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/4] qemu: Introduce qemuDomainAgentJob
Posted by John Ferlan, 1 week ago

On 06/08/2018 09:45 AM, Michal Privoznik wrote:
> This enum will list all possible jobs for guest agent. The idea
> is when a thread needs to talk to guest agent only it will take
> the QEMU_AGENT_JOB instead of QEMU_JOB helping better
> concurrency.

Consider:

Introduce guest agent specific job categories to allow threads to run
agent monitor specific jobs while normal monitor jobs can also be running.

Alter _qemuDomainJobObj in order to duplicate certain fields that will
be used for guest agent specific tasks to increase concurrency and
throughput and reduce serialization.

> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/qemu/qemu_domain.c |  4 ++++
>  src/qemu/qemu_domain.h | 15 +++++++++++++++
>  2 files changed, 19 insertions(+)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 11c261db1a..b8e34c1c2c 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -93,6 +93,10 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
>                "async nested",
>  );
>  
> +VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
> +              "none"
> +);
> +

I think merging patch3 in here is perfectly reasonable, unless of course
you're looking to get your patch counts up to stay ahead of Peter ;-)

>  VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
>                "none",
>                "migration out",
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index f17157b951..709b42e6fd 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -82,6 +82,13 @@ typedef enum {
>  } qemuDomainJob;
>  VIR_ENUM_DECL(qemuDomainJob)
>  
> +typedef enum {
> +    QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
> +
> +    QEMU_AGENT_JOB_LAST
> +} qemuDomainAgentJob;
> +VIR_ENUM_DECL(qemuDomainAgentJob)
> +
>  /* Async job consists of a series of jobs that may change state. Independent
>   * jobs that do not change state (and possibly others if explicitly allowed by
>   * current async job) are allowed to be run even if async job is active.
> @@ -158,11 +165,19 @@ typedef struct _qemuDomainJobObj qemuDomainJobObj;
>  typedef qemuDomainJobObj *qemuDomainJobObjPtr;
>  struct _qemuDomainJobObj {
>      virCond cond;                       /* Use to coordinate jobs */
> +
> +    /* The following members are for QEMU_JOB_* */
>      qemuDomainJob active;               /* Currently running job */
>      unsigned long long owner;           /* Thread id which set current job */
>      const char *ownerAPI;               /* The API which owns the job */
>      unsigned long long started;         /* When the current job started */
>  
> +    /* The following members are for QEMU_AGENT_JOB_* */
> +    qemuDomainAgentJob agentActive;     /* Currently running agent job */

Hmm... I seem to recall a recent patch that spoke against using an enum
type for struct fields:

https://www.redhat.com/archives/libvir-list/2018-June/msg00486.html

While I know you're just copying the existing qemuDomainJob - this is
the kind of stuff that just gets copied around and used elsewhere.

> +    unsigned long long agentOwner;      /* Thread id which set current agent job */
> +    const char *agentOwnerAPI;          /* The API which owns the agent job */
> +    unsigned long long agentStarted;    /* When the current agent job started */
> +

FWIW: From the description in the next patch, I would think there would
need to be an agentCond variable too.  But reading patch 4 I know that's
not quite the case. I have concerns over the combined @cond usage, but
those are only towards some "future change" that misses some subtlety.

And the following few fields are for async jobs - may as well add a
comment here too...

John

>      virCond asyncCond;                  /* Use to coordinate with async jobs */
>      qemuDomainAsyncJob asyncJob;        /* Currently active async job */
>      unsigned long long asyncOwner;      /* Thread which set current async job */
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/4] qemu: Introduce qemuDomainAgentJob
Posted by Michal Privoznik, 1 week ago
On 06/14/2018 12:04 AM, John Ferlan wrote:
> 
> 
> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>> This enum will list all possible jobs for guest agent. The idea
>> is when a thread needs to talk to guest agent only it will take
>> the QEMU_AGENT_JOB instead of QEMU_JOB helping better
>> concurrency.
> 
> Consider:
> 
> Introduce guest agent specific job categories to allow threads to run
> agent monitor specific jobs while normal monitor jobs can also be running.
> 
> Alter _qemuDomainJobObj in order to duplicate certain fields that will
> be used for guest agent specific tasks to increase concurrency and
> throughput and reduce serialization.

Okay, that sounds better than my commit message.

> 
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>  src/qemu/qemu_domain.c |  4 ++++
>>  src/qemu/qemu_domain.h | 15 +++++++++++++++
>>  2 files changed, 19 insertions(+)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 11c261db1a..b8e34c1c2c 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -93,6 +93,10 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
>>                "async nested",
>>  );
>>  
>> +VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
>> +              "none"
>> +);
>> +
> 
> I think merging patch3 in here is perfectly reasonable, unless of course
> you're looking to get your patch counts up to stay ahead of Peter ;-)

Yeah, I was considering that. But for some reason did not do that. I
don't care, so I'll merge it.

> 
>>  VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
>>                "none",
>>                "migration out",
>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>> index f17157b951..709b42e6fd 100644
>> --- a/src/qemu/qemu_domain.h
>> +++ b/src/qemu/qemu_domain.h
>> @@ -82,6 +82,13 @@ typedef enum {
>>  } qemuDomainJob;
>>  VIR_ENUM_DECL(qemuDomainJob)
>>  
>> +typedef enum {
>> +    QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
>> +
>> +    QEMU_AGENT_JOB_LAST
>> +} qemuDomainAgentJob;
>> +VIR_ENUM_DECL(qemuDomainAgentJob)
>> +
>>  /* Async job consists of a series of jobs that may change state. Independent
>>   * jobs that do not change state (and possibly others if explicitly allowed by
>>   * current async job) are allowed to be run even if async job is active.
>> @@ -158,11 +165,19 @@ typedef struct _qemuDomainJobObj qemuDomainJobObj;
>>  typedef qemuDomainJobObj *qemuDomainJobObjPtr;
>>  struct _qemuDomainJobObj {
>>      virCond cond;                       /* Use to coordinate jobs */
>> +
>> +    /* The following members are for QEMU_JOB_* */
>>      qemuDomainJob active;               /* Currently running job */
>>      unsigned long long owner;           /* Thread id which set current job */
>>      const char *ownerAPI;               /* The API which owns the job */
>>      unsigned long long started;         /* When the current job started */
>>  
>> +    /* The following members are for QEMU_AGENT_JOB_* */
>> +    qemuDomainAgentJob agentActive;     /* Currently running agent job */
> 
> Hmm... I seem to recall a recent patch that spoke against using an enum
> type for struct fields:
> 
> https://www.redhat.com/archives/libvir-list/2018-June/msg00486.html
> 
> While I know you're just copying the existing qemuDomainJob - this is
> the kind of stuff that just gets copied around and used elsewhere.

No, this is something different. The problem with using enums in
virDomainDef (or any other struct that is an immediate part of XML parsing)
is that some of us use the following construct:

  typedef enum {
    VIR_SOME_ENUM_ITEM1,
    VIR_SOME_ENUM_ITEM2,
    ...
    VIR_SOME_ENUM_ITEM_LAST
  } virSomeEnum;

  virSomeEnum val = virSomeEnumTypeFromString(str);

The problem lies in the last line because TypeFromString() can return -1
(if @str does not fall into virSomeEnum). Therefore the last line can be
rewritten as:

  virSomeEnum val = -1;

I guess you can see the problem now. There are basically two ways to get
rid of this problem. The first one is to use an intermediary integer:

  virSomeEnum val;
  int tmp = virSomeEnumTypeFromString(str);

  if (tmp < 0)
    error;
  else
    val = tmp;

The other way is to make the struct member an integer and put a
reference to the enum into the comment (this is what Dan is doing in
the patch you're referring to).

However, my scenario is different. Whenever setting the struct member
it's only using corresponding enum values.
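
A minimal sketch of the difference, with the assignment sites
simplified from patch 2:

  /* Never assigned from a *TypeFromString() result, so -1 cannot leak in: */
  priv->job.agentActive = QEMU_AGENT_JOB_NONE;   /* reset path */
  priv->job.agentActive = agentJob;              /* begin path; callers only
                                                    pass enum constants */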

> 
>> +    unsigned long long agentOwner;      /* Thread id which set current agent job */
>> +    const char *agentOwnerAPI;          /* The API which owns the agent job */
>> +    unsigned long long agentStarted;    /* When the current agent job started */
>> +
> 
> FWIW: From the description in the next patch, I would think there would
> need to be an agentCond variable too.  But reading patch 4 I know that's
> not quite the case. I have concerns over the combined @cond usage, but
> those are only towards some "future change" that misses some subtlety.
> 

No. Using one condition is the point of these patches. IMO having two
conditions would only complicate things needlessly. The code is guarded
against "spurious" wakeups from condition.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 1/4] qemu: Introduce qemuDomainAgentJob
Posted by John Ferlan, 1 week ago
[...]

>>>  
>>> +    /* The following members are for QEMU_AGENT_JOB_* */
>>> +    qemuDomainAgentJob agentActive;     /* Currently running agent job */
>>
>> Hmm... I seem to recall a recent patch that spoke against using an enum
>> type for struct fields:
>>
>> https://www.redhat.com/archives/libvir-list/2018-June/msg00486.html
>>
>> While I know you're just copying the existing qemuDomainJob - this is
>> the kind of stuff that just gets copied around and used elsewhere.
> 
> No, this is something different. The problem with using enums in
> virDomainDef (or any other struct that is an immediate part of XML parsing)
> is that some of us use the following construct:
> 
>   typedef enum {
>     VIR_SOME_ENUM_ITEM1,
>     VIR_SOME_ENUM_ITEM2,
>     ...
>     VIR_SOME_ENUM_ITEM_LAST
>   } virSomeEnum;
> 
>   virSomeEnum val = virSomeEnumTypeFromString(str);
> 
> The problem lies in the last line because TypeFromString() can return -1
> (if @str does not fall into virSomeEnum). Therefore the last line can be
> rewritten as:
> 
>   virSomeEnum val = -1;
> 
> I guess you can see the problem now. There are basically two ways to get
> rid of this problem. The first one is to use an intermediary integer:
> 
>   virSomeEnum val;
>   int tmp = virSomeEnumTypeFromString(str);
> 
>   if (tmp < 0)
>     error;
>   else
>     val = tmp;
> 
> The other way is to make the struct member an integer and put a
> reference to the enum into the comment (this is what Dan is doing in
> the patch you're referring to).
> 
> However, my scenario is different. Whenever setting the struct member
> it's only using corresponding enum values.
> 

I understand the difference, but was merely pointing out the dichotomy
in using an enum in a struct if one takes the comment of the linked
patch literally as much as they take the review comment literally.

I think usage of an enum in a struct is not a blanket don't do that, but
rather a well we can do it for certain situations that don't involve use
of switches and/or *TypeFrom* processing.

>>
>>> +    unsigned long long agentOwner;      /* Thread id which set current agent job */
>>> +    const char *agentOwnerAPI;          /* The API which owns the agent job */
>>> +    unsigned long long agentStarted;    /* When the current agent job started */
>>> +
>>
>> FWIW: From the description in the next patch, I would think there would
>> need to be an agentCond variable too.  But reading patch 4 I know that's
>> not quite the case. I have concerns over the combined @cond usage, but
>> those are only towards some "future change" that misses some subtlety.
>>
> 
> No. Using one condition is the point of these patches. IMO having two
> conditions would only complicate things needlessly. The code is guarded
> against "spurious" wakeups from condition.
> 

Yes, I came to understand that, but it's the THREADS.txt description
that probably needs adjustment.  Remember you're working backwards and
working forwards - e.g. you already know and I'm learning from review.
If you take that viewpoint when reading THREADS.txt what do you get
(e.g. the more literal viewpoint). The whole relationship with async
jobs (and your other series) just adds a level of complexity.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 2/4] qemu: Introduce APIs for manipulating qemuDomainAgentJob
Posted by Michal Privoznik, 2 weeks ago
The point is to break QEMU_JOB_* into smaller pieces, which
enables us to achieve higher throughput. For instance, if there
are two threads, one trying to query something on the qemu
monitor while the other is querying something on the agent
monitor, these two threads would serialize. There is not much
reason for that.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/qemu/THREADS.txt   |  62 +++++++++++++++++--
 src/qemu/qemu_domain.c | 163 +++++++++++++++++++++++++++++++++++++++----------
 src/qemu/qemu_domain.h |  12 ++++
 3 files changed, 200 insertions(+), 37 deletions(-)

diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
index 7243161fe3..6219f46a81 100644
--- a/src/qemu/THREADS.txt
+++ b/src/qemu/THREADS.txt
@@ -51,8 +51,8 @@ There are a number of locks on various objects
     Since virDomainObjPtr lock must not be held during sleeps, the job
     conditions provide additional protection for code making updates.
 
-    QEMU driver uses two kinds of job conditions: asynchronous and
-    normal.
+    QEMU driver uses three kinds of job conditions: asynchronous, agent
+    and normal.
 
     Asynchronous job condition is used for long running jobs (such as
     migration) that consist of several monitor commands and it is
@@ -69,9 +69,15 @@ There are a number of locks on various objects
     it needs to wait until the asynchronous job ends and try to acquire
     the job again.
 
+    Agent job condition is then used when thread wishes to talk to qemu
+    agent monitor. It is possible to acquire just agent job
+    (qemuDomainObjBeginAgentJob), or only normal job
+    (qemuDomainObjBeginJob) or both at the same time
+    (qemuDomainObjBeginJobWithAgent).
+
     Immediately after acquiring the virDomainObjPtr lock, any method
-    which intends to update state must acquire either asynchronous or
-    normal job condition.  The virDomainObjPtr lock is released while
+    which intends to update state must acquire asynchronous, normal or
+    agent job condition. The virDomainObjPtr lock is released while
     blocking on these condition variables.  Once the job condition is
     acquired, a method can safely release the virDomainObjPtr lock
     whenever it hits a piece of code which may sleep/wait, and
@@ -132,6 +138,30 @@ To acquire the normal job condition
 
 
 
+To acquire the agent job condition
+
+  qemuDomainObjBeginAgentJob()
+    - Waits until there is no other agent job set
+    - Sets job.agentActive to the job type
+
+  qemuDomainObjEndAgentJob()
+    - Sets job.agentActive to 0
+    - Signals on job.cond condition
+
+
+
+To acquire both normal and agent job condition
+
+  qemuDomainObjBeginJobWithAgent()
+    - Waits until there is no normal and no agent job set
+    - Sets both job.active and job.agentActive with required job types
+
+  qemuDomainObjEndJobWithAgent()
+    - Sets both job.active and job.agentActive to 0
+    - Signals on job.cond condition
+
+
+
 To acquire the asynchronous job condition
 
   qemuDomainObjBeginAsyncJob()
@@ -247,6 +277,30 @@ Design patterns
      virDomainObjEndAPI(&obj);
 
 
+ * Invoking an agent command on a virDomainObjPtr
+
+     virDomainObjPtr obj;
+     qemuAgentPtr agent;
+
+     obj = qemuDomObjFromDomain(dom);
+
+     qemuDomainObjBeginAgentJob(obj, QEMU_JOB_TYPE);
+
+     ...do prep work...
+
+     if (!qemuDomainAgentAvailable(obj, true))
+         goto cleanup;
+
+     agent = qemuDomainObjEnterAgent(obj);
+     qemuAgentXXXX(agent, ..);
+     qemuDomainObjExitAgent(obj, agent);
+
+     ...do final work...
+
+     qemuDomainObjEndAgentJob(obj);
+     virDomainObjEndAPI(&obj);
+
+
  * Running asynchronous job
 
      virDomainObjPtr obj;
diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index b8e34c1c2c..09404f6569 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -319,6 +319,17 @@ qemuDomainObjResetJob(qemuDomainObjPrivatePtr priv)
     job->started = 0;
 }
 
+static void
+qemuDomainObjResetAgentJob(qemuDomainObjPrivatePtr priv)
+{
+    qemuDomainJobObjPtr job = &priv->job;
+
+    job->agentActive = QEMU_AGENT_JOB_NONE;
+    job->agentOwner = 0;
+    job->agentOwnerAPI = NULL;
+    job->agentStarted = 0;
+}
+
 static void
 qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
 {
@@ -6361,6 +6372,15 @@ qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
     return !priv->job.active && qemuDomainNestedJobAllowed(priv, job);
 }
 
+static bool
+qemuDomainObjCanSetJob(qemuDomainObjPrivatePtr priv,
+                       qemuDomainJob job,
+                       qemuDomainAgentJob agentJob)
+{
+    return (job == QEMU_JOB_NONE || !priv->job.active) &&
+           (agentJob == QEMU_AGENT_JOB_NONE || !priv->job.agentActive);
+}
+
 /* Give up waiting for mutex after 30 seconds */
 #define QEMU_JOB_WAIT_TIME (1000ull * 30)
 
@@ -6371,6 +6391,7 @@ static int ATTRIBUTE_NONNULL(1)
 qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
                               virDomainObjPtr obj,
                               qemuDomainJob job,
+                              qemuDomainAgentJob agentJob,
                               qemuDomainAsyncJob asyncJob)
 {
     qemuDomainObjPrivatePtr priv = obj->privateData;
@@ -6383,16 +6404,15 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
     int ret = -1;
     unsigned long long duration = 0;
     unsigned long long asyncDuration = 0;
-    const char *jobStr;
 
-    if (async)
-        jobStr = qemuDomainAsyncJobTypeToString(asyncJob);
-    else
-        jobStr = qemuDomainJobTypeToString(job);
-
-    VIR_DEBUG("Starting %s: %s (vm=%p name=%s, current job=%s async=%s)",
-              async ? "async job" : "job", jobStr, obj, obj->def->name,
+    VIR_DEBUG("Starting job: job=%s agentJob=%s asyncJob=%s "
+              "(vm=%p name=%s, current job=%s agentJob=%s async=%s)",
+              qemuDomainJobTypeToString(job),
+              qemuDomainAgentJobTypeToString(agentJob),
+              qemuDomainAsyncJobTypeToString(asyncJob),
+              obj, obj->def->name,
               qemuDomainJobTypeToString(priv->job.active),
+              qemuDomainAgentJobTypeToString(priv->job.agentActive),
               qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
 
     if (virTimeMillisNow(&now) < 0) {
@@ -6416,7 +6436,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
             goto error;
     }
 
-    while (priv->job.active) {
+    while (!qemuDomainObjCanSetJob(priv, job, agentJob)) {
         VIR_DEBUG("Waiting for job (vm=%p name=%s)", obj, obj->def->name);
         if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
             goto error;
@@ -6427,32 +6447,48 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
     if (!nested && !qemuDomainNestedJobAllowed(priv, job))
         goto retry;
 
-    qemuDomainObjResetJob(priv);
-
     ignore_value(virTimeMillisNow(&now));
 
-    if (job != QEMU_JOB_ASYNC) {
-        VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
-                   qemuDomainJobTypeToString(job),
-                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
-                  obj, obj->def->name);
-        priv->job.active = job;
-        priv->job.owner = virThreadSelfID();
-        priv->job.ownerAPI = virThreadJobGet();
+    if (job) {
+        qemuDomainObjResetJob(priv);
+
+        if (job != QEMU_JOB_ASYNC) {
+            VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
+                      qemuDomainJobTypeToString(job),
+                      qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
+                      obj, obj->def->name);
+            priv->job.active = job;
+            priv->job.owner = virThreadSelfID();
+            priv->job.ownerAPI = virThreadJobGet();
+            priv->job.started = now;
+        } else {
+            VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
+                      qemuDomainAsyncJobTypeToString(asyncJob),
+                      obj, obj->def->name);
+            qemuDomainObjResetAsyncJob(priv);
+            if (VIR_ALLOC(priv->job.current) < 0)
+                goto cleanup;
+            priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
+            priv->job.asyncJob = asyncJob;
+            priv->job.asyncOwner = virThreadSelfID();
+            priv->job.asyncOwnerAPI = virThreadJobGet();
+            priv->job.asyncStarted = now;
+            priv->job.current->started = now;
+        }
+    }
+
+    if (agentJob) {
+        qemuDomainObjResetAgentJob(priv);
+
+        VIR_DEBUG("Started agent job: %s (vm=%p name=%s job=%s async=%s)",
+                  qemuDomainAgentJobTypeToString(agentJob),
+                  obj, obj->def->name,
+                  qemuDomainJobTypeToString(priv->job.active),
+                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
+        priv->job.agentActive = agentJob;
+        priv->job.agentOwner = virThreadSelfID();
+        priv->job.agentOwnerAPI = virThreadJobGet();
         priv->job.started = now;
-    } else {
-        VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
-                  qemuDomainAsyncJobTypeToString(asyncJob),
-                  obj, obj->def->name);
-        qemuDomainObjResetAsyncJob(priv);
-        if (VIR_ALLOC(priv->job.current) < 0)
-            goto cleanup;
-        priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
-        priv->job.asyncJob = asyncJob;
-        priv->job.asyncOwner = virThreadSelfID();
-        priv->job.asyncOwnerAPI = virThreadJobGet();
-        priv->job.asyncStarted = now;
-        priv->job.current->started = now;
     }
 
     if (qemuDomainTrackJob(job))
@@ -6534,12 +6570,31 @@ int qemuDomainObjBeginJob(virQEMUDriverPtr driver,
                           qemuDomainJob job)
 {
     if (qemuDomainObjBeginJobInternal(driver, obj, job,
+                                      QEMU_AGENT_JOB_NONE,
                                       QEMU_ASYNC_JOB_NONE) < 0)
         return -1;
     else
         return 0;
 }
 
+int qemuDomainObjBeginAgentJob(virQEMUDriverPtr driver,
+                               virDomainObjPtr obj,
+                               qemuDomainAgentJob agentJob)
+{
+    return qemuDomainObjBeginJobInternal(driver, obj, QEMU_JOB_NONE,
+                                         agentJob,
+                                         QEMU_ASYNC_JOB_NONE);
+}
+
+int qemuDomainObjBeginJobWithAgent(virQEMUDriverPtr driver,
+                                   virDomainObjPtr obj,
+                                   qemuDomainJob job,
+                                   qemuDomainAgentJob agentJob)
+{
+    return qemuDomainObjBeginJobInternal(driver, obj, job,
+                                        agentJob, QEMU_ASYNC_JOB_NONE);
+}
+
 int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
                                virDomainObjPtr obj,
                                qemuDomainAsyncJob asyncJob,
@@ -6549,6 +6604,7 @@ int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
     qemuDomainObjPrivatePtr priv;
 
     if (qemuDomainObjBeginJobInternal(driver, obj, QEMU_JOB_ASYNC,
+                                      QEMU_AGENT_JOB_NONE,
                                       asyncJob) < 0)
         return -1;
 
@@ -6578,6 +6634,7 @@ qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
 
     return qemuDomainObjBeginJobInternal(driver, obj,
                                          QEMU_JOB_ASYNC_NESTED,
+                                         QEMU_AGENT_JOB_NONE,
                                          QEMU_ASYNC_JOB_NONE);
 }
 
@@ -6604,7 +6661,47 @@ qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
     qemuDomainObjResetJob(priv);
     if (qemuDomainTrackJob(job))
         qemuDomainObjSaveJob(driver, obj);
-    virCondSignal(&priv->job.cond);
+    virCondBroadcast(&priv->job.cond);
+}
+
+void
+qemuDomainObjEndAgentJob(virDomainObjPtr obj)
+{
+    qemuDomainObjPrivatePtr priv = obj->privateData;
+    qemuDomainAgentJob agentJob = priv->job.agentActive;
+
+    priv->jobs_queued--;
+
+    VIR_DEBUG("Stopping agent job: %s (async=%s vm=%p name=%s)",
+              qemuDomainAgentJobTypeToString(agentJob),
+              qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
+              obj, obj->def->name);
+
+    qemuDomainObjResetAgentJob(priv);
+    virCondBroadcast(&priv->job.cond);
+}
+
+void
+qemuDomainObjEndJobWithAgent(virQEMUDriverPtr driver,
+                             virDomainObjPtr obj)
+{
+    qemuDomainObjPrivatePtr priv = obj->privateData;
+    qemuDomainJob job = priv->job.active;
+    qemuDomainAgentJob agentJob = priv->job.agentActive;
+
+    priv->jobs_queued--;
+
+    VIR_DEBUG("Stopping both jobs: %s %s (async=%s vm=%p name=%s)",
+              qemuDomainJobTypeToString(job),
+              qemuDomainAgentJobTypeToString(agentJob),
+              qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
+              obj, obj->def->name);
+
+    qemuDomainObjResetJob(priv);
+    qemuDomainObjResetAgentJob(priv);
+    if (qemuDomainTrackJob(job))
+        qemuDomainObjSaveJob(driver, obj);
+    virCondBroadcast(&priv->job.cond);
 }
 
 void
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 709b42e6fd..6ada26ca99 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -517,6 +517,15 @@ int qemuDomainObjBeginJob(virQEMUDriverPtr driver,
                           virDomainObjPtr obj,
                           qemuDomainJob job)
     ATTRIBUTE_RETURN_CHECK;
+int qemuDomainObjBeginAgentJob(virQEMUDriverPtr driver,
+                               virDomainObjPtr obj,
+                               qemuDomainAgentJob agentJob)
+    ATTRIBUTE_RETURN_CHECK;
+int qemuDomainObjBeginJobWithAgent(virQEMUDriverPtr driver,
+                                   virDomainObjPtr obj,
+                                   qemuDomainJob job,
+                                   qemuDomainAgentJob agentJob)
+    ATTRIBUTE_RETURN_CHECK;
 int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
                                virDomainObjPtr obj,
                                qemuDomainAsyncJob asyncJob,
@@ -530,6 +539,9 @@ int qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
 
 void qemuDomainObjEndJob(virQEMUDriverPtr driver,
                          virDomainObjPtr obj);
+void qemuDomainObjEndAgentJob(virDomainObjPtr obj);
+void qemuDomainObjEndJobWithAgent(virQEMUDriverPtr driver,
+                                  virDomainObjPtr obj);
 void qemuDomainObjEndAsyncJob(virQEMUDriverPtr driver,
                               virDomainObjPtr obj);
 void qemuDomainObjAbortAsyncJob(virDomainObjPtr obj);
-- 
2.16.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/4] qemu: Introduce APIs for manipulating qemuDomainAgentJob
Posted by John Ferlan, 1 week ago

On 06/08/2018 09:45 AM, Michal Privoznik wrote:
> The point is to break QEMU_JOB_* into smaller pieces, which
> enables us to achieve higher throughput. For instance, if there
> are two threads, one trying to query something on the qemu
> monitor while the other is querying something on the agent
> monitor, these two threads would serialize. There is not much
> reason for that.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/qemu/THREADS.txt   |  62 +++++++++++++++++--
>  src/qemu/qemu_domain.c | 163 +++++++++++++++++++++++++++++++++++++++----------
>  src/qemu/qemu_domain.h |  12 ++++
>  3 files changed, 200 insertions(+), 37 deletions(-)
> 

Looked at THREADS.txt before reading any code in order to get a sense
for what's being done... I won't go back after reading the code so that
you get a sense of how someone not knowing what's going on views things.
So when reading comments here - just take them with that in mind. I have
a feeling my mind/thoughts will change later on ;-).

> diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
> index 7243161fe3..6219f46a81 100644
> --- a/src/qemu/THREADS.txt
> +++ b/src/qemu/THREADS.txt
> @@ -51,8 +51,8 @@ There are a number of locks on various objects
>      Since virDomainObjPtr lock must not be held during sleeps, the job
>      conditions provide additional protection for code making updates.
>  
> -    QEMU driver uses two kinds of job conditions: asynchronous and
> -    normal.
> +    QEMU driver uses three kinds of job conditions: asynchronous, agent
> +    and normal.
>  
>      Asynchronous job condition is used for long running jobs (such as
>      migration) that consist of several monitor commands and it is
> @@ -69,9 +69,15 @@ There are a number of locks on various objects
>      it needs to wait until the asynchronous job ends and try to acquire
>      the job again.
>  
> +    Agent job condition is then used when thread wishes to talk to qemu

s/then//
s/when thread/when the thread/
s/to qemu/to the qemu/

> +    agent monitor. It is possible to acquire just agent job
> +    (qemuDomainObjBeginAgentJob), or only normal job
> +    (qemuDomainObjBeginJob) or both at the same time
> +    (qemuDomainObjBeginJobWithAgent).
> +

Oh this seems like it's going to be tricky...  How does one choose which
to use. At least w/ async you know it's a long running job and normal is
everything else.  Now you have category 3 agent specific - at this point
I'm not sure you really want to mix normal and agent. If the purpose of
the series is to extricate agent job from normal job, then mixing when
it's convenient leads to speculation of what misunderstandings will
occur for future consumers.

Because of the text for "Normal job condition is used by all other
jobs", I wonder if the agent job description should go above Normal job;
otherwise, perhaps it'd be best to distinguish "qemu monitor" jobs in
the normal description.

>      Immediately after acquiring the virDomainObjPtr lock, any method
> -    which intends to update state must acquire either asynchronous or
> -    normal job condition.  The virDomainObjPtr lock is released while
> +    which intends to update state must acquire asynchronous, normal or
> +    agent job condition. The virDomainObjPtr lock is released while

There is no agent job condition, yet. And how on earth do you do both
agent and normal?

>      blocking on these condition variables.  Once the job condition is
>      acquired, a method can safely release the virDomainObjPtr lock
>      whenever it hits a piece of code which may sleep/wait, and
> @@ -132,6 +138,30 @@ To acquire the normal job condition
>  
>  
>  
> +To acquire the agent job condition
> +
> +  qemuDomainObjBeginAgentJob()
> +    - Waits until there is no other agent job set

So no agent job before starting, but...

> +    - Sets job.agentActive to the job type
> +
> +  qemuDomainObjEndAgentJob()
> +    - Sets job.agentActive to 0
> +    - Signals on job.cond condition

... we signal on job.cond even though we didn't necessarily obtain?

> +
> +
> +
> +To acquire both normal and agent job condition
> +
> +  qemuDomainObjBeginJobWithAgent()
> +    - Waits until there is no normal and no agent job set
> +    - Sets both job.active and job.agentActive with required job types
> +
> +  qemuDomainObjEndJobWithAgent()
> +    - Sets both job.active and job.agentActive to 0
> +    - Signals on job.cond condition
> +
> +
> +
>  To acquire the asynchronous job condition
>  
>    qemuDomainObjBeginAsyncJob()
> @@ -247,6 +277,30 @@ Design patterns
>       virDomainObjEndAPI(&obj);
>  
>  
> + * Invoking an agent command on a virDomainObjPtr
> +
> +     virDomainObjPtr obj;
> +     qemuAgentPtr agent;
> +
> +     obj = qemuDomObjFromDomain(dom);
> +
> +     qemuDomainObjBeginAgentJob(obj, QEMU_JOB_TYPE);
> +
> +     ...do prep work...
> +
> +     if (!qemuDomainAgentAvailable(obj, true))
> +         goto cleanup;
> +
> +     agent = qemuDomainObjEnterAgent(obj);
> +     qemuAgentXXXX(agent, ..);
> +     qemuDomainObjExitAgent(obj, agent);
> +
> +     ...do final work...
> +
> +     qemuDomainObjEndAgentJob(obj);
> +     virDomainObjEndAPI(&obj);
> +
> +

Can you add a using a JobWithAgent example?

>   * Running asynchronous job
>  
>       virDomainObjPtr obj;
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index b8e34c1c2c..09404f6569 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -319,6 +319,17 @@ qemuDomainObjResetJob(qemuDomainObjPrivatePtr priv)
>      job->started = 0;
>  }
>  

Two empty lines (multiple occurrences w/ new functions).

> +static void
> +qemuDomainObjResetAgentJob(qemuDomainObjPrivatePtr priv)
> +{
> +    qemuDomainJobObjPtr job = &priv->job;
> +
> +    job->agentActive = QEMU_AGENT_JOB_NONE;
> +    job->agentOwner = 0;
> +    job->agentOwnerAPI = NULL;
> +    job->agentStarted = 0;
> +}
> +
>  static void
>  qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
>  {
> @@ -6361,6 +6372,15 @@ qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
>      return !priv->job.active && qemuDomainNestedJobAllowed(priv, job);
>  }
>  
> +static bool
> +qemuDomainObjCanSetJob(qemuDomainObjPrivatePtr priv,
> +                       qemuDomainJob job,
> +                       qemuDomainAgentJob agentJob)
> +{
> +    return (job == QEMU_JOB_NONE || !priv->job.active) &&
> +           (agentJob == QEMU_AGENT_JOB_NONE || !priv->job.agentActive);
> +}
> +
>  /* Give up waiting for mutex after 30 seconds */
>  #define QEMU_JOB_WAIT_TIME (1000ull * 30)
>  
> @@ -6371,6 +6391,7 @@ static int ATTRIBUTE_NONNULL(1)

So you're competing with yourself with changes to this code from your
other series. Who merges first ;-)

>  qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>                                virDomainObjPtr obj,
>                                qemuDomainJob job,
> +                              qemuDomainAgentJob agentJob,
>                                qemuDomainAsyncJob asyncJob)
>  {
>      qemuDomainObjPrivatePtr priv = obj->privateData;
> @@ -6383,16 +6404,15 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>      int ret = -1;
>      unsigned long long duration = 0;
>      unsigned long long asyncDuration = 0;
> -    const char *jobStr;
>  
> -    if (async)
> -        jobStr = qemuDomainAsyncJobTypeToString(asyncJob);
> -    else
> -        jobStr = qemuDomainJobTypeToString(job);
> -
> -    VIR_DEBUG("Starting %s: %s (vm=%p name=%s, current job=%s async=%s)",
> -              async ? "async job" : "job", jobStr, obj, obj->def->name,
> +    VIR_DEBUG("Starting job: job=%s agentJob=%s asyncJob=%s "
> +              "(vm=%p name=%s, current job=%s agentJob=%s async=%s)",
> +              qemuDomainJobTypeToString(job),
> +              qemuDomainAgentJobTypeToString(agentJob),
> +              qemuDomainAsyncJobTypeToString(asyncJob),
> +              obj, obj->def->name,
>                qemuDomainJobTypeToString(priv->job.active),
> +              qemuDomainAgentJobTypeToString(priv->job.agentActive),
>                qemuDomainAsyncJobTypeToString(priv->job.asyncJob));

Some random grumbling about splitting patches ;-)

>  
>      if (virTimeMillisNow(&now) < 0) {
> @@ -6416,7 +6436,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>              goto error;
>      }
>  
> -    while (priv->job.active) {
> +    while (!qemuDomainObjCanSetJob(priv, job, agentJob)) {
>          VIR_DEBUG("Waiting for job (vm=%p name=%s)", obj, obj->def->name);
>          if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
>              goto error;

Part of me says this check is ripe for some future coding error, but
logically AFAICT it works.

> @@ -6427,32 +6447,48 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>      if (!nested && !qemuDomainNestedJobAllowed(priv, job))
>          goto retry;
>  
> -    qemuDomainObjResetJob(priv);
> -
>      ignore_value(virTimeMillisNow(&now));
>  
> -    if (job != QEMU_JOB_ASYNC) {
> -        VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
> -                   qemuDomainJobTypeToString(job),
> -                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
> -                  obj, obj->def->name);
> -        priv->job.active = job;
> -        priv->job.owner = virThreadSelfID();
> -        priv->job.ownerAPI = virThreadJobGet();
> +    if (job) {
> +        qemuDomainObjResetJob(priv);
> +
> +        if (job != QEMU_JOB_ASYNC) {
> +            VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
> +                      qemuDomainJobTypeToString(job),
> +                      qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
> +                      obj, obj->def->name);
> +            priv->job.active = job;
> +            priv->job.owner = virThreadSelfID();
> +            priv->job.ownerAPI = virThreadJobGet();
> +            priv->job.started = now;
> +        } else {
> +            VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
> +                      qemuDomainAsyncJobTypeToString(asyncJob),
> +                      obj, obj->def->name);
> +            qemuDomainObjResetAsyncJob(priv);
> +            if (VIR_ALLOC(priv->job.current) < 0)
> +                goto cleanup;
> +            priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
> +            priv->job.asyncJob = asyncJob;
> +            priv->job.asyncOwner = virThreadSelfID();
> +            priv->job.asyncOwnerAPI = virThreadJobGet();
> +            priv->job.asyncStarted = now;
> +            priv->job.current->started = now;
> +        }
> +    }
> +
> +    if (agentJob) {
> +        qemuDomainObjResetAgentJob(priv);
> +
> +        VIR_DEBUG("Started agent job: %s (vm=%p name=%s job=%s async=%s)",
> +                  qemuDomainAgentJobTypeToString(agentJob),
> +                  obj, obj->def->name,
> +                  qemuDomainJobTypeToString(priv->job.active),
> +                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
> +        priv->job.agentActive = agentJob;
> +        priv->job.agentOwner = virThreadSelfID();
> +        priv->job.agentOwnerAPI = virThreadJobGet();
>          priv->job.started = now;
> -    } else {
> -        VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
> -                  qemuDomainAsyncJobTypeToString(asyncJob),
> -                  obj, obj->def->name);
> -        qemuDomainObjResetAsyncJob(priv);
> -        if (VIR_ALLOC(priv->job.current) < 0)
> -            goto cleanup;
> -        priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
> -        priv->job.asyncJob = asyncJob;
> -        priv->job.asyncOwner = virThreadSelfID();
> -        priv->job.asyncOwnerAPI = virThreadJobGet();
> -        priv->job.asyncStarted = now;
> -        priv->job.current->started = now;

So after reading the code and thinking about the documentation, IMO,
having agentJob use job.cond is confusing.  Having JobWithAgent is even
more confusing. But it seems the code would work.

I think what concerns me is there are 3 paths (I read patch 4) that
would seemingly indicate that by starting an agentJob we block a normal
domain job. In order to start one, IIUC both agentJob and normal job
must be 0. That seems reasonable until one starts thinking about races
when there's a thread waiting for a JobWithAgent and a thread waiting
for one or the other. Combine that with the check "if (!nested &&
!qemuDomainNestedJobAllowed(priv, job))" which may now pass NONE for
@job if one of those threads was an agentJob.

>      }
>  
>      if (qemuDomainTrackJob(job))
> @@ -6534,12 +6570,31 @@ int qemuDomainObjBeginJob(virQEMUDriverPtr driver,
>                            qemuDomainJob job)
>  {
>      if (qemuDomainObjBeginJobInternal(driver, obj, job,
> +                                      QEMU_AGENT_JOB_NONE,
>                                        QEMU_ASYNC_JOB_NONE) < 0)
>          return -1;
>      else
>          return 0;
>  }
>  

This could use an intro comment regarding usage.

> +int qemuDomainObjBeginAgentJob(virQEMUDriverPtr driver,
> +                               virDomainObjPtr obj,
> +                               qemuDomainAgentJob agentJob)
> +{

Read patch 3 and the following will make sense...

    if (agentJob == QEMU_AGENT_JOB_MODIFY_BLOCK)
        return qemuDomainObjBeginJobInternal(driver, obj,
                                             QEMU_JOB_MODIFY,
                                             agentJob,
                                             QEMU_ASYNC_JOB_NONE);
    else
        return qemuDomainObjBeginJobInternal(driver, obj,
                                             QEMU_JOB_NONE,
                                             agentJob,
                                             QEMU_ASYNC_JOB_NONE);

> +    return qemuDomainObjBeginJobInternal(driver, obj, QEMU_JOB_NONE,
> +                                         agentJob,
> +                                         QEMU_ASYNC_JOB_NONE);
> +}
> +
> +int qemuDomainObjBeginJobWithAgent(virQEMUDriverPtr driver,
> +                                   virDomainObjPtr obj,
> +                                   qemuDomainJob job,
> +                                   qemuDomainAgentJob agentJob)
> +{
> +    return qemuDomainObjBeginJobInternal(driver, obj, job,
> +                                        agentJob, QEMU_ASYNC_JOB_NONE);
> +}
> +

if you got with the above, then this one's not necessary.

>  int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
>                                 virDomainObjPtr obj,
>                                 qemuDomainAsyncJob asyncJob,
> @@ -6549,6 +6604,7 @@ int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
>      qemuDomainObjPrivatePtr priv;
>  
>      if (qemuDomainObjBeginJobInternal(driver, obj, QEMU_JOB_ASYNC,
> +                                      QEMU_AGENT_JOB_NONE,
>                                        asyncJob) < 0)
>          return -1;
>  
> @@ -6578,6 +6634,7 @@ qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
>  
>      return qemuDomainObjBeginJobInternal(driver, obj,
>                                           QEMU_JOB_ASYNC_NESTED,
> +                                         QEMU_AGENT_JOB_NONE,
>                                           QEMU_ASYNC_JOB_NONE);
>  }
>  
> @@ -6604,7 +6661,47 @@ qemuDomainObjEndJob(virQEMUDriverPtr driver, virDomainObjPtr obj)
>      qemuDomainObjResetJob(priv);
>      if (qemuDomainTrackJob(job))
>          qemuDomainObjSaveJob(driver, obj);
> -    virCondSignal(&priv->job.cond);
> +    virCondBroadcast(&priv->job.cond);
> +}
> +
> +void
> +qemuDomainObjEndAgentJob(virDomainObjPtr obj)
> +{
> +    qemuDomainObjPrivatePtr priv = obj->privateData;
> +    qemuDomainAgentJob agentJob = priv->job.agentActive;
> +
> +    priv->jobs_queued--;
> +
> +    VIR_DEBUG("Stopping agent job: %s (async=%s vm=%p name=%s)",
> +              qemuDomainAgentJobTypeToString(agentJob),
> +              qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
> +              obj, obj->def->name);
> +

Read patch 3 and this will make sense:

    if (priv->job.agentActive == QEMU_AGENT_JOB_MODIFY_BLOCK) {
        qemuDomainObjResetJob(priv);
        if (qemuDomainTrackJob(job))
            qemuDomainObjSaveJob
    }

Although the VIR_DEBUG above gets complicated...

> +    qemuDomainObjResetAgentJob(priv);
> +    virCondBroadcast(&priv->job.cond);
> +}
> +
> +void
> +qemuDomainObjEndJobWithAgent(virQEMUDriverPtr driver,
> +                             virDomainObjPtr obj)
> +{
> +    qemuDomainObjPrivatePtr priv = obj->privateData;
> +    qemuDomainJob job = priv->job.active;
> +    qemuDomainAgentJob agentJob = priv->job.agentActive;
> +
> +    priv->jobs_queued--;
> +
> +    VIR_DEBUG("Stopping both jobs: %s %s (async=%s vm=%p name=%s)",
> +              qemuDomainJobTypeToString(job),
> +              qemuDomainAgentJobTypeToString(agentJob),
> +              qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
> +              obj, obj->def->name);
> +
> +    qemuDomainObjResetJob(priv);
> +    qemuDomainObjResetAgentJob(priv);
> +    if (qemuDomainTrackJob(job))
> +        qemuDomainObjSaveJob(driver, obj);
> +    virCondBroadcast(&priv->job.cond);

and this one goes away.

John

>  }
>  
>  void
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 709b42e6fd..6ada26ca99 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -517,6 +517,15 @@ int qemuDomainObjBeginJob(virQEMUDriverPtr driver,
>                            virDomainObjPtr obj,
>                            qemuDomainJob job)
>      ATTRIBUTE_RETURN_CHECK;
> +int qemuDomainObjBeginAgentJob(virQEMUDriverPtr driver,
> +                               virDomainObjPtr obj,
> +                               qemuDomainAgentJob agentJob)
> +    ATTRIBUTE_RETURN_CHECK;
> +int qemuDomainObjBeginJobWithAgent(virQEMUDriverPtr driver,
> +                                   virDomainObjPtr obj,
> +                                   qemuDomainJob job,
> +                                   qemuDomainAgentJob agentJob)
> +    ATTRIBUTE_RETURN_CHECK;
>  int qemuDomainObjBeginAsyncJob(virQEMUDriverPtr driver,
>                                 virDomainObjPtr obj,
>                                 qemuDomainAsyncJob asyncJob,
> @@ -530,6 +539,9 @@ int qemuDomainObjBeginNestedJob(virQEMUDriverPtr driver,
>  
>  void qemuDomainObjEndJob(virQEMUDriverPtr driver,
>                           virDomainObjPtr obj);
> +void qemuDomainObjEndAgentJob(virDomainObjPtr obj);
> +void qemuDomainObjEndJobWithAgent(virQEMUDriverPtr driver,
> +                                  virDomainObjPtr obj);
>  void qemuDomainObjEndAsyncJob(virQEMUDriverPtr driver,
>                                virDomainObjPtr obj);
>  void qemuDomainObjAbortAsyncJob(virDomainObjPtr obj);
> 

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/4] qemu: Introduce APIs for manipulating qemuDomainAgentJob
Posted by Michal Privoznik, 1 week ago
On 06/14/2018 12:10 AM, John Ferlan wrote:
> 
> 
> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>> The point is to break QEMU_JOB_* into smaller pieces, which
>> enables us to achieve higher throughput. For instance, if there
>> are two threads, one trying to query something on the qemu
>> monitor while the other is querying something on the agent
>> monitor, these two threads would serialize. There is not much
>> reason for that.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>  src/qemu/THREADS.txt   |  62 +++++++++++++++++--
>>  src/qemu/qemu_domain.c | 163 +++++++++++++++++++++++++++++++++++++++----------
>>  src/qemu/qemu_domain.h |  12 ++++
>>  3 files changed, 200 insertions(+), 37 deletions(-)
>>
> 
> Looked at THREADS.txt before reading any code in order to get a sense
> for what's being done... I won't go back after reading the code so that
> you get a sense of how someone not knowing what's going on views things.
> So when reading comments here - just take them with that in mind. I have
> a feeling my mind/thoughts will change later on ;-).
> 
>> diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
>> index 7243161fe3..6219f46a81 100644
>> --- a/src/qemu/THREADS.txt
>> +++ b/src/qemu/THREADS.txt
>> @@ -51,8 +51,8 @@ There are a number of locks on various objects
>>      Since virDomainObjPtr lock must not be held during sleeps, the job
>>      conditions provide additional protection for code making updates.
>>  
>> -    QEMU driver uses two kinds of job conditions: asynchronous and
>> -    normal.
>> +    QEMU driver uses three kinds of job conditions: asynchronous, agent
>> +    and normal.
>>  
>>      Asynchronous job condition is used for long running jobs (such as
>>      migration) that consist of several monitor commands and it is
>> @@ -69,9 +69,15 @@ There are a number of locks on various objects
>>      it needs to wait until the asynchronous job ends and try to acquire
>>      the job again.
>>  
>> +    Agent job condition is then used when thread wishes to talk to qemu
> 
> s/then//
> s/when thread/when the thread/
> s/to qemu/to the qemu/
> 
>> +    agent monitor. It is possible to acquire just agent job
>> +    (qemuDomainObjBeginAgentJob), or only normal job
>> +    (qemuDomainObjBeginJob) or both at the same time
>> +    (qemuDomainObjBeginJobWithAgent).
>> +
> 
> Oh this seems like it's going to be tricky...  How does one choose which
> to use. 

By reading further. There are examples which make it clear.

> At least w/ async you know it's a long running job and normal is
> everything else.  

Not really. How long must a job be to be considered long running? You
(as reader) don't know at this point, so you carry on reading and the
examples make it all clear.

But okay, I can try to expand the description.

> Now you have category 3 agent specific - at this point
> I'm not sure you really want to mix normal and agent. If the purpose of
> the series is to extricate agent job from normal job, then mixing when
> it's convenient leads to speculation of what misunderstandings will
> occur for future consumers.

Sometimes an API accesses both monitor AND agent (e.g. a shutdown
request that tries the agent first and falls back to the monitor). In
this case it needs to grab both types of jobs.

> 
> Because of the text for "Normal job condition is used by all other
> jobs", I wonder if the agent job description should go above Normal job;
> otherwise, perhaps it'd be best to distinguish "qemu monitor" jobs in
> the normal description.
> 
>>      Immediately after acquiring the virDomainObjPtr lock, any method
>> -    which intends to update state must acquire either asynchronous or
>> -    normal job condition.  The virDomainObjPtr lock is released while
>> +    which intends to update state must acquire asynchronous, normal or
>> +    agent job condition. The virDomainObjPtr lock is released while
> 
> There is no agent job condition, yet. And how on earth do you do both
> agent and normal?

There's a difference between "job condition" and pthread_cond_t. What
this paragraph understands as "job condition" is qemuDomainJob really.
And in the second hunk of my patch I'm enumerating all three ways to
acquire jobs.
I'll drop the "condition" word, which is apparently confusing.

> 
>>      blocking on these condition variables.  Once the job condition is
>>      acquired, a method can safely release the virDomainObjPtr lock
>>      whenever it hits a piece of code which may sleep/wait, and
>> @@ -132,6 +138,30 @@ To acquire the normal job condition
>>  
>>  
>>  
>> +To acquire the agent job condition
>> +
>> +  qemuDomainObjBeginAgentJob()
>> +    - Waits until there is no other agent job set
> 
> So no agent job before starting, but...
> 
>> +    - Sets job.agentActive to the job type
>> +
>> +  qemuDomainObjEndAgentJob()
>> +    - Sets job.agentActive to 0
>> +    - Signals on job.cond condition
> 
> ... we signal on job.cond even though we didn't necessarily obtain?

Yes. I don't see the problem here.

> 
>> +
>> +
>> +
>> +To acquire both normal and agent job condition
>> +
>> +  qemuDomainObjBeginJobWithAgent()
>> +    - Waits until there is no normal and no agent job set
>> +    - Sets both job.active and job.agentActive with required job types
>> +
>> +  qemuDomainObjEndJobWithAgent()
>> +    - Sets both job.active and job.agentActive to 0
>> +    - Signals on job.cond condition
>> +
>> +
>> +
>>  To acquire the asynchronous job condition
>>  
>>    qemuDomainObjBeginAsyncJob()
>> @@ -247,6 +277,30 @@ Design patterns
>>       virDomainObjEndAPI(&obj);
>>  
>>  
>> + * Invoking an agent command on a virDomainObjPtr
>> +
>> +     virDomainObjPtr obj;
>> +     qemuAgentPtr agent;
>> +
>> +     obj = qemuDomObjFromDomain(dom);
>> +
>> +     qemuDomainObjBeginAgentJob(obj, QEMU_JOB_TYPE);
>> +
>> +     ...do prep work...
>> +
>> +     if (!qemuDomainAgentAvailable(obj, true))
>> +         goto cleanup;
>> +
>> +     agent = qemuDomainObjEnterAgent(obj);
>> +     qemuAgentXXXX(agent, ..);
>> +     qemuDomainObjExitAgent(obj, agent);
>> +
>> +     ...do final work...
>> +
>> +     qemuDomainObjEndAgentJob(obj);
>> +     virDomainObjEndAPI(&obj);
>> +
>> +
> 
> Can you add a JobWithAgent usage example?

Okay.
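
Something along these lines, mirroring the agent example above (just a
sketch; the exact wording will go into THREADS.txt, error handling
elided):

 * Invoking both monitor and agent commands on a virDomainObjPtr

     virDomainObjPtr obj;
     qemuAgentPtr agent;

     obj = qemuDomObjFromDomain(dom);

     qemuDomainObjBeginJobWithAgent(obj, QEMU_JOB_TYPE, QEMU_AGENT_JOB_TYPE);

     ...do prep work...

     if (!qemuDomainAgentAvailable(obj, true))
         goto cleanup;

     agent = qemuDomainObjEnterAgent(obj);
     qemuAgentXXXX(agent, ..);
     qemuDomainObjExitAgent(obj, agent);

     ...

     qemuDomainObjEnterMonitor(obj);
     qemuMonitorXXXX(priv->mon);
     qemuDomainObjExitMonitor(obj);

     ...do final work...

     qemuDomainObjEndJobWithAgent(obj);
     virDomainObjEndAPI(&obj);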

> 
>>   * Running asynchronous job
>>  
>>       virDomainObjPtr obj;
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index b8e34c1c2c..09404f6569 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -319,6 +319,17 @@ qemuDomainObjResetJob(qemuDomainObjPrivatePtr priv)
>>      job->started = 0;
>>  }
>>  
> 
> Two empty lines (multiple occurrences w/ new functions).>
>> +static void
>> +qemuDomainObjResetAgentJob(qemuDomainObjPrivatePtr priv)
>> +{
>> +    qemuDomainJobObjPtr job = &priv->job;
>> +
>> +    job->agentActive = QEMU_AGENT_JOB_NONE;
>> +    job->agentOwner = 0;
>> +    job->agentOwnerAPI = NULL;
>> +    job->agentStarted = 0;
>> +}
>> +
>>  static void
>>  qemuDomainObjResetAsyncJob(qemuDomainObjPrivatePtr priv)
>>  {
>> @@ -6361,6 +6372,15 @@ qemuDomainJobAllowed(qemuDomainObjPrivatePtr priv, qemuDomainJob job)
>>      return !priv->job.active && qemuDomainNestedJobAllowed(priv, job);
>>  }
>>  
>> +static bool
>> +qemuDomainObjCanSetJob(qemuDomainObjPrivatePtr priv,
>> +                       qemuDomainJob job,
>> +                       qemuDomainAgentJob agentJob)
>> +{
>> +    return (job == QEMU_JOB_NONE || !priv->job.active) &&
>> +           (agentJob == QEMU_AGENT_JOB_NONE || !priv->job.agentActive);
>> +}
>> +
>>  /* Give up waiting for mutex after 30 seconds */
>>  #define QEMU_JOB_WAIT_TIME (1000ull * 30)
>>  
>> @@ -6371,6 +6391,7 @@ static int ATTRIBUTE_NONNULL(1)
> 
> So you're competing with yourself with changes to this code from your
> other series. Who merges first ;-)

Yes, I am aware of that :-) I've got only myself to blame though.

> 
>>  qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>>                                virDomainObjPtr obj,
>>                                qemuDomainJob job,
>> +                              qemuDomainAgentJob agentJob,
>>                                qemuDomainAsyncJob asyncJob)
>>  {
>>      qemuDomainObjPrivatePtr priv = obj->privateData;
>> @@ -6383,16 +6404,15 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>>      int ret = -1;
>>      unsigned long long duration = 0;
>>      unsigned long long asyncDuration = 0;
>> -    const char *jobStr;
>>  
>> -    if (async)
>> -        jobStr = qemuDomainAsyncJobTypeToString(asyncJob);
>> -    else
>> -        jobStr = qemuDomainJobTypeToString(job);
>> -
>> -    VIR_DEBUG("Starting %s: %s (vm=%p name=%s, current job=%s async=%s)",
>> -              async ? "async job" : "job", jobStr, obj, obj->def->name,
>> +    VIR_DEBUG("Starting job: job=%s agentJob=%s asyncJob=%s "
>> +              "(vm=%p name=%s, current job=%s agentJob=%s async=%s)",
>> +              qemuDomainJobTypeToString(job),
>> +              qemuDomainAgentJobTypeToString(agentJob),
>> +              qemuDomainAsyncJobTypeToString(asyncJob),
>> +              obj, obj->def->name,
>>                qemuDomainJobTypeToString(priv->job.active),
>> +              qemuDomainAgentJobTypeToString(priv->job.agentActive),
>>                qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
> 
> Some random grumbling about splitting patches ;-)
> 
>>  
>>      if (virTimeMillisNow(&now) < 0) {
>> @@ -6416,7 +6436,7 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>>              goto error;
>>      }
>>  
>> -    while (priv->job.active) {
>> +    while (!qemuDomainObjCanSetJob(priv, job, agentJob)) {
>>          VIR_DEBUG("Waiting for job (vm=%p name=%s)", obj, obj->def->name);
>>          if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
>>              goto error;
> 
> Part of me says this check is ripe for some future coding error, but
> logically AFAICT it works.

Well, the whole reason why any virCondWait() MUST be guarded with a
while() loop is that pthread_cond_wait() can wake up at any time even
without anybody signalling it. This is taken from the
pthread_cond_wait(3p) man page:

  When using condition variables there is always a Boolean predicate
  involving shared variables associated with each condition wait that is
  true if the thread should proceed. Spurious wakeups from the
  pthread_cond_timedwait() or pthread_cond_wait()  functions may occur.
  Since the return from pthread_cond_timedwait() or pthread_cond_wait()
  does not imply anything about the value of this predicate, the
  predicate should be re-evaluated upon such return.

Therefore, if there's a thread waiting for, say, an agent job, and it
gets signalled by another thread ending a monitor job, it is woken up
but realizes this is a "spurious" wake up and returns to virCondWait()
immediately.

IOW, yes we will get more wake ups here but it also enables us to grab
both jobs at the same time (without us having to worry about how to wait
on two pthread_cond at the same time).
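
So the wait distills to the canonical predicate-guarded loop
(simplified sketch of the hunk above; the real code also handles the
30 second timeout and error paths):

    while (!qemuDomainObjCanSetJob(priv, job, agentJob)) {
        /* Releases the domain object lock and sleeps. The wake up may
         * be spurious or caused by an unrelated job ending; either way
         * the predicate is re-evaluated before we proceed. */
        if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
            goto error;
    }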


> 
>> @@ -6427,32 +6447,48 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>>      if (!nested && !qemuDomainNestedJobAllowed(priv, job))
>>          goto retry;
>>  
>> -    qemuDomainObjResetJob(priv);
>> -
>>      ignore_value(virTimeMillisNow(&now));
>>  
>> -    if (job != QEMU_JOB_ASYNC) {
>> -        VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
>> -                   qemuDomainJobTypeToString(job),
>> -                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
>> -                  obj, obj->def->name);
>> -        priv->job.active = job;
>> -        priv->job.owner = virThreadSelfID();
>> -        priv->job.ownerAPI = virThreadJobGet();
>> +    if (job) {
>> +        qemuDomainObjResetJob(priv);
>> +
>> +        if (job != QEMU_JOB_ASYNC) {
>> +            VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
>> +                      qemuDomainJobTypeToString(job),
>> +                      qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
>> +                      obj, obj->def->name);
>> +            priv->job.active = job;
>> +            priv->job.owner = virThreadSelfID();
>> +            priv->job.ownerAPI = virThreadJobGet();
>> +            priv->job.started = now;
>> +        } else {
>> +            VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
>> +                      qemuDomainAsyncJobTypeToString(asyncJob),
>> +                      obj, obj->def->name);
>> +            qemuDomainObjResetAsyncJob(priv);
>> +            if (VIR_ALLOC(priv->job.current) < 0)
>> +                goto cleanup;
>> +            priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
>> +            priv->job.asyncJob = asyncJob;
>> +            priv->job.asyncOwner = virThreadSelfID();
>> +            priv->job.asyncOwnerAPI = virThreadJobGet();
>> +            priv->job.asyncStarted = now;
>> +            priv->job.current->started = now;
>> +        }
>> +    }
>> +
>> +    if (agentJob) {
>> +        qemuDomainObjResetAgentJob(priv);
>> +
>> +        VIR_DEBUG("Started agent job: %s (vm=%p name=%s job=%s async=%s)",
>> +                  qemuDomainAgentJobTypeToString(agentJob),
>> +                  obj, obj->def->name,
>> +                  qemuDomainJobTypeToString(priv->job.active),
>> +                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
>> +        priv->job.agentActive = agentJob;
>> +        priv->job.agentOwner = virThreadSelfID();
>> +        priv->job.agentOwnerAPI = virThreadJobGet();
>>          priv->job.started = now;
>> -    } else {
>> -        VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
>> -                  qemuDomainAsyncJobTypeToString(asyncJob),
>> -                  obj, obj->def->name);
>> -        qemuDomainObjResetAsyncJob(priv);
>> -        if (VIR_ALLOC(priv->job.current) < 0)
>> -            goto cleanup;
>> -        priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
>> -        priv->job.asyncJob = asyncJob;
>> -        priv->job.asyncOwner = virThreadSelfID();
>> -        priv->job.asyncOwnerAPI = virThreadJobGet();
>> -        priv->job.asyncStarted = now;
>> -        priv->job.current->started = now;
> 
> So after reading the code and thinking about the documentation, IMO,
> having agentJob use job.cond is confusing.  Having JobWithAgent is even
> more confusing. But it seems the code would work

Again, naming is hard. So any suggestions for better names of functions
are welcome. Basically what we need are names to cover the following
cases:

                   | needs monitor | doesn't need monitor |
                   +---------------+----------------------+
needs agent        |      X        |          X           |
                   +---------------+----------------------+
doesn't need agent |      X        |          0           |
                   +---------------+----------------------+
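
With the names as posted the mapping is:

  needs agent + monitor  ->  qemuDomainObjBeginJobWithAgent()
  needs agent only       ->  qemuDomainObjBeginAgentJob()
  needs monitor only     ->  qemuDomainObjBeginJob()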

> 
> I think what concerns me is there are 3 paths (I read patch 4) that
> would seemingly indicate that by starting an agentJob we block a normal
> domain job. In order to start one, IIUC both agentJob and normal job
> must be 0. 

No. I don't see that in code. What I see is the following predicate:

  return (job == QEMU_JOB_NONE || !priv->job.active) &&
         (agentJob == QEMU_AGENT_JOB_NONE || !priv->job.agentActive);

So if a thread wants just agentJob (in which case job == QEMU_JOB_NONE)
this predicate is true (= thread can set agent job) as long as
agentActive is unset. Similarly, if a thread wants just monitor job
(agentJob == QEMU_AGENT_JOB_NONE) this predicate allows that as long as
priv->job.active is 0. And finally, if a thread wants both jobs, both
priv->job.active AND priv->job.agentActive have to be equal to zero.

And here you can see how having just one condition pays off: ALL threads
waiting are woken up, but only the one that can set the job will
successfully set it. The rest go back to sleep.
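
To make this concrete, suppose another thread currently holds just an
agent job (priv->job.active == 0, priv->job.agentActive != 0). Then the
predicate evaluates as:

  wanted job      | wanted agentJob       | result
  QEMU_JOB_MODIFY | QEMU_AGENT_JOB_NONE   | true  (monitor is free)
  QEMU_JOB_NONE   | QEMU_AGENT_JOB_MODIFY | false (agent is taken)
  QEMU_JOB_MODIFY | QEMU_AGENT_JOB_MODIFY | false (both must be free)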

> That seems reasonable until one starts thinking about races
> when there's a thread waiting for a JobWithAgent and a thread waiting
> for one or the other. Combine that with the check "if (!nested &&
> !qemuDomainNestedJobAllowed(priv, job))" which may now pass NONE for
> @job if one of those threads was an agentJob.

I'm failing to see any race here, sorry. Can you give me an example
please?

> 
>>      }
>>  
>>      if (qemuDomainTrackJob(job))
>> @@ -6534,12 +6570,31 @@ int qemuDomainObjBeginJob(virQEMUDriverPtr driver,
>>                            qemuDomainJob job)
>>  {
>>      if (qemuDomainObjBeginJobInternal(driver, obj, job,
>> +                                      QEMU_AGENT_JOB_NONE,
>>                                        QEMU_ASYNC_JOB_NONE) < 0)
>>          return -1;
>>      else
>>          return 0;
>>  }
>>  
> 
> This could use an intro comment regarding usage.

Oh, sure.

> 
>> +int qemuDomainObjBeginAgentJob(virQEMUDriverPtr driver,
>> +                               virDomainObjPtr obj,
>> +                               qemuDomainAgentJob agentJob)
>> +{
> 
> Read patch 3 and the following will make sense...
> 
>     if (agentJob == QEMU_AGENT_JOB_MODIFY_BLOCK)
>         return qemuDomainObjBeginJobInternal(driver, obj,
>                                              QEMU_JOB_MODIFY,
>                                              agentJob,
>                                              QEMU_ASYNC_JOB_NONE);
>     else
>         return qemuDomainObjBeginJobInternal(driver, obj,
>                                              QEMU_JOB_NONE,
>                                              agentJob,
>                                              QEMU_ASYNC_JOB_NONE);
> 
>> +    return qemuDomainObjBeginJobInternal(driver, obj, QEMU_JOB_NONE,
>> +                                         agentJob,
>> +                                         QEMU_ASYNC_JOB_NONE);
>> +}
>> +
>> +int qemuDomainObjBeginJobWithAgent(virQEMUDriverPtr driver,
>> +                                   virDomainObjPtr obj,
>> +                                   qemuDomainJob job,
>> +                                   qemuDomainAgentJob agentJob)
>> +{
>> +    return qemuDomainObjBeginJobInternal(driver, obj, job,
>> +                                        agentJob, QEMU_ASYNC_JOB_NONE);
>> +}
>> +
> 
> if you got with the above, then this one's not necessary.

No. It is necessary.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 2/4] qemu: Introduce APIs for manipulating qemuDomainAgentJob
Posted by John Ferlan, 1 week ago

On 06/14/2018 10:32 AM, Michal Privoznik wrote:
> On 06/14/2018 12:10 AM, John Ferlan wrote:
>>
>>
>> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>>> The point is to break QEMU_JOB_* into smaller pieces which
>>> enables us to achieve higher throughput. For instance, if there
>>> are two threads, one is trying to query something on qemu
>>> monitor while the other is trying to query something on agent
>>> monitor these two threads would serialize. There is not much
>>> reason for that.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>>> ---
>>>  src/qemu/THREADS.txt   |  62 +++++++++++++++++--
>>>  src/qemu/qemu_domain.c | 163 +++++++++++++++++++++++++++++++++++++++----------
>>>  src/qemu/qemu_domain.h |  12 ++++
>>>  3 files changed, 200 insertions(+), 37 deletions(-)
>>>
>>
>> Looked at THREADS.txt before reading any code in order to get a sense
>> for what's being done... I won't go back after reading the code so that
>> you get a sense of how someone not knowing what's going on views things.
>> So when reading comments here - just take them with that in mind. I have
>> a feeling my mind/thoughts will change later on ;-).
>>
>>> diff --git a/src/qemu/THREADS.txt b/src/qemu/THREADS.txt
>>> index 7243161fe3..6219f46a81 100644
>>> --- a/src/qemu/THREADS.txt
>>> +++ b/src/qemu/THREADS.txt
>>> @@ -51,8 +51,8 @@ There are a number of locks on various objects
>>>      Since virDomainObjPtr lock must not be held during sleeps, the job
>>>      conditions provide additional protection for code making updates.
>>>  
>>> -    QEMU driver uses two kinds of job conditions: asynchronous and
>>> -    normal.
>>> +    QEMU driver uses three kinds of job conditions: asynchronous, agent
>>> +    and normal.
>>>  
>>>      Asynchronous job condition is used for long running jobs (such as
>>>      migration) that consist of several monitor commands and it is
>>> @@ -69,9 +69,15 @@ There are a number of locks on various objects
>>>      it needs to wait until the asynchronous job ends and try to acquire
>>>      the job again.
>>>  
>>> +    Agent job condition is then used when thread wishes to talk to qemu
>>
>> s/then//
>> s/when thread/when the thread/
>> s/to qemu/to the qemu/
>>
>>> +    agent monitor. It is possible to acquire just agent job
>>> +    (qemuDomainObjBeginAgentJob), or only normal job
>>> +    (qemuDomainObjBeginJob) or both at the same time
>>> +    (qemuDomainObjBeginJobWithAgent).
>>> +
>>
>> Oh this seems like it's going to be tricky...  How does one choose which
>> to use. 
> 
> By reading further. There are examples which make it clear.

Not sure I 100% agree with that statement, but will point out that the
descriptions for async and normal don't list API names.

I get the sense for why to choose async job and that a normal job
encompasses the rest. What an agent job is and why it's selected - it's
less clear, but perhaps the following helps alters that...

An agent job condition is used for any command that uses the domain's
guest agent to query or modify guest state via supported agent commands.
Using this job condition allows one thread to query/modify through the
agent while another, normal thread does the same for non guest agent
related data. For certain guest agent commands (shutdown, reboot, and
set time), a normal job cannot be running, so those APIs will wait for
any normal thread to finish before proceeding.


> 
>> At least w/ async you know it's a long running job and normal is
>> everything else.
> 
> Not really. How long must a job be to be considered long running? You
> (as reader) don't know at this point so you carry on reading and
> examples make it all clear.
> 
> But okay, I can try to expand the description.
> 
>> Now you have category 3 agent specific - at this point
>> I'm not sure you really want to mix normal and agent. If the purpose of
>> the series is to extricate agent job from normal job, then mixing when
>> it's convenient leads to speculation of what misunderstandings will
>> occur for future consumers.
> 
> Sometimes an API accesses both monitor AND agent. In this case it needs
> to grab both types of jobs.
> 

Right, remember the context - I read THREADS.txt before reading the code
and I didn't go back to modify any comments in this section. Kind of a
"don't pollute my point of reference with how I read the code" thing...

>>
>> Because of the text for "Normal job condition is used by all other
>> jobs", I wonder if the agent job description should go above Normal job;
>> otherwise, perhaps it'd be best to distinguish "qemu monitor" jobs in
>> the normal description.
>>
>>>      Immediately after acquiring the virDomainObjPtr lock, any method
>>> -    which intends to update state must acquire either asynchronous or
>>> -    normal job condition.  The virDomainObjPtr lock is released while
>>> +    which intends to update state must acquire asynchronous, normal or
>>> +    agent job condition. The virDomainObjPtr lock is released while
>>
>> There is no agent job condition, yet. And how on earth do you do both
>> agent and normal?
> 
> There's a difference between "job condition" and pthread_cond_t. What
> this paragraph understands as "job condition" is qemuDomainJob really.
> And in the second hunk of my patch I'm enumerating all three ways to
> acquire jobs.
> I'll drop the word "condition", which is apparently confusing.
> 

mmm... true... use of "XXXX job condition" is confusing, but it's what
it's named and apparently didn't cause confusion previously.

>>
>>>      blocking on these condition variables.  Once the job condition is
>>>      acquired, a method can safely release the virDomainObjPtr lock
>>>      whenever it hits a piece of code which may sleep/wait, and
>>> @@ -132,6 +138,30 @@ To acquire the normal job condition
>>>  
>>>  
>>>  
>>> +To acquire the agent job condition
>>> +
>>> +  qemuDomainObjBeginAgentJob()
>>> +    - Waits until there is no other agent job set
>>
>> So no agent job before starting, but...
>>
>>> +    - Sets job.agentActive to the job type
>>> +
>>> +  qemuDomainObjEndAgentJob()
>>> +    - Sets job.agentActive to 0
>>> +    - Signals on job.cond condition
>>
>> ... we signal on job.cond even though we didn't necessarily obtain?
> 
> Yes. I don't see the problem here.
> 

Of course not, you wrote the patches!

Again, my point of reference.  There's a job.asyncCond and a job.cond
with job.cond being used for both - it just read strangely.  Once you're
intimate with the code, different story.

[...]

>>> -    while (priv->job.active) {
>>> +    while (!qemuDomainObjCanSetJob(priv, job, agentJob)) {
>>>          VIR_DEBUG("Waiting for job (vm=%p name=%s)", obj, obj->def->name);
>>>          if (virCondWaitUntil(&priv->job.cond, &obj->parent.lock, then) < 0)
>>>              goto error;
>>
>> Part of me says this check is ripe for some future coding error, but
>> logically AFAICT it works.
> 
> Well, the whole reason why any virCondWait() MUST be guarded with a
> while() loop is that pthread_cond_wait() can wake up at any time even
> without anybody signalling it. This is taken from the
> pthread_cond_wait(3p) man page:
> 
>   When using condition variables there is always a Boolean predicate
>   involving shared variables associated with each condition wait that is
>   true if the thread should proceed. Spurious wakeups from the
>   pthread_cond_timedwait() or pthread_cond_wait()  functions may occur.
>   Since the return from pthread_cond_timedwait() or pthread_cond_wait()
>   does not imply anything about the value of this predicate, the
>   predicate should be re-evaluated upon such return.
> 
> Therefore, if there's a thread waiting for, say, an agent job, and it
> gets signalled by another thread ending a monitor job, it is woken up
> but realizes this is a "spurious" wake up and returns to virCondWait()
> immediately.
> 
> IOW, yes we will get more wake ups here but it also enables us to grab
> both jobs at the same time (without us having to worry about how to wait
> on two pthread_cond at the same time).
> 

OK - I came to realize that for the most part, without of course the
extra reference to the man page. It's nuances like this that I generally
have to reacquaint myself with when debugging problems in code (not that
I'd necessarily stop to read extra normally helpful comments ;-)).

> 
>>
>>> @@ -6427,32 +6447,48 @@ qemuDomainObjBeginJobInternal(virQEMUDriverPtr driver,
>>>      if (!nested && !qemuDomainNestedJobAllowed(priv, job))
>>>          goto retry;
>>>  
>>> -    qemuDomainObjResetJob(priv);
>>> -
>>>      ignore_value(virTimeMillisNow(&now));
>>>  
>>> -    if (job != QEMU_JOB_ASYNC) {
>>> -        VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
>>> -                   qemuDomainJobTypeToString(job),
>>> -                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
>>> -                  obj, obj->def->name);
>>> -        priv->job.active = job;
>>> -        priv->job.owner = virThreadSelfID();
>>> -        priv->job.ownerAPI = virThreadJobGet();
>>> +    if (job) {
>>> +        qemuDomainObjResetJob(priv);
>>> +
>>> +        if (job != QEMU_JOB_ASYNC) {
>>> +            VIR_DEBUG("Started job: %s (async=%s vm=%p name=%s)",
>>> +                      qemuDomainJobTypeToString(job),
>>> +                      qemuDomainAsyncJobTypeToString(priv->job.asyncJob),
>>> +                      obj, obj->def->name);
>>> +            priv->job.active = job;
>>> +            priv->job.owner = virThreadSelfID();
>>> +            priv->job.ownerAPI = virThreadJobGet();
>>> +            priv->job.started = now;
>>> +        } else {
>>> +            VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
>>> +                      qemuDomainAsyncJobTypeToString(asyncJob),
>>> +                      obj, obj->def->name);
>>> +            qemuDomainObjResetAsyncJob(priv);
>>> +            if (VIR_ALLOC(priv->job.current) < 0)
>>> +                goto cleanup;
>>> +            priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
>>> +            priv->job.asyncJob = asyncJob;
>>> +            priv->job.asyncOwner = virThreadSelfID();
>>> +            priv->job.asyncOwnerAPI = virThreadJobGet();
>>> +            priv->job.asyncStarted = now;
>>> +            priv->job.current->started = now;
>>> +        }
>>> +    }
>>> +
>>> +    if (agentJob) {
>>> +        qemuDomainObjResetAgentJob(priv);
>>> +
>>> +        VIR_DEBUG("Started agent job: %s (vm=%p name=%s job=%s async=%s)",
>>> +                  qemuDomainAgentJobTypeToString(agentJob),
>>> +                  obj, obj->def->name,
>>> +                  qemuDomainJobTypeToString(priv->job.active),
>>> +                  qemuDomainAsyncJobTypeToString(priv->job.asyncJob));
>>> +        priv->job.agentActive = agentJob;
>>> +        priv->job.agentOwner = virThreadSelfID();
>>> +        priv->job.agentOwnerAPI = virThreadJobGet();
>>>          priv->job.started = now;
>>> -    } else {
>>> -        VIR_DEBUG("Started async job: %s (vm=%p name=%s)",
>>> -                  qemuDomainAsyncJobTypeToString(asyncJob),
>>> -                  obj, obj->def->name);
>>> -        qemuDomainObjResetAsyncJob(priv);
>>> -        if (VIR_ALLOC(priv->job.current) < 0)
>>> -            goto cleanup;
>>> -        priv->job.current->status = QEMU_DOMAIN_JOB_STATUS_ACTIVE;
>>> -        priv->job.asyncJob = asyncJob;
>>> -        priv->job.asyncOwner = virThreadSelfID();
>>> -        priv->job.asyncOwnerAPI = virThreadJobGet();
>>> -        priv->job.asyncStarted = now;
>>> -        priv->job.current->started = now;
>>
>> So after reading the code and thinking about the documentation, IMO,
>> having agentJob use job.cond is confusing.  Having JobWithAgent is even
>> more confusing. But it seems the code would work
> 
> Again, naming is hard. So any suggestions for better names of functions
> are welcome. Basically what we need are names to cover the following
> cases:
> 
>                    | needs monitor | doesn't need monitor |
>                    +---------------+----------------------+
> needs agent        |      X        |          X           |
>                    +---------------+----------------------+
> doesn't need agent |      X        |          0           |
>                    +---------------+----------------------+
> 

The "confusion" is squarely with the separate cond for async, but no
separate cond for normal and agent. Whether generating historical code
comments will help or not - who's to say. Noting that cond is used for
both normal and agent in some comment could help, but it requires
someone to read the comment too!

As for naming, yep too many opinions - and you know what opinions are
like, right? ;-)  Still you have "BeginAgentJob" and "BeginJobWithAgent"
where the latter is in some way a combined job. Whether the agent is
used is only determined after the Job is obtained (for shutdown and
reboot) or when the job requires both agent and monitor calls (for set
time).  That's why I went with the BLOCK name because I couldn't think
of anything better either!

In all 3 cases though we don't want other agent or monitor jobs to run
concurrently, so I understand the need to get both. It's this
intersection that did cause the most concern while reviewing.

>>
>> I think what concerns me is there are 3 paths (I read patch 4) that
>> would seemingly indicate that by starting an agentJob we block a normal
>> domain job. In order to start one, IIUC both agentJob and normal job
>> must be 0. 
> 
> No. I don't see that in code. What I see is the following predicate:
> 
>   return (job == QEMU_JOB_NONE || !priv->job.active) &&
>          (agentJob == QEMU_AGENT_JOB_NONE || !priv->job.agentActive);
> 
> So if a thread wants just agentJob (in which case job == QEMU_JOB_NONE)
> this predicate is true (= thread can set agent job) as long as
> agentActive is unset. Similarly, if a thread wants just monitor job
> (agentJob == QEMU_AGENT_JOB_NONE) this predicate allows that as long as
> priv->job.active is 0. And finally, if a thread wants both jobs, both
> priv->job.active AND priv->job.agentActive have to be equal to zero.
> 
> And here you can see how having just one condition pays off: ALL threads
> waiting are woken up, but only the one that can set the job will
> successfully set it. The rest go back to sleep.
> 

Maybe I wasn't clear enough... There are 3 paths (shutdown, reboot, and
set time) which require both job.active and job.agentActive to be 0
before proceeding. In the totality of things, yes, things seem to work.
I still had lingering thoughts and wanted in a way to be sure the way I
was understanding things matched up. As much as there is separation
there is an amount of subtle co-dependency which wasn't perhaps obvious
the first time or two that I read the code. After studying it for a
while, writing thoughts, etc. I've come to understand it better.
Sometimes it helps to write down things and that jiggles a different
thought for the author as well - for others it upsets them. Sorry for
the churn.

>> That seems reasonable until one starts thinking about races
>> when there's a thread waiting for a JobWithAgent and a thread waiting
>> for one or the other. Combine that with the check "if (!nested &&
>> !qemuDomainNestedJobAllowed(priv, job))" which may now pass NONE for
>> @job if one of those threads was an agentJob.
> 
> I'm failing to see any race here, sorry. Can you give me an example
> please?
> 

A race doesn't exist until someone finds one. I didn't find one or I
would have noted it - I wouldn't leave you hanging like that trying to
guess! It's the subsequent NestedJobAllowed call check that caused some
level of doubt/concern, but I couldn't put my finger on anything specific.

John

[...]

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 3/4] qemuDomainAgentJob: Introduce query and modify jobs
Posted by Michal Privoznik, 2 weeks ago
These jobs can be used to mark the job type over the qemu agent.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/qemu/qemu_domain.c | 4 +++-
 src/qemu/qemu_domain.h | 2 ++
 2 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
index 09404f6569..e5e11f0cb7 100644
--- a/src/qemu/qemu_domain.c
+++ b/src/qemu/qemu_domain.c
@@ -94,7 +94,9 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
 );
 
 VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
-              "none"
+              "none",
+              "query",
+              "modify",
 );
 
 VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
index 6ada26ca99..f2759951e5 100644
--- a/src/qemu/qemu_domain.h
+++ b/src/qemu/qemu_domain.h
@@ -84,6 +84,8 @@ VIR_ENUM_DECL(qemuDomainJob)
 
 typedef enum {
     QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
+    QEMU_AGENT_JOB_QUERY,       /* Does not change state of domain */
+    QEMU_AGENT_JOB_MODIFY,      /* May change state of domain */
 
     QEMU_AGENT_JOB_LAST
 } qemuDomainAgentJob;
-- 
2.16.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 3/4] qemuDomainAgentJob: Introduce query and modify jobs
Posted by John Ferlan, 1 week ago

On 06/08/2018 09:45 AM, Michal Privoznik wrote:
> These jobs can be used to mark the job type over the qemu agent.
> 
> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/qemu/qemu_domain.c | 4 +++-
>  src/qemu/qemu_domain.h | 2 ++
>  2 files changed, 5 insertions(+), 1 deletion(-)
> 
> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
> index 09404f6569..e5e11f0cb7 100644
> --- a/src/qemu/qemu_domain.c
> +++ b/src/qemu/qemu_domain.c
> @@ -94,7 +94,9 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
>  );
>  
>  VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
> -              "none"
> +              "none",
> +              "query",
> +              "modify",
>  );
>  
>  VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 6ada26ca99..f2759951e5 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -84,6 +84,8 @@ VIR_ENUM_DECL(qemuDomainJob)
>  
>  typedef enum {
>      QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
> +    QEMU_AGENT_JOB_QUERY,       /* Does not change state of domain */
> +    QEMU_AGENT_JOB_MODIFY,      /* May change state of domain */

Seems there are 2 classes of MODIFY - one is just agent modify and one
is agent and normal modify. Maybe a QEMU_AGENT_JOB_MODIFY_BLOCK would
provide some clarity or easier-to-read code. A MODIFY_BLOCK ensures a
normal job cannot occur at the same time because something is being done
(reboot, shutdown, and settime currently).

Then the black magic for starting both a normal and an agent job is left
to qemuDomainObjBeginAgentJob obviating qemuDomainObjBeginJobWithAgent.
Similarly, the specific logic in EndJobWithAgent is used when
agentActive == MODIFY_BLOCK.
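
i.e. roughly (hypothetical, not part of this series):

    QEMU_AGENT_JOB_MODIFY_BLOCK, /* May change state of domain and
                                  * also excludes normal jobs while
                                  * set (shutdown, reboot, settime) */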

>  
>      QEMU_AGENT_JOB_LAST
>  } qemuDomainAgentJob;
> 

As noted earlier, this should just be merged into patch 1.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 3/4] qemuDomainAgentJob: Introduce query and modify jobs
Posted by Michal Privoznik, 1 week ago
On 06/14/2018 12:11 AM, John Ferlan wrote:
> 
> 
> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>> These jobs can be used to mark the job type over the qemu agent.
>>
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>  src/qemu/qemu_domain.c | 4 +++-
>>  src/qemu/qemu_domain.h | 2 ++
>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>> index 09404f6569..e5e11f0cb7 100644
>> --- a/src/qemu/qemu_domain.c
>> +++ b/src/qemu/qemu_domain.c
>> @@ -94,7 +94,9 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
>>  );
>>  
>>  VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
>> -              "none"
>> +              "none",
>> +              "query",
>> +              "modify",
>>  );
>>  
>>  VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>> index 6ada26ca99..f2759951e5 100644
>> --- a/src/qemu/qemu_domain.h
>> +++ b/src/qemu/qemu_domain.h
>> @@ -84,6 +84,8 @@ VIR_ENUM_DECL(qemuDomainJob)
>>  
>>  typedef enum {
>>      QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
>> +    QEMU_AGENT_JOB_QUERY,       /* Does not change state of domain */
>> +    QEMU_AGENT_JOB_MODIFY,      /* May change state of domain */
> 
> Seems there are 2 classes of MODIFY - one is just agent modify and one
> is agent and normal modify. Maybe a QEMU_AGENT_JOB_MODIFY_BLOCK would
> provide some clarity or easier-to-read code. A MODIFY_BLOCK ensures a
> normal job cannot occur at the same time because something is being done
> (reboot, shutdown, and settime currently).
> 
> Then the black magic for starting both a normal and an agent job is left
> to qemuDomainObjBeginAgentJob obviating qemuDomainObjBeginJobWithAgent.
> Similarly, the specific logic in EndJobWithAgent is used when
> agentActive == MODIFY_BLOCK.


No. We have to avoid any dependencies between these two jobs as much as
we can. Here, I can offer another view on jobs. Imagine you have a
struct (say virDomainDef) and you need to protect it from multiple
accesses. So you introduce a big lock and anything that touches the struct
will serialize there. This is what we currently have as jobs. But there
is a bottleneck, because in the struct you have two completely
independent members but you still have one global lock. This is
suboptimal so you invent two locks: one to guard one member, the other
to guard the other member. Cool, so anybody who wishes to access the
first member grabs the first lock, and anybody wishing to access the
second member grabs the second lock. Except you have a couple of places
where you need to grab both. Well, so you do.
And our jobs are like locks (one thread can't set a job until the other
ends it).
Therefore, we have to avoid situation where we could deadlock: that is
one thread has monitor job, the other has agent job and both try to
acquire job of each other (classical deadlock scenario). Sure, you can't
really prevent that with pure syntax, but if we have the third function
(which grabs both jobs) then it's at least more visible what is going
on, I guess.
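
To illustrate the scenario the third function helps avoid (a
hypothetical interleaving; nothing in this series acquires the two
jobs one after the other):

  Thread A                          Thread B
  qemuDomainObjBeginJob(...)        qemuDomainObjBeginAgentJob(...)
  qemuDomainObjBeginAgentJob(...)   qemuDomainObjBeginJob(...)
    -> waits on B forever             -> waits on A forever

With qemuDomainObjBeginJobWithAgent() both jobs are set under the
single job.cond, so a thread either grabs both at once or keeps
waiting; it never sleeps holding just one of them.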

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 3/4] qemuDomainAgentJob: Introduce query and modify jobs
Posted by John Ferlan, 1 week ago

On 06/14/2018 10:22 AM, Michal Privoznik wrote:
> On 06/14/2018 12:11 AM, John Ferlan wrote:
>>
>>
>> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>>> These jobs can be used to mark the job type over the qemu agent.
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>>> ---
>>>  src/qemu/qemu_domain.c | 4 +++-
>>>  src/qemu/qemu_domain.h | 2 ++
>>>  2 files changed, 5 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/src/qemu/qemu_domain.c b/src/qemu/qemu_domain.c
>>> index 09404f6569..e5e11f0cb7 100644
>>> --- a/src/qemu/qemu_domain.c
>>> +++ b/src/qemu/qemu_domain.c
>>> @@ -94,7 +94,9 @@ VIR_ENUM_IMPL(qemuDomainJob, QEMU_JOB_LAST,
>>>  );
>>>  
>>>  VIR_ENUM_IMPL(qemuDomainAgentJob, QEMU_AGENT_JOB_LAST,
>>> -              "none"
>>> +              "none",
>>> +              "query",
>>> +              "modify",
>>>  );
>>>  
>>>  VIR_ENUM_IMPL(qemuDomainAsyncJob, QEMU_ASYNC_JOB_LAST,
>>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>>> index 6ada26ca99..f2759951e5 100644
>>> --- a/src/qemu/qemu_domain.h
>>> +++ b/src/qemu/qemu_domain.h
>>> @@ -84,6 +84,8 @@ VIR_ENUM_DECL(qemuDomainJob)
>>>  
>>>  typedef enum {
>>>      QEMU_AGENT_JOB_NONE = 0,    /* No agent job. */
>>> +    QEMU_AGENT_JOB_QUERY,       /* Does not change state of domain */
>>> +    QEMU_AGENT_JOB_MODIFY,      /* May change state of domain */
>>
>> Seems there are 2 classes of MODIFY - one is just agent modify and one
>> is agent and normal modify. Maybe a QEMU_AGENT_JOB_MODIFY_BLOCK would
>> provide some clarity or easier-to-read code. A MODIFY_BLOCK ensures a
>> normal job cannot occur at the same time because something is being done
>> (reboot, shutdown, and settime currently).
>>
>> Then the black magic for starting both a normal and an agent job is left
>> to qemuDomainObjBeginAgentJob obviating qemuDomainObjBeginJobWithAgent.
>> Similarly, the specific logic in EndJobWithAgent is used when
>> agentActive == MODIFY_BLOCK.
> 
> 
> No. We have to avoid any dependencies between these two jobs as much as
> we can. Here, I can offer another view on jobs. Imagine you have a
> struct (say virDomainDef) and you need to protect it from multiple
> accesses. So you introduce a big lock and anything that touches the struct
> will serialize there. This is what we currently have as jobs. But there
> is a bottleneck, because in the struct you have two completely
> independent members but you still have one global lock. This is
> suboptimal so you invent two locks: one to guard one member, the other
> to guard the other member. Cool, so anybody who wishes to access the
> first member grabs the first lock, and anybody wishing to access the
> second member grabs the second lock. Except you have a couple of places
> where you need to grab both. Well, so you do.

And anyone that has chased locks that have any type of dependency knows
about deadlocks and lock ordering. The changes appear to avoid deadlock
scenarios - at least I hope so!

> And our jobs are like locks (one thread can't set a job until the other
> ends it).
> Therefore, we have to avoid situation where we could deadlock: that is
> one thread has monitor job, the other has agent job and both try to
> acquire job of each other (classical deadlock scenario). Sure, you can't
> really prevent that with pure syntax, but if we have the third function
> (which grabs both jobs) then it's at least more visible what is going
> on, I guess.
> 

Fair enough, but if I'm going to complain about something I should offer
suggestions. I think you've presented enough of a defense of the
two-method solution for me to drop my suggestion.

John

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
[libvirt] [PATCH 4/4] qemu: Switch code to use new agent job APIs
Posted by Michal Privoznik, 2 weeks ago
There are two sets of functions here:
1) some functions talk on both monitor and agent monitor,
2) some functions only talk on agent monitor.

For functions from set 1) we need to use
qemuDomainObjBeginJobWithAgent() and for functions from set 2) we
need to use qemuDomainObjBeginAgentJob() only.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
---
 src/qemu/qemu_driver.c | 91 ++++++++++++++++++++++++++++++++------------------
 1 file changed, 58 insertions(+), 33 deletions(-)

diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
index 28769878cc..bc1368386b 100644
--- a/src/qemu/qemu_driver.c
+++ b/src/qemu/qemu_driver.c
@@ -1956,6 +1956,7 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
     bool useAgent = false, agentRequested, acpiRequested;
     bool isReboot = false;
     bool agentForced;
+    bool agentJob = false;
     int agentFlag = QEMU_AGENT_SHUTDOWN_POWERDOWN;
 
     virCheckFlags(VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN |
@@ -1982,9 +1983,14 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
     if (virDomainShutdownFlagsEnsureACL(dom->conn, vm->def, flags) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if ((!useAgent && qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) ||
+        (useAgent && qemuDomainObjBeginJobWithAgent(driver, vm,
+                                                    QEMU_JOB_MODIFY,
+                                                    QEMU_AGENT_JOB_MODIFY) < 0))
         goto cleanup;
 
+    agentJob = useAgent;
+
     if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
         virReportError(VIR_ERR_OPERATION_INVALID,
                        "%s", _("domain is not running"));
@@ -2028,7 +2034,10 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
     }
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    if (agentJob)
+        qemuDomainObjEndJobWithAgent(driver, vm);
+    else
+        qemuDomainObjEndJob(driver, vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -2051,6 +2060,7 @@ qemuDomainReboot(virDomainPtr dom, unsigned int flags)
     bool useAgent = false, agentRequested, acpiRequested;
     bool isReboot = true;
     bool agentForced;
+    bool agentJob = false;
     int agentFlag = QEMU_AGENT_SHUTDOWN_REBOOT;
 
     virCheckFlags(VIR_DOMAIN_REBOOT_ACPI_POWER_BTN |
@@ -2077,9 +2087,14 @@ qemuDomainReboot(virDomainPtr dom, unsigned int flags)
     if (virDomainRebootEnsureACL(dom->conn, vm->def, flags) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if ((!useAgent && qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) ||
+        (useAgent && qemuDomainObjBeginJobWithAgent(driver, vm,
+                                                    QEMU_JOB_MODIFY,
+                                                    QEMU_AGENT_JOB_MODIFY) < 0))
         goto cleanup;
 
+    agentJob = useAgent;
+
     agentForced = agentRequested && !acpiRequested;
     if (!qemuDomainAgentAvailable(vm, agentForced)) {
         if (agentForced)
@@ -2117,7 +2132,10 @@ qemuDomainReboot(virDomainPtr dom, unsigned int flags)
     }
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    if (agentJob)
+        qemuDomainObjEndJobWithAgent(driver, vm);
+    else
+        qemuDomainObjEndJob(driver, vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -4951,6 +4969,7 @@ qemuDomainSetVcpusFlags(virDomainPtr dom,
     virDomainDefPtr def;
     virDomainDefPtr persistentDef;
     bool hotpluggable = !!(flags & VIR_DOMAIN_VCPU_HOTPLUGGABLE);
+    bool useAgent = !!(flags & VIR_DOMAIN_VCPU_GUEST);
     int ret = -1;
 
     virCheckFlags(VIR_DOMAIN_AFFECT_LIVE |
@@ -4965,13 +4984,14 @@ qemuDomainSetVcpusFlags(virDomainPtr dom,
     if (virDomainSetVcpusFlagsEnsureACL(dom->conn, vm->def, flags) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if ((!useAgent && qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) ||
+        (useAgent && qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0))
         goto cleanup;
 
     if (virDomainObjGetDefs(vm, flags, &def, &persistentDef) < 0)
         goto endjob;
 
-    if (flags & VIR_DOMAIN_VCPU_GUEST)
+    if (useAgent)
         ret = qemuDomainSetVcpusAgent(vm, nvcpus);
     else if (flags & VIR_DOMAIN_VCPU_MAXIMUM)
         ret = qemuDomainSetVcpusMax(driver, def, persistentDef, nvcpus);
@@ -4980,7 +5000,10 @@ qemuDomainSetVcpusFlags(virDomainPtr dom,
                                          nvcpus, hotpluggable);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    if (useAgent)
+        qemuDomainObjEndAgentJob(vm);
+    else
+        qemuDomainObjEndJob(driver, vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -5431,7 +5454,7 @@ qemuDomainGetVcpusFlags(virDomainPtr dom, unsigned int flags)
         goto cleanup;
 
     if (flags & VIR_DOMAIN_VCPU_GUEST) {
-        if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
+        if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_QUERY) < 0)
             goto cleanup;
 
         if (!virDomainObjIsActive(vm)) {
@@ -5449,7 +5472,7 @@ qemuDomainGetVcpusFlags(virDomainPtr dom, unsigned int flags)
         qemuDomainObjExitAgent(vm, agent);
 
  endjob:
-        qemuDomainObjEndJob(driver, vm);
+        qemuDomainObjEndAgentJob(vm);
 
         if (ncpuinfo < 0)
             goto cleanup;
@@ -18961,7 +18984,7 @@ qemuDomainPMSuspendForDuration(virDomainPtr dom,
     if (virDomainPMSuspendForDurationEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -18992,7 +19015,7 @@ qemuDomainPMSuspendForDuration(virDomainPtr dom,
     qemuDomainObjExitAgent(vm, agent);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19077,7 +19100,7 @@ qemuDomainQemuAgentCommand(virDomainPtr domain,
     if (virDomainQemuAgentCommandEnsureACL(domain->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -19095,7 +19118,7 @@ qemuDomainQemuAgentCommand(virDomainPtr domain,
         VIR_FREE(result);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19177,7 +19200,7 @@ qemuDomainFSTrim(virDomainPtr dom,
     if (virDomainFSTrimEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (!qemuDomainAgentAvailable(vm, true))
@@ -19191,7 +19214,7 @@ qemuDomainFSTrim(virDomainPtr dom,
     qemuDomainObjExitAgent(vm, agent);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19359,7 +19382,7 @@ qemuDomainGetTime(virDomainPtr dom,
     if (virDomainGetTimeEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_QUERY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -19378,7 +19401,7 @@ qemuDomainGetTime(virDomainPtr dom,
     ret = 0;
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19409,7 +19432,9 @@ qemuDomainSetTime(virDomainPtr dom,
 
     priv = vm->privateData;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginJobWithAgent(driver, vm,
+                                       QEMU_JOB_MODIFY,
+                                       QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -19454,7 +19479,7 @@ qemuDomainSetTime(virDomainPtr dom,
     ret = 0;
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndJobWithAgent(driver, vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19480,7 +19505,7 @@ qemuDomainFSFreeze(virDomainPtr dom,
     if (virDomainFSFreezeEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -19489,7 +19514,7 @@ qemuDomainFSFreeze(virDomainPtr dom,
     ret = qemuDomainSnapshotFSFreeze(driver, vm, mountpoints, nmountpoints);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -19521,7 +19546,7 @@ qemuDomainFSThaw(virDomainPtr dom,
     if (virDomainFSThawEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -19530,7 +19555,7 @@ qemuDomainFSThaw(virDomainPtr dom,
     ret = qemuDomainSnapshotFSThaw(driver, vm, true);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -20545,7 +20570,7 @@ qemuDomainGetFSInfo(virDomainPtr dom,
     if (virDomainGetFSInfoEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_QUERY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -20565,7 +20590,7 @@ qemuDomainGetFSInfo(virDomainPtr dom,
     qemuDomainObjExitAgent(vm, agent);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -20602,7 +20627,7 @@ qemuDomainInterfaceAddresses(virDomainPtr dom,
         break;
 
     case VIR_DOMAIN_INTERFACE_ADDRESSES_SRC_AGENT:
-        if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_QUERY) < 0)
+        if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_QUERY) < 0)
             goto cleanup;
 
         if (!qemuDomainAgentAvailable(vm, true))
@@ -20613,7 +20638,7 @@ qemuDomainInterfaceAddresses(virDomainPtr dom,
         qemuDomainObjExitAgent(vm, agent);
 
     endjob:
-        qemuDomainObjEndJob(driver, vm);
+        qemuDomainObjEndAgentJob(vm);
 
         break;
 
@@ -20814,7 +20839,7 @@ qemuDomainSetUserPassword(virDomainPtr dom,
     if (virDomainSetUserPasswordEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (virDomainObjCheckActive(vm) < 0)
@@ -20834,7 +20859,7 @@ qemuDomainSetUserPassword(virDomainPtr dom,
     ret = 0;
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     virDomainObjEndAPI(&vm);
@@ -21075,7 +21100,7 @@ qemuDomainGetGuestVcpus(virDomainPtr dom,
     if (virDomainGetGuestVcpusEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_QUERY) < 0)
         goto cleanup;
 
     if (!qemuDomainAgentAvailable(vm, true))
@@ -21094,7 +21119,7 @@ qemuDomainGetGuestVcpus(virDomainPtr dom,
     ret = 0;
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     VIR_FREE(info);
@@ -21134,7 +21159,7 @@ qemuDomainSetGuestVcpus(virDomainPtr dom,
     if (virDomainSetGuestVcpusEnsureACL(dom->conn, vm->def) < 0)
         goto cleanup;
 
-    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
+    if (qemuDomainObjBeginAgentJob(driver, vm, QEMU_AGENT_JOB_MODIFY) < 0)
         goto cleanup;
 
     if (!qemuDomainAgentAvailable(vm, true))
@@ -21180,7 +21205,7 @@ qemuDomainSetGuestVcpus(virDomainPtr dom,
     qemuDomainObjExitAgent(vm, agent);
 
  endjob:
-    qemuDomainObjEndJob(driver, vm);
+    qemuDomainObjEndAgentJob(vm);
 
  cleanup:
     VIR_FREE(info);
-- 
2.16.4

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 4/4] qemu: Switch code to use new agent job APIs
Posted by John Ferlan, 1 week ago

On 06/08/2018 09:45 AM, Michal Privoznik wrote:
> There are two sets of functions here:
> 1) some functions talk on both monitor and agent monitor,
> 2) some functions only talk on agent monitor.
> 
> For functions from set 1) we need to use
> qemuDomainObjBeginJobWithAgent() and for functions from set 2) we
> need to use qemuDomainObjBeginAgentJob() only.
> 

It seems to me the category for (1) is more: some code makes the
decision whether to use agent or normal code after the point at which we
take the lock, so we need to block not only agent but also normal jobs.


> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
> ---
>  src/qemu/qemu_driver.c | 91 ++++++++++++++++++++++++++++++++------------------
>  1 file changed, 58 insertions(+), 33 deletions(-)
> 
> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
> index 28769878cc..bc1368386b 100644
> --- a/src/qemu/qemu_driver.c
> +++ b/src/qemu/qemu_driver.c
> @@ -1956,6 +1956,7 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
>      bool useAgent = false, agentRequested, acpiRequested;
>      bool isReboot = false;
>      bool agentForced;
> +    bool agentJob = false;
>      int agentFlag = QEMU_AGENT_SHUTDOWN_POWERDOWN;
>  
>      virCheckFlags(VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN |
> @@ -1982,9 +1983,14 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
>      if (virDomainShutdownFlagsEnsureACL(dom->conn, vm->def, flags) < 0)
>          goto cleanup;
>  
> -    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
> +    if ((!useAgent && qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) ||
> +        (useAgent && qemuDomainObjBeginJobWithAgent(driver, vm,
> +                                                    QEMU_JOB_MODIFY,
> +                                                    QEMU_AGENT_JOB_MODIFY) < 0))

I think if you change qemuDomainObjBeginAgentJob to use MODIFY_BLOCK,
then this just becomes a single call...

>          goto cleanup;
>  
> +    agentJob = useAgent;
> +
>      if (virDomainObjGetState(vm, NULL) != VIR_DOMAIN_RUNNING) {
>          virReportError(VIR_ERR_OPERATION_INVALID,
>                         "%s", _("domain is not running"));
> @@ -2028,7 +2034,10 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
>      }
>  
>   endjob:
> -    qemuDomainObjEndJob(driver, vm);
> +    if (agentJob)
> +        qemuDomainObjEndJobWithAgent(driver, vm);
> +    else
> +        qemuDomainObjEndJob(driver, vm);

... and this can be a single EndAgentJob... Given that the existing
EndJobWithAgent does what EndJob does - is there reason to worry about
calling EndJob at all if EndJobWithAgent is called (or whatever that
code becomes)?


There'd be similar comments for each JobWithAgent. The consumers of just
the agentJob seem to be fine... it's this combined one that looks odd.

John

[...]

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list
Re: [libvirt] [PATCH 4/4] qemu: Switch code to use new agent job APIs
Posted by Michal Privoznik, 1 week ago
On 06/14/2018 12:14 AM, John Ferlan wrote:
> 
> 
> On 06/08/2018 09:45 AM, Michal Privoznik wrote:
>> There are two sets of functions here:
>> 1) some functions talk on both monitor and agent monitor,
>> 2) some functions only talk on agent monitor.
>>
>> For functions from set 1) we need to use
>> qemuDomainObjBeginJobWithAgent() and for functions from set 2) we
>> need to use qemuDomainObjBeginAgentJob() only.
>>
> 
> It seems to me the category for (1) is more: some code makes the
> decision whether to use agent or normal code after the point at which we
> take the lock, so we need to block not only agent but also normal jobs.
> 
> 
>> Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
>> ---
>>  src/qemu/qemu_driver.c | 91 ++++++++++++++++++++++++++++++++------------------
>>  1 file changed, 58 insertions(+), 33 deletions(-)
>>
>> diff --git a/src/qemu/qemu_driver.c b/src/qemu/qemu_driver.c
>> index 28769878cc..bc1368386b 100644
>> --- a/src/qemu/qemu_driver.c
>> +++ b/src/qemu/qemu_driver.c
>> @@ -1956,6 +1956,7 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
>>      bool useAgent = false, agentRequested, acpiRequested;
>>      bool isReboot = false;
>>      bool agentForced;
>> +    bool agentJob = false;
>>      int agentFlag = QEMU_AGENT_SHUTDOWN_POWERDOWN;
>>  
>>      virCheckFlags(VIR_DOMAIN_SHUTDOWN_ACPI_POWER_BTN |
>> @@ -1982,9 +1983,14 @@ static int qemuDomainShutdownFlags(virDomainPtr dom, unsigned int flags)
>>      if (virDomainShutdownFlagsEnsureACL(dom->conn, vm->def, flags) < 0)
>>          goto cleanup;
>>  
>> -    if (qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0)
>> +    if ((!useAgent && qemuDomainObjBeginJob(driver, vm, QEMU_JOB_MODIFY) < 0) ||
>> +        (useAgent && qemuDomainObjBeginJobWithAgent(driver, vm,
>> +                                                    QEMU_JOB_MODIFY,
>> +                                                    QEMU_AGENT_JOB_MODIFY) < 0))
> 
> I think if you change qemuDomainObjBeginAgentJob to use MODIFY_BLOCK,
> then this just becomes a single call...

As ugly as my proposal looks, I think it is still better than hiding
the taking of both jobs behind one value of qemuDomainJob while the rest
of them don't do that.

Michal

--
libvir-list mailing list
libvir-list@redhat.com
https://www.redhat.com/mailman/listinfo/libvir-list