[PATCH] replay: don't wait in run_on_cpu

Pavel Dovgalyuk posted 1 patch 3 years ago
git fetch https://github.com/patchew-project/qemu tags/patchew/161700514781.1141125.8890164582302771524.stgit@pasha-ThinkPad-X280
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>, Paolo Bonzini <pbonzini@redhat.com>
[PATCH] replay: don't wait in run_on_cpu
Posted by Pavel Dovgalyuk 3 years ago
In record/replay mode, waiting for the vCPU to execute
a task scheduled by run_on_cpu may lead to a deadlock,
because when run_on_cpu is called from the main loop
(e.g., during loadvm processing) the caller holds the replay mutex.
This patch allows the scheduled task to run directly in the iothread
when that thread holds the replay mutex.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgalyuk@ispras.ru>
---
 cpus-common.c            |    9 ++++++++-
 include/sysemu/replay.h  |    1 +
 replay/replay-internal.h |    1 -
 stubs/replay-tools.c     |    5 +++++
 4 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/cpus-common.c b/cpus-common.c
index 6e73d3e58d..38ff510175 100644
--- a/cpus-common.c
+++ b/cpus-common.c
@@ -23,6 +23,7 @@
 #include "hw/core/cpu.h"
 #include "sysemu/cpus.h"
 #include "qemu/lockable.h"
+#include "sysemu/replay.h"
 
 static QemuMutex qemu_cpu_list_lock;
 static QemuCond exclusive_cond;
@@ -136,7 +137,13 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
 {
     struct qemu_work_item wi;
 
-    if (qemu_cpu_is_self(cpu)) {
+    if (qemu_cpu_is_self(cpu)
+        /*
+         * vCPU thread is waiting when replay mutex is locked
+         * and the task is not exclusive, the function may be called
+         * without other synchronization.
+         */
+        || (replay_mode != REPLAY_MODE_NONE && replay_mutex_locked())) {
         func(cpu, data);
         return;
     }
diff --git a/include/sysemu/replay.h b/include/sysemu/replay.h
index 0f3b0f7eac..032256533b 100644
--- a/include/sysemu/replay.h
+++ b/include/sysemu/replay.h
@@ -62,6 +62,7 @@ extern char *replay_snapshot;
 
 void replay_mutex_lock(void);
 void replay_mutex_unlock(void);
+bool replay_mutex_locked(void);
 
 /* Replay process control functions */
 
diff --git a/replay/replay-internal.h b/replay/replay-internal.h
index 97649ed8d7..dada623527 100644
--- a/replay/replay-internal.h
+++ b/replay/replay-internal.h
@@ -117,7 +117,6 @@ void replay_get_array_alloc(uint8_t **buf, size_t *size);
  * synchronisation between vCPU and main-loop threads. */
 
 void replay_mutex_init(void);
-bool replay_mutex_locked(void);
 
 /*! Checks error status of the file. */
 void replay_check_error(void);
diff --git a/stubs/replay-tools.c b/stubs/replay-tools.c
index 43296b3d4e..a42f2483d5 100644
--- a/stubs/replay-tools.c
+++ b/stubs/replay-tools.c
@@ -48,6 +48,11 @@ void replay_mutex_unlock(void)
 {
 }
 
+bool replay_mutex_locked(void)
+{
+    return false;
+}
+
 void replay_register_char_driver(Chardev *chr)
 {
 }


Re: [PATCH] replay: don't wait in run_on_cpu
Posted by Paolo Bonzini 3 years ago
On 29/03/21 10:05, Pavel Dovgalyuk wrote:
> @@ -136,7 +137,13 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
>   {
>       struct qemu_work_item wi;
>   
> -    if (qemu_cpu_is_self(cpu)) {
> +    if (qemu_cpu_is_self(cpu)
> +        /*
> +         * vCPU thread is waiting when replay mutex is locked
> +         * and the task is not exclusive, the function may be called
> +         * without other synchronization.
> +         */
> +        || (replay_mode != REPLAY_MODE_NONE && replay_mutex_locked())) {
>           func(cpu, data);
>           return;
>       }

Is the "or" saying that the execution is using the lockstep mode?  If 
so, can you put it in a separate function so that it's more 
self-explanatory and check if it should be used elsewhere?

Thanks,

Paolo
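
A minimal sketch of how the refactoring suggested above might look. The
helper name replay_caller_holds_turn() and the wrapper can_run_inline() are
hypothetical, and the enum, replay_mode variable, and mutex_held flag are
simplified stand-ins for QEMU's real replay state, not its actual
definitions.

```c
#include <stdbool.h>

/* Simplified stand-ins for QEMU's replay state (assumptions for this sketch). */
typedef enum {
    REPLAY_MODE_NONE,
    REPLAY_MODE_RECORD,
    REPLAY_MODE_PLAY
} ReplayMode;

static ReplayMode replay_mode = REPLAY_MODE_NONE;
static bool mutex_held;  /* models what replay_mutex_locked() would report */

static bool replay_mutex_locked(void)
{
    return mutex_held;
}

/*
 * Hypothetical helper: true when record/replay is active and the calling
 * thread holds the replay mutex, i.e. the vCPU thread is waiting for its turn.
 */
static bool replay_caller_holds_turn(void)
{
    return replay_mode != REPLAY_MODE_NONE && replay_mutex_locked();
}

/* How the condition in do_run_on_cpu() could then read. */
static bool can_run_inline(bool cpu_is_self)
{
    return cpu_is_self || replay_caller_holds_turn();
}
```

With a named predicate, the same check could be reused at other call sites
if they turn out to need it.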


Re: [PATCH] replay: don't wait in run_on_cpu
Posted by Pavel Dovgalyuk 3 years ago
On 29.03.2021 12:42, Paolo Bonzini wrote:
> On 29/03/21 10:05, Pavel Dovgalyuk wrote:
>> @@ -136,7 +137,13 @@ void do_run_on_cpu(CPUState *cpu, run_on_cpu_func func, run_on_cpu_data data,
>>   {
>>       struct qemu_work_item wi;
>> -    if (qemu_cpu_is_self(cpu)) {
>> +    if (qemu_cpu_is_self(cpu)
>> +        /*
>> +         * vCPU thread is waiting when replay mutex is locked
>> +         * and the task is not exclusive, the function may be called
>> +         * without other synchronization.
>> +         */
>> +        || (replay_mode != REPLAY_MODE_NONE && replay_mutex_locked())) {
>>           func(cpu, data);
>>           return;
>>       }
> 
> Is the "or" saying that the execution is using the lockstep mode?  If 
> so, can you put it in a separate function so that it's more 
> self-explanatory and check if it should be used elsewhere?

It was replay (is that lockstep that you mentioned?).
I check that the mutex is already locked, which means that the vCPU
is doing nothing at that moment.

Pavel Dovgalyuk
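
The deadlock described in the commit message can be illustrated with a toy,
single-threaded model. This is only a sketch under simplifying assumptions:
the function work_completes(), the MAX_STEPS bound, and the boolean flags are
invented for illustration and do not correspond to QEMU code.

```c
#include <stdbool.h>

enum { MAX_STEPS = 100 };  /* arbitrary bound so the model always terminates */

/*
 * Toy model: the main loop holds the replay mutex (e.g. during loadvm) and
 * wants a function to run on a vCPU. The vCPU can only process queued work
 * once the mutex is free. Returns true if the work completes within
 * MAX_STEPS scheduling steps.
 */
static bool work_completes(bool run_inline_when_locked)
{
    bool main_loop_holds_mutex = true;
    bool work_done = false;

    if (run_inline_when_locked && main_loop_holds_mutex) {
        /* Patched behaviour: the vCPU is idle, so call the function directly. */
        return true;
    }

    /* Unpatched behaviour: the work is queued and the main loop waits. */
    for (int step = 0; step < MAX_STEPS && !work_done; step++) {
        /* The vCPU makes progress only when the mutex is free... */
        if (!main_loop_holds_mutex) {
            work_done = true;
        }
        /* ...but the main loop releases it only after the work is done. */
        if (work_done) {
            main_loop_holds_mutex = false;
        }
    }
    return work_done;
}
```

In this model, waiting never finishes because each side depends on the other,
while the inline path completes immediately.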

Re: [PATCH] replay: don't wait in run_on_cpu
Posted by Paolo Bonzini 3 years ago
On 29/03/21 12:55, Pavel Dovgalyuk wrote:
>>>
>>
>> Is the "or" saying that the execution is using the lockstep mode?  If 
>> so, can you put it in a separate function so that it's more 
>> self-explanatory and check if it should be used elsewhere?
> 
> It was replay (is that lockstep that you mentioned?).

Lockstep in the sense that (as is the case in record/replay mode) the 
I/O thread and vCPU thread execute in turns.

Paolo

> I check that the mutex is already locked, which means that the vCPU
> is doing nothing at that moment.