[PATCH] record/replay: fix race condition on test_aarch64_reverse_debug

Vladimir Lukianov posted 1 patch 4 months, 3 weeks ago
git fetch https://github.com/patchew-project/qemu tags/patchew/20250603125459.17688-1-1844144@gmail.com
Maintainers: Paolo Bonzini <pbonzini@redhat.com>, "Alex Bennée" <alex.bennee@linaro.org>
replay/replay.c                                | 2 ++
tests/functional/test_aarch64_reverse_debug.py | 1 -
2 files changed, 2 insertions(+), 1 deletion(-)
[PATCH] record/replay: fix race condition on test_aarch64_reverse_debug
Posted by Vladimir Lukianov 4 months, 3 weeks ago
Ensure EVENT_INSTRUCTION is written to replay.bin before
EVENT_SHUTDOWN_HOST_QMP_QUIT.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2921
Signed-off-by: Vladimir Lukianov <1844144@gmail.com>
---
During the record pass, test_reverse_debug records a sequence of
events to replay.bin. Presumably due to a race condition or the
host's async implementation details, the resulting file looks like:

...
12: EVENT_CP_CLOCK_WARP_ACCOUNT(31) no additional data
13: EVENT_INSTRUCTION(0) + 59 -> 44298
14: EVENT_CP_CLOCK_WARP_ACCOUNT(31) no additional data
15: EVENT_SHUTDOWN_HOST_QMP_QUIT(12)
16: EVENT_INSTRUCTION(0) + 5587988 -> 5632286
17: EVENT_SHUTDOWN_HOST_SIGNAL(14)
18: EVENT_END(39)
Reached 162 of 162 bytes

Here, SHUTDOWN_HOST_QMP_QUIT is written before the last instruction
event. During the replay pass, QUIT is executed before the last
instruction, which causes the VM to shut down. As a result, the QMP
and GDB connections are broken, and the test cannot execute its final
steps.
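
For anyone who wants to check their own replay.bin dump for the same
misordering, here is a minimal sketch in Python. It assumes the dump
lines have the "<index>: EVENT_NAME(<id>)" shape shown in the excerpt
above; the helper names (parse_events, instructions_after_quit) are
hypothetical and not part of QEMU or its test suite.

```python
import re

# Assumed line shape, taken from the excerpt: "<index>: EVENT_NAME(<id>) ..."
EVENT_LINE = re.compile(r"^\s*(\d+):\s+(EVENT_\w+)\((\d+)\)")


def parse_events(dump_text):
    """Return (index, name) pairs for every line that looks like an event."""
    events = []
    for line in dump_text.splitlines():
        match = EVENT_LINE.match(line)
        if match:
            events.append((int(match.group(1)), match.group(2)))
    return events


def instructions_after_quit(events):
    """Return indices of EVENT_INSTRUCTION entries logged after a QMP quit."""
    seen_quit = False
    late = []
    for index, name in events:
        if name == "EVENT_SHUTDOWN_HOST_QMP_QUIT":
            seen_quit = True
        elif name == "EVENT_INSTRUCTION" and seen_quit:
            late.append(index)
    return late


# Sample copied from the dump excerpt in this message.
SAMPLE = """\
12: EVENT_CP_CLOCK_WARP_ACCOUNT(31) no additional data
13: EVENT_INSTRUCTION(0) + 59 -> 44298
14: EVENT_CP_CLOCK_WARP_ACCOUNT(31) no additional data
15: EVENT_SHUTDOWN_HOST_QMP_QUIT(12)
16: EVENT_INSTRUCTION(0) + 5587988 -> 5632286
17: EVENT_SHUTDOWN_HOST_SIGNAL(14)
18: EVENT_END(39)
Reached 162 of 162 bytes
"""

print(instructions_after_quit(parse_events(SAMPLE)))  # [16]
```

A non-empty result means at least one instruction event was recorded
after the quit request, which is the pattern described here.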

Calling replay_save_instructions() at the start of
replay_shutdown_request() ensures EVENT_INSTRUCTION is written before
EVENT_SHUTDOWN_HOST_QMP_QUIT.
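
To illustrate why flushing first changes the order, here is a toy
model in Python. It is not QEMU code: ToyRecorder and its methods are
hypothetical names that only loosely mirror replay/replay.c, and it
assumes the instruction count is written lazily, as a delta, the next
time it is saved.

```python
class ToyRecorder:
    """Toy event log with a lazily flushed instruction counter.

    NOT QEMU code; an illustrative model under the assumptions above.
    """

    def __init__(self):
        self.log = []       # events in the order they reach the "file"
        self.executed = 0   # instructions executed so far
        self.saved = 0      # instructions already written to the log

    def run(self, count):
        """Pretend the vCPU executed `count` more instructions."""
        self.executed += count

    def save_instructions(self):
        """Write the not-yet-recorded instruction delta, if any."""
        delta = self.executed - self.saved
        if delta > 0:
            self.log.append(("EVENT_INSTRUCTION", delta))
            self.saved = self.executed

    def shutdown_request(self, flush_first):
        """Record a shutdown event, optionally flushing instructions first."""
        if flush_first:
            self.save_instructions()
        self.log.append(("EVENT_SHUTDOWN", None))

    def finish(self):
        """Flush whatever is pending and close the log."""
        self.save_instructions()
        self.log.append(("EVENT_END", None))
        return self.log


def record(flush_first):
    rec = ToyRecorder()
    rec.run(100)                       # instructions pending at quit time
    rec.shutdown_request(flush_first)
    return rec.finish()


print(record(flush_first=False))  # shutdown lands before the instructions
print(record(flush_first=True))   # instructions land before the shutdown
```

Without the flush, the pending delta is only written when the log is
closed, so it ends up after the shutdown event; with the flush, it is
written first.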

Tested on my arm64 machine. This does not fix the bug on x86_64, where
the failure looks similar but is not identical.

 replay/replay.c                                | 2 ++
 tests/functional/test_aarch64_reverse_debug.py | 1 -
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/replay/replay.c b/replay/replay.c
index a3e24c96..b2121788 100644
--- a/replay/replay.c
+++ b/replay/replay.c
@@ -263,6 +263,8 @@ bool replay_has_interrupt(void)
 
 void replay_shutdown_request(ShutdownCause cause)
 {
+    replay_save_instructions();
+
     if (replay_mode == REPLAY_MODE_RECORD) {
         g_assert(replay_mutex_locked());
         replay_put_event(EVENT_SHUTDOWN + cause);
diff --git a/tests/functional/test_aarch64_reverse_debug.py b/tests/functional/test_aarch64_reverse_debug.py
index 58d45328..0ac1ccb0 100755
--- a/tests/functional/test_aarch64_reverse_debug.py
+++ b/tests/functional/test_aarch64_reverse_debug.py
@@ -26,7 +26,6 @@ class ReverseDebugging_AArch64(ReverseDebugging):
          'releases/29/Everything/aarch64/os/images/pxeboot/vmlinuz'),
         '7e1430b81c26bdd0da025eeb8fbd77b5dc961da4364af26e771bd39f379cbbf7')
 
-    @skipFlakyTest("https://gitlab.com/qemu-project/qemu/-/issues/2921")
     def test_aarch64_virt(self):
         self.set_machine('virt')
         self.cpu = 'cortex-a53'
-- 
2.34.1
Re: [PATCH] record/replay: fix race condition on test_aarch64_reverse_debug
Posted by Alex Bennée 2 weeks, 5 days ago
Vladimir Lukianov <1844144@gmail.com> writes:

> Ensures EVENT_INSTRUCTION written to replay.bin before EVENT_SHUTDOWN_HOST_QMP
>
> Resolves: https://gitlab.com/qemu-project/qemu/-/issues/2921
> Signed-off-by: Vladimir Lukianov <1844144@gmail.com>

Queued to pr/031025-10.2-maintainer-1, thanks.

-- 
Alex Bennée
Virtualisation Tech Lead @ Linaro