[PATCH v4 00/24] replay: fixes and new test cases

Nicholas Piggin posted 24 patches 1 month, 2 weeks ago
Failed in applying to current master (apply log)
Maintainers: Richard Henderson <richard.henderson@linaro.org>, Paolo Bonzini <pbonzini@redhat.com>, "Marc-André Lureau" <marcandre.lureau@redhat.com>, Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>, "Michael S. Tsirkin" <mst@redhat.com>, Jason Wang <jasowang@redhat.com>, Nicholas Piggin <npiggin@gmail.com>, Daniel Henrique Barboza <danielhb413@gmail.com>, "Cédric Le Goater" <clg@kaod.org>, David Gibson <david@gibson.dropbear.id.au>, Harsh Prateek Bora <harshpb@linux.ibm.com>, Peter Xu <peterx@redhat.com>, Fabiano Rosas <farosas@suse.de>, John Snow <jsnow@redhat.com>, Cleber Rosa <crosa@redhat.com>, "Philippe Mathieu-Daudé" <philmd@linaro.org>, Wainer dos Santos Moschetta <wainersm@redhat.com>, Beraldo Leal <bleal@redhat.com>
There is a newer version of this series
Since v3,

* Attacked the replay_linux.py bugs and found a bunch of gaps
  in networking that were causing the hangs.
* Found several powerpc bugs that were also causing problems on
  pseries.
* Added ppc test to replay_linux.py now that it's working.
* Found several crash bugs in record/replay vs migration.
* Added snapshot and more stepping tests to reverse_debugging.py
* Addressed comments in auto-snapshot code.
* Added auto-snapshot test case.
* "Solved" x86-64 issues in test cases by switching to q35, which
  seems to have fewer problems.
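
To illustrate why the networking fixes move timers onto virtual time
("virtio-net: Use virtual time for RSC timers", "net: Use virtual time
for net announce"), here is a toy model, not QEMU code. Every name in
it is hypothetical. It assumes virtual time is a pure function of guest
progress, while host time depends on how fast the host happens to run.

```python
# Toy model only: none of these names are QEMU APIs.
from typing import Callable


def fire_step(steps: int, clock: Callable[[int], int], deadline_ns: int) -> int:
    """Return the guest step at which a timer armed for deadline_ns fires."""
    for step in range(steps):
        if clock(step) >= deadline_ns:
            return step
    return -1  # the timer never fired within the run


def virtual_clock(step: int) -> int:
    # Assumption: virtual time advances 10 ns per guest step, always.
    return step * 10


def make_host_clock(ns_per_step: int) -> Callable[[int], int]:
    # Assumption: host time per guest step varies with host load.
    return lambda step: step * ns_per_step


# Timer on virtual time: same guest step in record and replay.
record_virtual = fire_step(1000, virtual_clock, 500)
replay_virtual = fire_step(1000, virtual_clock, 500)

# Timer on host time: the replay host runs at a different speed,
# so the timer fires at a different point of guest execution.
record_host = fire_step(1000, make_host_clock(10), 500)
replay_host = fire_step(1000, make_host_clock(25), 500)
```

In this model the virtual-time timer fires at the same guest step in
both runs, while the host-time timer does not, which is the kind of
divergence that can leave a replay waiting for an event that never
arrives.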

I will take the last 3 patches through the ppc tree, but they are
included here because powerpc is the only target that survives the
record-replay test with auto-snapshots at the moment.

Thanks,
Nick

Since v2, the fixes became less minor, so I renamed the series.

https://lore.kernel.org/qemu-devel/20240125160835.480488-1-npiggin@gmail.com/#r

* Found several more bugs (patches 5-8).
* Enabled the rr avocado test on pseries and aarch64 virt since they're
  passing here (and on gitlab, e.g.,
  https://gitlab.com/npiggin/qemu/-/jobs/6253787216,
  https://gitlab.com/npiggin/qemu/-/jobs/6253787218).
* Updated the replay-dump script per John's feedback.

x86-64 still has issues with the replay and reverse debugging tests.
replay_kernel.py seems to be timing dependent: after patch 5 it passed
30/30 runs, then the following day 0/30, and I realized I had several
other QEMU instances hogging the CPU, which probably changed the
timings. So the first thing I would look at is timers and clocks.
pseries had some rounding issues in time calculations that meant
clocks and timers were not replayed exactly as they were recorded,
which caused hangs.
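
As a generic illustration of how rounding can make a time calculation
disagree with itself (this is not the actual pseries code, and the
512 MHz tick rate and helper names are assumptions for the example),
consider two algebraically equivalent ways of computing a deadline
with integer arithmetic:

```python
# Toy model only: TICK_FREQ and all helpers are illustrative assumptions,
# not QEMU's timebase implementation.
NS_PER_SEC = 1_000_000_000
TICK_FREQ = 512_000_000  # assumed tick rate in Hz


def ns_to_ticks(ns: int) -> int:
    """Convert nanoseconds to ticks, rounding down."""
    return ns * TICK_FREQ // NS_PER_SEC


def ticks_to_ns(ticks: int) -> int:
    """Convert ticks to nanoseconds, rounding down."""
    return ticks * NS_PER_SEC // TICK_FREQ


def deadline_direct(now_ns: int, delay_ticks: int) -> int:
    """Convert only the delay, then add it to the current time."""
    return now_ns + ticks_to_ns(delay_ticks)


def deadline_via_ticks(now_ns: int, delay_ticks: int) -> int:
    """Convert the current time to ticks, add the delay, convert back."""
    return ticks_to_ns(ns_to_ticks(now_ns) + delay_ticks)


now_ns = 1_000_000_001
direct = deadline_direct(now_ns, 100)        # 1_000_000_196
via_ticks = deadline_via_ticks(now_ns, 100)  # 1_000_000_195
round_trip = ticks_to_ns(ns_to_ticks(now_ns))  # 1_000_000_000, not now_ns
```

The two deadlines differ by 1 ns. If one path is taken while recording
and the other while replaying, the timer no longer lines up with the
recorded event stream, so exact replay needs a single, consistent
calculation.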

Thanks,
Nick

Nicholas Piggin (24):
  scripts/replay-dump.py: Update to current rr record format
  scripts/replay-dump.py: rejig decoders in event number order
  tests/avocado: exercise scripts/replay-dump.py in replay tests
  replay: allow runstate shutdown->running when replaying trace
  Revert "replay: stop us hanging in rr_wait_io_event"
  chardev: set record/replay on the base device of a muxed device
  replay: Fix migration use of clock
  replay: Fix migration replay_mutex locking
  virtio-net: Use replay_schedule_bh_event for bhs that affect machine
    state
  virtio-net: Use virtual time for RSC timers
  net: Use virtual time for net announce
  savevm: Fix load_snapshot error path crash
  tests/avocado: replay_linux.py remove the timeout expected guards
  tests/avocado/reverse_debugging.py: mark aarch64 and pseries as not
    flaky
  tests/avocado: reverse_debugging.py add test for x86-64 q35 machine
  tests/avocado: reverse_debugging.py verify addresses between record
    and replay
  tests/avocado: reverse_debugging.py stop VM before sampling icount
  tests/avocado: reverse_debugging reverse-step at the end of the trace
  tests/avocado: reverse_debugging.py add snapshot testing
  replay: simple auto-snapshot mode for record
  tests/avocado: reverse_debugging.py test auto-snapshot mode
  target/ppc: fix timebase register reset state
  spapr: Fix vpa dispatch count for record-replay
  tests/avocado: replay_linux.py add ppc64 pseries test

 docs/system/replay.rst             |   5 +
 include/hw/ppc/spapr_cpu_core.h    |   3 +
 include/sysemu/replay.h            |  16 ++-
 include/sysemu/runstate.h          |   1 +
 accel/tcg/tcg-accel-ops-rr.c       |   2 +-
 chardev/char.c                     |  71 ++++++++----
 hw/net/virtio-net.c                |  17 +--
 hw/ppc/ppc.c                       |  11 +-
 hw/ppc/spapr.c                     |  36 +-----
 hw/ppc/spapr_hcall.c               |  33 ++++++
 hw/ppc/spapr_rtas.c                |   1 +
 migration/migration.c              |  17 ++-
 migration/savevm.c                 |   1 +
 net/announce.c                     |   2 +-
 replay/replay-snapshot.c           |  57 ++++++++++
 replay/replay.c                    |  50 ++++----
 system/runstate.c                  |  31 ++++-
 system/vl.c                        |   9 ++
 target/ppc/machine.c               |   4 +
 qemu-options.hx                    |   9 +-
 scripts/replay-dump.py             | 167 ++++++++++++++++++---------
 tests/avocado/replay_kernel.py     |  11 ++
 tests/avocado/replay_linux.py      |  97 +++++++++++++++-
 tests/avocado/reverse_debugging.py | 176 ++++++++++++++++++++++++-----
 24 files changed, 635 insertions(+), 192 deletions(-)

-- 
2.42.0