Hello,
On 8/8/23 06:19, Nicholas Piggin wrote:
> The patches in this series have been seen a few times in various
> iterations.
>
> There are two main pieces, some assorted small fixes and tests for
> record-replay, plus a large set of decrementer fixes. I merged these
> into one series rather than send decrementer fixes alone first, because
> record-replay has been very good at uncovering timer problems, so it's
> good to have those test cases in at the same time IMO.
>
> Some of the fixes we might take to stable, but unclear which.
> Decrementer fixes were a bit of a tangle so maybe we just leave those
> alone since they work okay.
>
> The decrementer is not emulated perfectly still. Underflow from -ve
> to +ve is not implemented, for one. I started doing that but it's not
> trivial so better stop here for now.
>
> For record-replay, pseries is now quite solid with rr. Surely some
> issues to iron out but it is becoming usable.
>
> powernv record-replay has some known problems migrating edge-triggered
> decrementer, and edge triggered msgsnd. Also it seems to get stuck in
> xive init somewhere when replaying from checkpoint, so there is probably
> some state in xive not being reset. But at least it runs the avocado
> test and seems close to working, so I've added that test case so we
> don't go backwards (ha!).
>
> Other machine types might not be too far off if there is interest. I
> found it quite difficult to find these problems though, reverse
> debugging will sometimes just lock up, stop at wrong location, or abort
> with wrong event. Difficult to understand what went wrong. Worst case I had
> to basically bisect the replay of the trace, and find the minimum length
> of replay that hit the problem -- that sometimes would land near a
> mtDEC or timer interrupt or similar.
>
> Thanks,
> Nick
>
> Nicholas Piggin (19):
> ppc/vhyp: reset exception state when handling vhyp hcall
> ppc/vof: Fix missed fields in VOF cleanup
> hw/ppc/ppc.c: Tidy over-long lines
> hw/ppc: Introduce functions for conversion between timebase and
> nanoseconds
> host-utils: Add muldiv64_round_up
> hw/ppc: Round up the decrementer interval when converting to ns
> hw/ppc: Avoid decrementer rounding errors
> target/ppc: Sign-extend large decrementer to 64-bits
> hw/ppc: Always store the decrementer value
> target/ppc: Migrate DECR SPR
> hw/ppc: Reset timebase facilities on machine reset
> hw/ppc: Read time only once to perform decrementer write
> target/ppc: Fix CPU reservation migration for record-replay
> target/ppc: Fix timebase reset with record-replay
> spapr: Fix machine reset deadlock from replay-record
> spapr: Fix record-replay machine reset consuming too many events
> tests/avocado: boot ppc64 pseries replay-record test to Linux VFS
> mount
> tests/avocado: reverse-debugging cope with re-executing breakpoints
> tests/avocado: ppc64 reverse debugging tests for pseries and powernv
>
> hw/ppc/mac_oldworld.c | 1 +
> hw/ppc/pegasos2.c | 1 +
> hw/ppc/pnv_core.c | 2 +
> hw/ppc/ppc.c | 236 +++++++++++++++++++----------
> hw/ppc/prep.c | 1 +
> hw/ppc/spapr.c | 32 +++-
> hw/ppc/spapr_cpu_core.c | 2 +
> hw/ppc/vof.c | 2 +
> include/hw/ppc/ppc.h | 3 +-
> include/hw/ppc/spapr.h | 2 +
> include/qemu/host-utils.h | 21 ++-
> target/ppc/compat.c | 19 +++
> target/ppc/cpu.h | 3 +
> target/ppc/excp_helper.c | 3 +
> target/ppc/machine.c | 40 ++++-
> target/ppc/translate.c | 4 +
> tests/avocado/replay_kernel.py | 3 +-
> tests/avocado/reverse_debugging.py | 54 ++++++-
> 18 files changed, 330 insertions(+), 99 deletions(-)
>
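
For anyone reviewing the decrementer patches, here is a rough sketch of
the two arithmetic ideas named in the shortlog: rounding up when
converting timebase ticks to nanoseconds (presumably so a timer never
fires before the decrementer has actually expired), and sign-extending a
large decrementer value to 64 bits. This is not the code from the series.
All names below are hypothetical, and it assumes a compiler that provides
unsigned __int128 and arithmetic right shift on signed integers (GCC and
Clang do both).

```c
#include <stdint.h>

/*
 * Hypothetical helper, NOT the series' muldiv64_round_up():
 * compute ceil(a * b / c) using a 128-bit intermediate so the
 * product cannot overflow. Assumes c != 0 and that the result
 * fits in 64 bits.
 */
static uint64_t muldiv64_ceil_sketch(uint64_t a, uint32_t b, uint32_t c)
{
    unsigned __int128 prod = (unsigned __int128)a * b;

    return (uint64_t)((prod + c - 1) / c);
}

/*
 * Convert timebase ticks to nanoseconds, rounding up, so the
 * resulting deadline is never earlier than the tick count implies.
 * tb_freq is the timebase frequency in Hz (assumed non-zero).
 */
static uint64_t tb_to_ns_round_up_sketch(uint32_t tb_freq, uint64_t ticks)
{
    return muldiv64_ceil_sketch(ticks, 1000000000u, tb_freq);
}

/*
 * Sign-extend the low nr_bits of value to a signed 64-bit integer.
 * Assumes 1 <= nr_bits <= 64 and arithmetic right shift of negative
 * values (implementation-defined in C, true on GCC and Clang).
 */
static int64_t sign_extend_sketch(uint64_t value, unsigned nr_bits)
{
    unsigned shift = 64 - nr_bits;

    return (int64_t)(value << shift) >> shift;
}
```

With a 512 MHz timebase, one tick is 1.953125 ns, so
tb_to_ns_round_up_sketch(512000000, 1) gives 2 rather than the 1 that
truncating division would produce.
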
I am preparing a PR with this series. Now is the time to take a look at it
if you haven't already!
Thanks,
C.