accel/kvm/kvm-all.c | 9 + accel/stubs/kvm-stub.c | 5 + exec.c | 2 +- hw/ppc/spapr_hcall.c | 5 + include/sysemu/kvm.h | 3 + stubs/Makefile.objs | 1 + stubs/kvm-arch-singlestep.c | 14 ++ target/ppc/cpu.h | 21 +++ target/ppc/excp_helper.c | 15 ++ target/ppc/kvm.c | 358 ++++++++++++++++++++++++++++++++++-- target/ppc/kvm_ppc.h | 6 +- 11 files changed, 425 insertions(+), 14 deletions(-) create mode 100644 stubs/kvm-arch-singlestep.c
Hi, This version contains two major changes. A fix for stepping an AIL-changing instruction and a fix for when other breakpoints happen mid-step. The issues with AIL were: ------------------------- 1) I missed the fact that AIL only matters when translation is enabled (i.e., MSR_IR|DR=0 => AIL=0) and, 2) if translation is enabled and we step over the hypercall that sets AIL (H_SET_MODE), the step breakpoint will be at the wrong location when the next Trace interrupt occurs. These are fixed by 1) looking at MSR_IR|DR and choosing the appropriate value for AIL and 2) by moving the breakpoint when AIL is altered (QEMU handles the H_SET_MODE hcall, so this is easy to do). The issue with breakpoints mid-step was: ---------------------------------------- Consider the scenario where the user happens to step into a guest userspace program (e.g. by stepping a 'rfid') and from there steps an instruction that causes an interrupt, such as an 'sc'. Since the 'sc' is not traced (see Power ISA), its handler will execute before we get to see the Trace interrupt. This is generally not an issue because upon return from the syscall handler, the instruction following 'sc' will execute and the Trace will happen then. However, if the user has set a breakpoint inside the system call interrupt handler*, the single step gets interrupted leaving the 0xd00 breakpoint still in place, as well as the SE bit, which is saved and restored later by the syscall handler. Depending on how the user proceeds the debugging in GDB, two issues can happen: - stepi over rfid: If the handler is left by stepping all the way until after 'rfid' we would be left with an extra breakpoint at 0xd00 because the actual step that will bring us out of the interrupt handler would be the step over 'rfid' and not the continuation of the original step over the 'sc' instruction. - continue: If the handler is left by issuing 'continue', the 'rfid' at the end of the handler would eventually restore the MSR it has saved (along with the SE bit) and the next instruction execution would cause a Trace, which we are now not prepared to handle since although the 0xd00 breakpoint of the original step is still there, we are not single stepping anymore (due to the 'continue'). So to avoid these issues I'm introducing the concept of a "pending step" so that we can detect the states mentioned above and take the appropriate measures of avoiding leaving the 0xd00 breakpoint behind and handling the rest of the original step properly, respectively. * the 'sc' is not traced so we would not normally see the system call handler executing, therefore setting a breakpoint inside the handler to debug it is likely. Other points worth noting: -------------------------- The -4 in restore_singlestep_env was indeed wrong. I'm now saving the stepped instruction before the step so that we can access it after the step. This also saves us from doing another cpu_memory_rw_debug to read the instruction later on. Due to the way the threads exit the guest and how GDB is notified of that, it is possible that it sees one vcpu's breakpoint before turning single-stepping off for another. I'm now setting singlestep_enabled = false right after we are done handling the step to avoid that issue. The cpu_memory_rw_debug function reads and writes to the memory "from the guests perspective", taking the partition table into consideration, which for general debugging I assume is the desired behavior, but for reads/writes made by QEMU for QEMU's own purposes, such as the 0xd00 breakpoint, this is not the most appropriate function to use. Changing it, would, however, mean duplicating some of the breakpoint logic from common code. If anyone wants to break this, here is my gdb command line: gdb -q --ex "target remote :1234" \ --ex "break configure_exceptions" \ # this is in early_setup, changes LPCR_AIL, MSR_IR|DR --ex "break program_check_exception" \ # inside an interrupt handler --ex "break *0xc00000000000baa0" \ # right before the 'rfid' in system_call_exit --ex "break *0x100ecb0c" \ # 'sc' inside an userspace program (busybox) --ex "break *0xc00000000000e66c" \ # somewhere in fast_exception_return --ex "break schedule" \ # for testing interleaving of steps/breakpoints --ex "break cmdline_proc_show" \ # just an arbitrary user-triggered breakpoint --ex "break __enter_rtas" \ # changes MSR_IR|DR, alternates endianness** --ex "layout asm" build/vmlinux The command: (gdb) x/i 0xc000000000004d00 should always show "mtsprg 2,r13" except in the middle of a pending step, when it should show ".long 0xdddd00". ** endianness needs to be switched manually with: (gdb) set endian <big|little> --- v5 -> v6: - patch 1: new preliminary patch to help readers understand whether we are returning to GDB or into the guest after handling a KVM_EXIT_DEBUG vm exit; - patch 2: remove kvm_has_guest_debug_singlestep wrapper; rename functions to better indicate their purpose: kvm_arch_has_guest_debug_singlestep -> kvm_arch_can_singlestep kvm_arch_set_singlestep -> kvm_arch_emulate_singlestep g_assert_not_reached when calling kvm_set_singlestep with !kvm_enabled; - patch 3: move instruction helpers that are only used by kvm code into kvm.c; new variable sstep_kind used to determine if guest is doing a single step or if the single step was interrupted and is pending completion; new variable sstep_insn to avoid calling ppc_gdb_read_insn twice; use kvm_arch_can_singlestep instead of cap_ppc_singlestep for functions' early exits; set singlestep_enabled to false after handling the step; AIL fixes: take MSR_IR|DR into consideration when computing the trace_handler_addr; reposition the single step breakpoint when guest calls h_set_mode_resource_addr_trans_mode to change the AIL; trace_handler_addr is now saved during the step; Mid-step breakpoint fixes: if a breakpoint in the vcpu being stepped is reached, mark the step as 'pending'; if an rfid happens during a pending step, do not start a new step, let the original one finish; if a breakpoint at 0xd00 happens during a pending step, handle it as a single step. v4 -> v5: - rebase to v4.2.0-rc5; - use KVM_CAP_PPC_GUEST_DEBUG_SSTEP (#176) which is now in kernel v5.5-rc1: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1a9167a214f https://lists.nongnu.org/archive/html/qemu-devel/2019-12/msg02103.html v3 -> v4: - patch 1: fix uninitialized 'offset' variable; - patch 2: squash with patch 7/7 (now 5/5); fix exception vector offset calculation when AIL == 0; - patch 3: squash with 4/7 (now 2/5); - patch 7: introduce ppc_gdb_get_{op,xop,spr} functions; introduce ppc_gdb_read_insn function; define constants for mfmsr, rfid, mtspr extended opcodes; fix bug where instructions would not be properly recognized in SLOF due to guest endianness being different from host's; pass arch_info->address directly into functions that only need the address; improve indentation by returning early when possible. https://lists.nongnu.org/archive/html/qemu-devel/2019-02/msg07994.html v2 -> v3: - take Alternate Interrupt Location (AIL) into consideration when calculating the Trace Interrupt handler address (this allows single stepping in SLOF code); - check for a new KVM_GUEST_DEBUG_SSTEP capability (still to be submitted to kernel ml); - handle other vcpus (not currently stepping) hitting the single step breakpoint - by ignoring the breakpoint; - handle simultaneous single step by GDB inside guest - by first performing our step into the trace interrupt handler itself and returning to the guest afterwards; - handle single stepping when at the first trace interrupt handler instruction - by displacing the breakpoint to the next instruction; - restore MSR, SRR0, SRR1 after the step, taking into consideration possible mtspr, mtmsr instructions; - use stubs for arch-specific code that will not be implemented by other architectures at this point; https://lists.gnu.org/archive/html/qemu-devel/2019-01/msg04627.html v1 -> v2: - split in more patches to facilitate review - use extract32 for decoding instruction instead of open-coding - add more people to CC https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg04269.html v1: https://lists.gnu.org/archive/html/qemu-devel/2018-11/msg03738.html Fabiano Rosas (3): target/ppc: Clarify the meaning of return values in kvm_handle_debug kvm-all: Introduce kvm_set_singlestep target/ppc: support single stepping with KVM HV accel/kvm/kvm-all.c | 9 + accel/stubs/kvm-stub.c | 5 + exec.c | 2 +- hw/ppc/spapr_hcall.c | 5 + include/sysemu/kvm.h | 3 + stubs/Makefile.objs | 1 + stubs/kvm-arch-singlestep.c | 14 ++ target/ppc/cpu.h | 21 +++ target/ppc/excp_helper.c | 15 ++ target/ppc/kvm.c | 358 ++++++++++++++++++++++++++++++++++-- target/ppc/kvm_ppc.h | 6 +- 11 files changed, 425 insertions(+), 14 deletions(-) create mode 100644 stubs/kvm-arch-singlestep.c -- 2.23.0
On Fri, 2020-01-10 at 12:13 -0300, Fabiano Rosas wrote: > Hi, > > This version contains two major changes. A fix for stepping an > AIL-changing instruction and a fix for when other breakpoints happen > mid-step. > I have being using this patchset for almost an year now. It really makes debugging guest kernel in Power a much better experience. I have compiled the last version on monday and used it for running every VM I needed to since then. Two of them included debugging. No issue found. FWIW, Tested-by: Leonardo Bras <leonardo@ibm.com>
© 2016 - 2024 Red Hat, Inc.