PTRACE_SETREGSET fails when setting a hardware breakpoint on a
non-4-byte aligned address with a valid length to a 32-bit tracee. The
length should be valid as long as the range started from the address
is within one aligned 4 bytes.
The cause is that hw_break_set() modifies a breakpoint's addr
first and then ctrl. This calls modify_user_hw_breakpoint() twice,
although once registering both suffices. The first modification causes
errors because new addr and old ctrl can be an invalid combination at
hw_breakpoint_arch_parse(). For example, when a user sets a hardware
breakpoint with addr=0x2 and ctrl.len=1, hw_breakpoint_arch_parse()
will first see addr=0x2 and ctrl.len=4 (the default) and return
-EINVAL. On the other hand, if a user sets the same value to
a breakpoint whose ctrl.len has previously been set to 1 or 2,
it succeeds.
The fix is to set addr and ctrl in one modify_user_hw_breakpoint(),
effectively eliminating the discrepancy in validation.
Signed-off-by: Bill Tsui <b10902118@ntu.edu.tw>
---
arch/arm64/kernel/ptrace.c | 34 +++++++++++++++++++++++++++++-----
1 file changed, 29 insertions(+), 5 deletions(-)
diff --git a/arch/arm64/kernel/ptrace.c b/arch/arm64/kernel/ptrace.c
index 4b001121c..73c67f743 100644
--- a/arch/arm64/kernel/ptrace.c
+++ b/arch/arm64/kernel/ptrace.c
@@ -467,6 +467,32 @@ static int ptrace_hbp_set_addr(unsigned int note_type,
return err;
}
+/* Set the address and control together for non-compat ptrace */
+static int ptrace_hbp_set(unsigned int note_type, struct task_struct *tsk,
+ unsigned long idx, u64 addr, u32 uctrl)
+{
+ int err;
+ struct perf_event *bp;
+ struct perf_event_attr attr;
+ struct arch_hw_breakpoint_ctrl ctrl;
+
+ bp = ptrace_hbp_get_initialised_bp(note_type, tsk, idx);
+ if (IS_ERR(bp)) {
+ err = PTR_ERR(bp);
+ return err;
+ }
+
+ attr = bp->attr;
+ attr.bp_addr = addr;
+
+ decode_ctrl_reg(uctrl, &ctrl);
+ err = ptrace_hbp_fill_attr_ctrl(note_type, ctrl, &attr);
+ if (err)
+ return err;
+
+ return modify_user_hw_breakpoint(bp, &attr);
+}
+
#define PTRACE_HBP_ADDR_SZ sizeof(u64)
#define PTRACE_HBP_CTRL_SZ sizeof(u32)
#define PTRACE_HBP_PAD_SZ sizeof(u32)
@@ -524,9 +550,6 @@ static int hw_break_set(struct task_struct *target,
return -EINVAL;
ret = user_regset_copyin(&pos, &count, &kbuf, &ubuf, &addr,
offset, offset + PTRACE_HBP_ADDR_SZ);
- if (ret)
- return ret;
- ret = ptrace_hbp_set_addr(note_type, target, idx, addr);
if (ret)
return ret;
offset += PTRACE_HBP_ADDR_SZ;
@@ -537,10 +560,11 @@ static int hw_break_set(struct task_struct *target,
offset, offset + PTRACE_HBP_CTRL_SZ);
if (ret)
return ret;
- ret = ptrace_hbp_set_ctrl(note_type, target, idx, ctrl);
+ offset += PTRACE_HBP_CTRL_SZ;
+
+ ret = ptrace_hbp_set(note_type, target, idx, addr, ctrl);
if (ret)
return ret;
- offset += PTRACE_HBP_CTRL_SZ;
user_regset_copyin_ignore(&pos, &count, &kbuf, &ubuf,
offset, offset + PTRACE_HBP_PAD_SZ);
--
2.50.1
On Wed, Aug 27, 2025 at 09:41:11AM +0800, Bill Tsui wrote: > PTRACE_SETREGSET fails when setting a hardware breakpoint on a > non-4-byte aligned address with a valid length to a 32-bit tracee. The > length should be valid as long as the range started from the address > is within one aligned 4 bytes. > > The cause is that hw_break_set() modifies a breakpoint's addr > first and then ctrl. This calls modify_user_hw_breakpoint() twice, > although once registering both suffices. The first modification causes > errors because new addr and old ctrl can be an invalid combination at > hw_breakpoint_arch_parse(). For example, when a user sets a hardware > breakpoint with addr=0x2 and ctrl.len=1, hw_breakpoint_arch_parse() > will first see addr=0x2 and ctrl.len=4 (the default) and return > -EINVAL. On the other hand, if a user sets the same value to > a breakpoint whose ctrl.len has previously been set to 1 or 2, > it succeeds. > > The fix is to set addr and ctrl in one modify_user_hw_breakpoint(), > effectively eliminating the discrepancy in validation. > > Signed-off-by: Bill Tsui <b10902118@ntu.edu.tw> > --- > arch/arm64/kernel/ptrace.c | 34 +++++++++++++++++++++++++++++----- > 1 file changed, 29 insertions(+), 5 deletions(-) Given that: (a) This is a pretty niche interface (primarily/exclusively (?) used by GDB) (b) It's been like this for a long time (c) Userspace can work around the problem I'm not sure I see the benefit of trying to handle this differently in the kernel. If somebody _does_ have the time and energy for significant surgery on this code, then the best thing to do would be to remove the indirection through 'perf_events' altogether. I did make a start on that once but it's a thankless job and I got preempted by other stuff. Will
On 2025-09-08 23:14, Will Deacon wrote: > On Wed, Aug 27, 2025 at 09:41:11AM +0800, Bill Tsui wrote: >> PTRACE_SETREGSET fails when setting a hardware breakpoint on a >> non-4-byte aligned address with a valid length to a 32-bit tracee. The >> length should be valid as long as the range started from the address >> is within one aligned 4 bytes. >> >> The cause is that hw_break_set() modifies a breakpoint's addr >> first and then ctrl. This calls modify_user_hw_breakpoint() twice, >> although once registering both suffices. The first modification causes >> errors because new addr and old ctrl can be an invalid combination at >> hw_breakpoint_arch_parse(). For example, when a user sets a hardware >> breakpoint with addr=0x2 and ctrl.len=1, hw_breakpoint_arch_parse() >> will first see addr=0x2 and ctrl.len=4 (the default) and return >> -EINVAL. On the other hand, if a user sets the same value to >> a breakpoint whose ctrl.len has previously been set to 1 or 2, >> it succeeds. >> >> The fix is to set addr and ctrl in one modify_user_hw_breakpoint(), >> effectively eliminating the discrepancy in validation. >> >> Signed-off-by: Bill Tsui <b10902118@ntu.edu.tw> >> --- >> arch/arm64/kernel/ptrace.c | 34 +++++++++++++++++++++++++++++----- >> 1 file changed, 29 insertions(+), 5 deletions(-) > > Given that: > > (a) This is a pretty niche interface (primarily/exclusively (?) used > by GDB) > (b) It's been like this for a long time > (c) Userspace can work around the problem > > I'm not sure I see the benefit of trying to handle this differently > in the kernel. > > If somebody _does_ have the time and energy for significant surgery > on this code, then the best thing to do would be to remove the > indirection through 'perf_events' altogether. I did make a start on > that once but it's a thankless job and I got preempted by other stuff. > > Will (Fixed email name with real name. Sorry again.) Thanks for the reply. I did find this bug when helping test lldb. The primary problem is that this forbids hardware breakpoints for Thumb at addresses only divisible by 2 (not 4). So half of the instructions cannot have a hardware breakpoint on. This cannot be worked around. The first patch should be accepted, at least. The arm64 case can be completely fixed by simply merging modifications in one call, making the code reasonable and correct. I often debug 32bit executables on arm64, for CTF challenges and my hobby. I get extremely confused when some breakpoints just cannot be set. This is also why I worked on lldb and finally traced to the kernel. As a student, I am willing to take the task of moving ptrace out of 'perf_events' as long as someone can review my patches. Thanks Bill
The following can reproduce the error for `arm64 compat` watchpoints (breakpoints follow the same logic). I realized that testing my patch can take time to set up from scratch, hoping this can facilitate the review process. There are three parts: 1. watch_unaligned.c 2. tracee.c 3. compile & run Basically, watch_unaligned runs tracee and set 1-byte watchpoints on it. 1. watch_unaligned.c #include <stdio.h> #include <stdlib.h> #include <stdint.h> #include <string.h> #include <errno.h> #include <sys/ptrace.h> #include <sys/types.h> #include <sys/wait.h> #include <unistd.h> #include <linux/ptrace.h> #include <asm/ptrace.h> #include <sys/uio.h> #include <elf.h> #include <sys/uio.h> #include <asm/ptrace.h> #define MAX_WATCHPOINTS 4 #define AARCH64_BREAKPOINT_EL0 2 #define ENABLE 1 typedef enum { arm_hwbp_break = 0, arm_hwbp_load = 1, arm_hwbp_store = 2, arm_hwbp_access = 3 } arm_hwbp_type; void error_detach_close(pid_t pid, FILE* pipe, const char* msg) { perror(msg); ptrace(PTRACE_DETACH, pid, NULL, NULL); pclose(pipe); } void watch_1byte(int pid, uint32_t addr) { // Set hardware watchpoint on ARM64 // Use PTRACE_SETREGSET with NT_ARM_HW_WATCH for watchpoints printf("Setting watchpoint at address 0x%x\n", addr); struct user_hwdebug_state watch; memset(&watch, 0, sizeof(watch)); watch.dbg_regs[0].addr = addr; uint32_t size = 1; uint32_t byte_mask = (1u << size) - 1u; uint32_t type = arm_hwbp_access; uint32_t priv = AARCH64_BREAKPOINT_EL0; watch.dbg_regs[0].ctrl = (byte_mask << 5) | (type << 3) | (priv << 1) | ENABLE; struct iovec ioVec; memset(&ioVec, 0, sizeof(ioVec)); ioVec.iov_base = &watch; // According to lldb. Otherwise dbg_regs[16] will cause error ioVec.iov_len = sizeof(watch.dbg_info) + sizeof(watch.pad) + (sizeof(watch.dbg_regs[0]) * MAX_WATCHPOINTS); if (ptrace(PTRACE_SETREGSET, pid, NT_ARM_HW_WATCH, &ioVec) == -1) { perror("PTRACE_SETREGSET"); } else { printf("Watchpoint at address 0x%x set successfully\n", addr); } printf("---\n"); } FILE* run_tracee(pid_t *pid, uint32_t *addr) { // run tracee and get pid & addr FILE *pipe = popen("./tracee", "r"); if (!pipe) { perror("popen failed"); return NULL; } char line[64]; if (fgets(line, sizeof(line), pipe) == NULL) { perror("read pid failed"); } *pid = atoi(line); if (fgets(line, sizeof(line), pipe) == NULL) { perror("read addr failed"); } *addr = strtoul(line, NULL, 0); return pipe; } int main() { pid_t pid; uint32_t addr; FILE *pipe = run_tracee(&pid, &addr); if (!pipe) { return 1; } printf("Attaching to PID %d\n", pid); if (ptrace(PTRACE_ATTACH, pid, NULL, NULL) == -1) { error_detach_close(pid, pipe, "ptrace(ATTACH)"); return 1; } printf("Attached\n\n"); int status = 0; if (waitpid(pid, &status, 0) == -1) { error_detach_close(pid, pipe, "waitpid"); return 1; } // The TEST // watch non-4-byte aligned. (Fail) watch_1byte(pid, addr); // watch any 4-byte aligned. (Success) watch_1byte(pid, 0); // watch non-4-byte aligned again. (Success) // because bp_len was previously set to 2 successfully watch_1byte(pid, addr); // continue and let it trigger, proving functionality if (ptrace(PTRACE_CONT, pid, NULL, NULL) == -1) { error_detach_close(pid, pipe,"ptrace(CONT)"); return 1; } if (waitpid(pid, &status, 0) == -1) { error_detach_close(pid, pipe, "waitpid (after CONT)"); pclose(pipe); return 1; } if (WIFSTOPPED(status)) { printf("Watchpoint triggered, signal: %d\n", WSTOPSIG(status)); ptrace(PTRACE_CONT, pid, NULL, NULL); ptrace(PTRACE_DETACH, pid, NULL, NULL); } pclose(pipe); return 0; } 2. tracee.c #include <stdio.h> #include <unistd.h> #include <stdint.h> int main() { volatile int a = 42; printf("%d\n%p\n", getpid(), (void*)((uintptr_t)&a + 1)); fflush(stdout); while (1) { a += 1; } return 0; } 3. compile & run # 64-bit tracer aarch64-linux-gnu-gcc watch_unaligned.c -o watch_unaligned # 32-bit tracee arm-linux-gnueabi-gcc tracee.c -o tracee # run ./watch_unaligned output: Attaching to PID 325 Attached Setting watchpoint at address 0xffacca39 PTRACE_SETREGSET: Invalid argument --- Setting watchpoint at address 0x0 Watchpoint at address 0x0 set successfully --- Setting watchpoint at address 0xffacca39 Watchpoint at address 0xffacca39 set successfully --- Watchpoint triggered, signal: 5 Summary: the first watchpoint failed but should be supported since it was successfully set and triggered just after addr 0x0 with length 1. Thanks, Bill
On 2025-09-08 23:14, Will Deacon wrote: > On Wed, Aug 27, 2025 at 09:41:11AM +0800, Bill Tsui wrote: >> PTRACE_SETREGSET fails when setting a hardware breakpoint on a >> non-4-byte aligned address with a valid length to a 32-bit tracee. The >> length should be valid as long as the range started from the address >> is within one aligned 4 bytes. >> >> The cause is that hw_break_set() modifies a breakpoint's addr >> first and then ctrl. This calls modify_user_hw_breakpoint() twice, >> although once registering both suffices. The first modification causes >> errors because new addr and old ctrl can be an invalid combination at >> hw_breakpoint_arch_parse(). For example, when a user sets a hardware >> breakpoint with addr=0x2 and ctrl.len=1, hw_breakpoint_arch_parse() >> will first see addr=0x2 and ctrl.len=4 (the default) and return >> -EINVAL. On the other hand, if a user sets the same value to >> a breakpoint whose ctrl.len has previously been set to 1 or 2, >> it succeeds. >> >> The fix is to set addr and ctrl in one modify_user_hw_breakpoint(), >> effectively eliminating the discrepancy in validation. >> >> Signed-off-by: Bill Tsui <b10902118@ntu.edu.tw> >> --- >> arch/arm64/kernel/ptrace.c | 34 +++++++++++++++++++++++++++++----- >> 1 file changed, 29 insertions(+), 5 deletions(-) > > Given that: > > (a) This is a pretty niche interface (primarily/exclusively (?) used > by GDB) > (b) It's been like this for a long time > (c) Userspace can work around the problem > > I'm not sure I see the benefit of trying to handle this differently > in the kernel. > > If somebody _does_ have the time and energy for significant surgery > on this code, then the best thing to do would be to remove the > indirection through 'perf_events' altogether. I did make a start on > that once but it's a thankless job and I got preempted by other stuff. > > Will (This is the plain text version. Sorry for sending html in the previous mail) Thanks for the reply. I did find this bug when helping test lldb. The primary problem is that this forbids hardware breakpoints for Thumb at addresses only divisible by 2 (not 4). So half of the instructions cannot have a hardware breakpoint on. This cannot be worked around. The first patch should be accepted, at least. The arm64 case can be completely fixed by simply merging modifications in one call, making the code reasonable and correct. I often debug 32bit executables on arm64, for CTF challenges and my hobby. I get extremely confused when some breakpoints just cannot be set. This is also why I worked on lldb and finally traced to the kernel. As a student, I am willing to take the task of moving ptrace out of 'perf_events' as long as someone can review my patches. Thanks Bill
© 2016 - 2025 Red Hat, Inc.