[PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints

Balbir Singh posted 1 patch 2 months, 1 week ago
[PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints
Posted by Balbir Singh 2 months, 1 week ago
x86 and riscv provide tracepoints for page faults (separate user and
kernel tracepoints). Some scripts [1],[2] rely on these tracepoints.
They are useful for tracking faults and their reasons.

Adding the tracepoints is simple and straightforward. For arm64, use
esr as the error code and the untagged memory address as addr.
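
To illustrate what a consumer of these events would see in the error
code on arm64, here is a minimal user space sketch that decodes an ESR
value. The field positions (EC in bits [31:26], CM in bit 8, WnR in
bit 6, fault status code in bits [5:0]) follow the Arm architecture.
The names decode_esr, struct esr_info and enum fault_kind are
hypothetical; they are not part of this patch or of the kernel.

```c
#include <stdbool.h>
#include <stdint.h>

/* ESR_ELx field positions, per the Arm architecture. */
#define ESR_EC_SHIFT	26
#define ESR_EC_MASK	0x3fULL
#define ESR_WNR		(1ULL << 6)	/* write, not read (data aborts only) */
#define ESR_CM		(1ULL << 8)	/* cache maintenance operation */
#define ESR_FSC_TYPE	0x3cULL		/* fault type with the level bits masked */

/* Exception classes that describe instruction and data aborts. */
#define EC_IABT_LOWER	0x20
#define EC_IABT_SAME	0x21
#define EC_DABT_LOWER	0x24
#define EC_DABT_SAME	0x25

enum fault_kind {
	FAULT_OTHER,		/* anything this sketch does not classify */
	FAULT_TRANSLATION,	/* no valid mapping for the address */
	FAULT_ACCESS_FLAG,
	FAULT_PERMISSION,
};

struct esr_info {
	unsigned int ec;	/* exception class */
	bool is_abort;		/* instruction or data abort */
	bool is_write;		/* data abort caused by a write */
	enum fault_kind kind;
};

/* Hypothetical helper: split an ESR value into the parts a script might use. */
struct esr_info decode_esr(uint64_t esr)
{
	struct esr_info info = {
		.ec = (unsigned int)((esr >> ESR_EC_SHIFT) & ESR_EC_MASK),
		.kind = FAULT_OTHER,
	};
	bool is_dabt = info.ec == EC_DABT_LOWER || info.ec == EC_DABT_SAME;
	bool is_iabt = info.ec == EC_IABT_LOWER || info.ec == EC_IABT_SAME;

	info.is_abort = is_dabt || is_iabt;
	if (!info.is_abort)
		return info;

	/* WnR only applies to data aborts; cache maintenance is not a write. */
	info.is_write = is_dabt && (esr & ESR_WNR) && !(esr & ESR_CM);

	switch (esr & ESR_FSC_TYPE) {
	case 0x04:
		info.kind = FAULT_TRANSLATION;
		break;
	case 0x08:
		info.kind = FAULT_ACCESS_FLAG;
		break;
	case 0x0c:
		info.kind = FAULT_PERMISSION;
		break;
	}
	return info;
}
```

For example, 0x92000047 decodes as a data abort from EL0 (EC 0x24)
caused by a write that hit a level 3 translation fault. Unlike the x86
error code, the value is not a small set of flag bits, so scripts that
test individual x86 bits would need an arm64-specific decode like this.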

Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: Paul Walmsley <pjw@kernel.org>
Cc: Palmer Dabbelt <palmer@dabbelt.com>
Cc: Albert Ou <aou@eecs.berkeley.edu>
Cc: Alexandre Ghiti <alex@ghiti.fr>

[1] https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
[2] https://taras.glek.net/posts/ebpf-mmap-page-fault-tracing/
Signed-off-by: Balbir Singh <balbirs@nvidia.com>
---

Tested at my end with a kernel compile and by running a user space
program to check that the tracepoints are indeed reported.

 arch/arm64/mm/fault.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index d816ff44faff..9d7b86e92434 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -44,6 +44,9 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/exceptions.h>
+
 struct fault_info {
 	int	(*fn)(unsigned long far, unsigned long esr,
 		      struct pt_regs *regs);
@@ -572,8 +575,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	if (faulthandler_disabled() || !mm)
 		goto no_context;
 
-	if (user_mode(regs))
+	if (user_mode(regs)) {
 		mm_flags |= FAULT_FLAG_USER;
+		trace_page_fault_user(addr, regs, esr);
+	} else {
+		trace_page_fault_kernel(addr, regs, esr);
+	}
 
 	/*
 	 * vm_flags tells us what bits we must have in vma->vm_flags
-- 
2.51.0
Re: [PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints
Posted by Catalin Marinas 1 month, 2 weeks ago
On Mon, Oct 13, 2025 at 02:55:32PM +1100, Balbir Singh wrote:
> x86 and riscv provide trace points for page-faults (user and kernel
> tracepoints). Some scripts [1],[2] rely on these trace points. The
> tracepoint is useful for tracking faults and their reasons.
> 
> Adding the tracepoints is simple and straight-forward. For arm64
> use esr as error code and untagged memory address as addr.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Paul Walmsley <pjw@kernel.org>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: Alexandre Ghiti <alex@ghiti.fr>
> 
> [1] https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
> [2] https://taras.glek.net/posts/ebpf-mmap-page-fault-tracing/
> Signed-off-by: Balbir Singh <balbirs@nvidia.com>
> ---
> 
> Tested at my end with a kernel-compile and running a user space
> program to check that the trace points are indeed reported.
> 
>  arch/arm64/mm/fault.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..9d7b86e92434 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -44,6 +44,9 @@
>  #include <asm/tlbflush.h>
>  #include <asm/traps.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/exceptions.h>
> +
>  struct fault_info {
>  	int	(*fn)(unsigned long far, unsigned long esr,
>  		      struct pt_regs *regs);
> @@ -572,8 +575,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	if (faulthandler_disabled() || !mm)
>  		goto no_context;
>  
> -	if (user_mode(regs))
> +	if (user_mode(regs)) {
>  		mm_flags |= FAULT_FLAG_USER;
> +		trace_page_fault_user(addr, regs, esr);
> +	} else {
> +		trace_page_fault_kernel(addr, regs, esr);
> +	}

This has come up before and was rejected:

https://lore.kernel.org/all/aG0aIKzxApp9j7X0@willie-the-truck/

-- 
Catalin
Re: [PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints
Posted by Balbir Singh 1 month, 2 weeks ago
On 11/4/25 04:26, Catalin Marinas wrote:
> On Mon, Oct 13, 2025 at 02:55:32PM +1100, Balbir Singh wrote:
>> x86 and riscv provide trace points for page-faults (user and kernel
>> tracepoints). Some scripts [1],[2] rely on these trace points. The
>> tracepoint is useful for tracking faults and their reasons.
>>
>> Adding the tracepoints is simple and straight-forward. For arm64
>> use esr as error code and untagged memory address as addr.
>>
>> Cc: Catalin Marinas <catalin.marinas@arm.com>
>> Cc: Will Deacon <will@kernel.org>
>> Cc: Paul Walmsley <pjw@kernel.org>
>> Cc: Palmer Dabbelt <palmer@dabbelt.com>
>> Cc: Albert Ou <aou@eecs.berkeley.edu>
>> Cc: Alexandre Ghiti <alex@ghiti.fr>
>>
>> [1] https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
>> [2] https://taras.glek.net/posts/ebpf-mmap-page-fault-tracing/
>> Signed-off-by: Balbir Singh <balbirs@nvidia.com>
>> ---
>>
>> Tested at my end with a kernel-compile and running a user space
>> program to check that the trace points are indeed reported.
>>
>>  arch/arm64/mm/fault.c | 9 ++++++++-
>>  1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
>> index d816ff44faff..9d7b86e92434 100644
>> --- a/arch/arm64/mm/fault.c
>> +++ b/arch/arm64/mm/fault.c
>> @@ -44,6 +44,9 @@
>>  #include <asm/tlbflush.h>
>>  #include <asm/traps.h>
>>  
>> +#define CREATE_TRACE_POINTS
>> +#include <trace/events/exceptions.h>
>> +
>>  struct fault_info {
>>  	int	(*fn)(unsigned long far, unsigned long esr,
>>  		      struct pt_regs *regs);
>> @@ -572,8 +575,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>>  	if (faulthandler_disabled() || !mm)
>>  		goto no_context;
>>  
>> -	if (user_mode(regs))
>> +	if (user_mode(regs)) {
>>  		mm_flags |= FAULT_FLAG_USER;
>> +		trace_page_fault_user(addr, regs, esr);
>> +	} else {
>> +		trace_page_fault_kernel(addr, regs, esr);
>> +	}
> 
> This has come up before and was rejected:
> 
> https://lore.kernel.org/all/aG0aIKzxApp9j7X0@willie-the-truck/
> 


Thanks for the pointer. It has been five to six months since that discussion, and I
don't see that the kprobe handler has been merged with the tracepoint. The real issue is
that, while that gets sorted out, some scripts are broken by default on arm64: see [1] and
[2] above, and a simple search for exceptions:page_fault will turn up many more. It is hard
to have all of those break and then fix them one by one as and when needed.

Can we please have this fixed, so that tracepoint scripts can work on arm64?

Balbir
Re: [PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints
Posted by Catalin Marinas 1 month, 1 week ago
On Wed, Nov 05, 2025 at 11:27:18AM +1100, Balbir Singh wrote:
> On 11/4/25 04:26, Catalin Marinas wrote:
> > On Mon, Oct 13, 2025 at 02:55:32PM +1100, Balbir Singh wrote:
> >> x86 and riscv provide trace points for page-faults (user and kernel
> >> tracepoints). Some scripts [1],[2] rely on these trace points. The
> >> tracepoint is useful for tracking faults and their reasons.
> >>
> >> Adding the tracepoints is simple and straight-forward. For arm64
> >> use esr as error code and untagged memory address as addr.
> >>
> >> Cc: Catalin Marinas <catalin.marinas@arm.com>
> >> Cc: Will Deacon <will@kernel.org>
> >> Cc: Paul Walmsley <pjw@kernel.org>
> >> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> >> Cc: Albert Ou <aou@eecs.berkeley.edu>
> >> Cc: Alexandre Ghiti <alex@ghiti.fr>
> >>
> >> [1] https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
> >> [2] https://taras.glek.net/posts/ebpf-mmap-page-fault-tracing/
> >> Signed-off-by: Balbir Singh <balbirs@nvidia.com>
> >> ---
> >>
> >> Tested at my end with a kernel-compile and running a user space
> >> program to check that the trace points are indeed reported.
> >>
> >>  arch/arm64/mm/fault.c | 9 ++++++++-
> >>  1 file changed, 8 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> >> index d816ff44faff..9d7b86e92434 100644
> >> --- a/arch/arm64/mm/fault.c
> >> +++ b/arch/arm64/mm/fault.c
> >> @@ -44,6 +44,9 @@
> >>  #include <asm/tlbflush.h>
> >>  #include <asm/traps.h>
> >>  
> >> +#define CREATE_TRACE_POINTS
> >> +#include <trace/events/exceptions.h>
> >> +
> >>  struct fault_info {
> >>  	int	(*fn)(unsigned long far, unsigned long esr,
> >>  		      struct pt_regs *regs);
> >> @@ -572,8 +575,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> >>  	if (faulthandler_disabled() || !mm)
> >>  		goto no_context;
> >>  
> >> -	if (user_mode(regs))
> >> +	if (user_mode(regs)) {
> >>  		mm_flags |= FAULT_FLAG_USER;
> >> +		trace_page_fault_user(addr, regs, esr);
> >> +	} else {
> >> +		trace_page_fault_kernel(addr, regs, esr);
> >> +	}
> > 
> > This has come up before and was rejected:
> > 
> > https://lore.kernel.org/all/aG0aIKzxApp9j7X0@willie-the-truck/
> 
> Thanks for the pointer. It has been five to six months since that
> discussion, and I don't see that the kprobe handler has been merged
> with the tracepoint.

I was hinting that whoever needs these tracepoints can do the work ;).

> The real issue is that, while that gets sorted out, some scripts are
> broken by default on arm64: see [1] and [2] above, and a simple search
> for exceptions:page_fault will turn up many more. It is hard to have
> all of those break and then fix them one by one as and when needed.

Does Steve's proposal in the earlier discussion help with fixing these
scripts?

https://lore.kernel.org/all/20250519120837.794f6738@batman.local.home/

-- 
Catalin
Re: [PATCH] arch/arm64/mm/fault: Implement exceptions tracepoints
Posted by Anshuman Khandual 2 months, 1 week ago
On 13/10/25 9:25 AM, Balbir Singh wrote:
> x86 and riscv provide trace points for page-faults (user and kernel
> tracepoints). Some scripts [1],[2] rely on these trace points. The
> tracepoint is useful for tracking faults and their reasons.

Agreed.
> 
> Adding the tracepoints is simple and straight-forward. For arm64
> use esr as error code and untagged memory address as addr.

Providing the entire esr register value makes sense.
> 
> Cc: Catalin Marinas <catalin.marinas@arm.com>
> Cc: Will Deacon <will@kernel.org>
> Cc: Paul Walmsley <pjw@kernel.org>
> Cc: Palmer Dabbelt <palmer@dabbelt.com>
> Cc: Albert Ou <aou@eecs.berkeley.edu>
> Cc: Alexandre Ghiti <alex@ghiti.fr>
> 
> [1] https://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
> [2] https://taras.glek.net/posts/ebpf-mmap-page-fault-tracing/
> Signed-off-by: Balbir Singh <balbirs@nvidia.com>
> ---
> 
> Tested at my end with a kernel-compile and running a user space
> program to check that the trace points are indeed reported.
> 
>  arch/arm64/mm/fault.c | 9 ++++++++-
>  1 file changed, 8 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index d816ff44faff..9d7b86e92434 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -44,6 +44,9 @@
>  #include <asm/tlbflush.h>
>  #include <asm/traps.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/exceptions.h>
> +
>  struct fault_info {
>  	int	(*fn)(unsigned long far, unsigned long esr,
>  		      struct pt_regs *regs);
> @@ -572,8 +575,12 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	if (faulthandler_disabled() || !mm)
>  		goto no_context;
>  
> -	if (user_mode(regs))
> +	if (user_mode(regs)) {
>  		mm_flags |= FAULT_FLAG_USER;
> +		trace_page_fault_user(addr, regs, esr);
> +	} else {
> +		trace_page_fault_kernel(addr, regs, esr);
> +	}
>  
>  	/*
>  	 * vm_flags tells us what bits we must have in vma->vm_flags

LGTM

Reviewed-by: Anshuman Khandual <anshuman.khandual@arm.com>