[PATCH v8 17/22] arm64: mm: Add page fault trace points

Nam Cao posted 22 patches 9 months ago
There is a newer version of this series
Posted by Nam Cao 9 months ago
Add page fault trace points, which are useful for implementing an RV
monitor that watches page faults.

Signed-off-by: Nam Cao <namcao@linutronix.de>
---
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
---
 arch/arm64/mm/fault.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index ec0a337891dd..55094030e377 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -44,6 +44,9 @@
 #include <asm/tlbflush.h>
 #include <asm/traps.h>
 
+#define CREATE_TRACE_POINTS
+#include <trace/events/exceptions.h>
+
 struct fault_info {
 	int	(*fn)(unsigned long far, unsigned long esr,
 		      struct pt_regs *regs);
@@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
 	if (kprobe_page_fault(regs, esr))
 		return 0;
 
+	if (user_mode(regs))
+		trace_page_fault_user(addr, regs, esr);
+	else
+		trace_page_fault_kernel(addr, regs, esr);
+
 	/*
 	 * If we're in an interrupt or have no user context, we must not take
 	 * the fault.
-- 
2.39.5
Re: [PATCH v8 17/22] arm64: mm: Add page fault trace points
Posted by Catalin Marinas 8 months, 3 weeks ago
On Mon, May 12, 2025 at 12:51:00PM +0200, Nam Cao wrote:
> diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> index ec0a337891dd..55094030e377 100644
> --- a/arch/arm64/mm/fault.c
> +++ b/arch/arm64/mm/fault.c
> @@ -44,6 +44,9 @@
>  #include <asm/tlbflush.h>
>  #include <asm/traps.h>
>  
> +#define CREATE_TRACE_POINTS
> +#include <trace/events/exceptions.h>
> +
>  struct fault_info {
>  	int	(*fn)(unsigned long far, unsigned long esr,
>  		      struct pt_regs *regs);
> @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
>  	if (kprobe_page_fault(regs, esr))
>  		return 0;
>  
> +	if (user_mode(regs))
> +		trace_page_fault_user(addr, regs, esr);
> +	else
> +		trace_page_fault_kernel(addr, regs, esr);

What are the semantics for these tracepoints? When are they supposed to
be called? In the RV context I guess you only care about the
benign, recoverable faults that would affect timing. These tracepoints
were generalised from the x86 code but I don't know enough about it to
tell when they would be invoked.

For arm64, we also have the do_translation_fault() path for example that
may or may not need to log such trace events.

-- 
Catalin
Re: [PATCH v8 17/22] arm64: mm: Add page fault trace points
Posted by Nam Cao 8 months, 3 weeks ago
On Mon, May 19, 2025 at 03:49:29PM +0100, Catalin Marinas wrote:
> On Mon, May 12, 2025 at 12:51:00PM +0200, Nam Cao wrote:
> > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > index ec0a337891dd..55094030e377 100644
> > --- a/arch/arm64/mm/fault.c
> > +++ b/arch/arm64/mm/fault.c
> > @@ -44,6 +44,9 @@
> >  #include <asm/tlbflush.h>
> >  #include <asm/traps.h>
> >  
> > +#define CREATE_TRACE_POINTS
> > +#include <trace/events/exceptions.h>
> > +
> >  struct fault_info {
> >  	int	(*fn)(unsigned long far, unsigned long esr,
> >  		      struct pt_regs *regs);
> > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> >  	if (kprobe_page_fault(regs, esr))
> >  		return 0;
> >  
> > +	if (user_mode(regs))
> > +		trace_page_fault_user(addr, regs, esr);
> > +	else
> > +		trace_page_fault_kernel(addr, regs, esr);
> 
> What are the semantics for these tracepoints? When are they supposed to
> be called? In the RV context I guess you only care about the
> benign, recoverable faults that would affect timing. These tracepoints
> were generalised from the x86 code but I don't know enough about it to
> tell when they would be invoked.
> 
> For arm64, we also have the do_translation_fault() path for example that
> may or may not need to log such trace events.

These tracepoints are invoked for x86 page fault exceptions. Are arm64's
translation faults considered equivalent to x86 page faults?

Best regards,
Nam
Re: [PATCH v8 17/22] arm64: mm: Add page fault trace points
Posted by Catalin Marinas 8 months, 3 weeks ago
On Tue, May 20, 2025 at 02:25:48PM +0200, Nam Cao wrote:
> On Mon, May 19, 2025 at 03:49:29PM +0100, Catalin Marinas wrote:
> > On Mon, May 12, 2025 at 12:51:00PM +0200, Nam Cao wrote:
> > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > index ec0a337891dd..55094030e377 100644
> > > --- a/arch/arm64/mm/fault.c
> > > +++ b/arch/arm64/mm/fault.c
> > > @@ -44,6 +44,9 @@
> > >  #include <asm/tlbflush.h>
> > >  #include <asm/traps.h>
> > >  
> > > +#define CREATE_TRACE_POINTS
> > > +#include <trace/events/exceptions.h>
> > > +
> > >  struct fault_info {
> > >  	int	(*fn)(unsigned long far, unsigned long esr,
> > >  		      struct pt_regs *regs);
> > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > >  	if (kprobe_page_fault(regs, esr))
> > >  		return 0;
> > >  
> > > +	if (user_mode(regs))
> > > +		trace_page_fault_user(addr, regs, esr);
> > > +	else
> > > +		trace_page_fault_kernel(addr, regs, esr);
> > 
> > What are the semantics for these tracepoints? When are they supposed to
> > be called? In the RV context I guess you only care about the
> > benign, recoverable faults that would affect timing. These tracepoints
> > were generalised from the x86 code but I don't know enough about it to
> > tell when they would be invoked.
> > 
> > For arm64, we also have the do_translation_fault() path for example that
> > may or may not need to log such trace events.
> 
> These tracepoints are invoked for x86 page fault exceptions. Are arm64's
> translation faults considered equivalent to x86 page faults?

Probably. We route permission or access flag faults via do_page_fault()
directly, while missing page table entries go via do_translation_fault().
The latter ends up in do_page_fault() only if the faulting address is in
the user address range.

My point was that we may not always invoke the trace callbacks if, for
example, the user tries to access kernel space (which results in a
SIGSEGV). While that's fine for RV, I wanted to know what is expected of
these trace points in general. Do we need to log such SIGSEGV-generating
events? We do log them if there's a permission fault.

-- 
Catalin
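
The routing described in the message above can be sketched as a small
userspace model. This is only an illustration, not kernel code: the
enum names, the model_* functions and the 48-bit user address limit are
all assumptions made for the sketch, and the kprobe check that precedes
the trace calls in the patch is left out. It shows which trace event, if
any, would fire for a given fault class, faulting address and privilege
level if the trace calls live only in do_page_fault().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical fault classes, named after the cases in the discussion. */
enum fault_class {
	FAULT_PERMISSION,
	FAULT_ACCESS_FLAG,
	FAULT_TRANSLATION,
};

/* Which trace event (if any) the modelled path would emit. */
enum trace_event {
	TRACE_NONE,
	TRACE_PAGE_FAULT_USER,
	TRACE_PAGE_FAULT_KERNEL,
};

/* Assumption: a 48-bit user VA range; the kernel uses its own check. */
#define MODEL_USER_VA_END (UINT64_C(1) << 48)

static bool model_is_user_addr(uint64_t addr)
{
	return addr < MODEL_USER_VA_END;
}

/* Models the patched do_page_fault(): pick the event by privilege level. */
static enum trace_event model_do_page_fault(bool from_user)
{
	return from_user ? TRACE_PAGE_FAULT_USER : TRACE_PAGE_FAULT_KERNEL;
}

/*
 * Permission and access flag faults go straight to do_page_fault().
 * Translation faults go through do_translation_fault(), which reaches
 * do_page_fault() only for addresses in the user range; otherwise no
 * trace event is emitted on this path.
 */
static enum trace_event model_route_fault(enum fault_class cls,
					  uint64_t addr, bool from_user)
{
	if (cls == FAULT_TRANSLATION && !model_is_user_addr(addr))
		return TRACE_NONE;

	return model_do_page_fault(from_user);
}
```

In this model, a user-mode translation fault on a kernel address yields
TRACE_NONE, while a user-mode permission fault on the same address
yields TRACE_PAGE_FAULT_USER, which is the asymmetry being asked about.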
Re: [PATCH v8 17/22] arm64: mm: Add page fault trace points
Posted by Nam Cao 8 months, 3 weeks ago
On Tue, May 20, 2025 at 03:15:01PM +0100, Catalin Marinas wrote:
> On Tue, May 20, 2025 at 02:25:48PM +0200, Nam Cao wrote:
> > On Mon, May 19, 2025 at 03:49:29PM +0100, Catalin Marinas wrote:
> > > On Mon, May 12, 2025 at 12:51:00PM +0200, Nam Cao wrote:
> > > > diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
> > > > index ec0a337891dd..55094030e377 100644
> > > > --- a/arch/arm64/mm/fault.c
> > > > +++ b/arch/arm64/mm/fault.c
> > > > @@ -44,6 +44,9 @@
> > > >  #include <asm/tlbflush.h>
> > > >  #include <asm/traps.h>
> > > >  
> > > > +#define CREATE_TRACE_POINTS
> > > > +#include <trace/events/exceptions.h>
> > > > +
> > > >  struct fault_info {
> > > >  	int	(*fn)(unsigned long far, unsigned long esr,
> > > >  		      struct pt_regs *regs);
> > > > @@ -559,6 +562,11 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr,
> > > >  	if (kprobe_page_fault(regs, esr))
> > > >  		return 0;
> > > >  
> > > > +	if (user_mode(regs))
> > > > +		trace_page_fault_user(addr, regs, esr);
> > > > +	else
> > > > +		trace_page_fault_kernel(addr, regs, esr);
> > > 
> > > What are the semantics for these tracepoints? When are they supposed to
> > > be called? In the RV context I guess you only care about the
> > > benign, recoverable faults that would affect timing. These tracepoints
> > > were generalised from the x86 code but I don't know enough about it to
> > > tell when they would be invoked.
> > > 
> > > For arm64, we also have the do_translation_fault() path for example that
> > > may or may not need to log such trace events.
> > 
> > These tracepoints are invoked for x86 page fault exceptions. Are arm64's
> > translation faults considered equivalent to x86 page faults?
> 
> Probably. We route permission or access flag faults via do_page_fault()
> directly, while missing page table entries go via do_translation_fault().
> The latter ends up in do_page_fault() only if the faulting address is in
> the user address range.
> 
> My point was that we may not always invoke the trace callbacks if, for
> example, the user tries to access the kernel space (and results in a
> SIGSEGV). While that's fine for RV, I wanted to know what is expected of
> these trace points in general. Do we need to log such SIGSEGV-generating
> events? We do log them if there's a permission fault.

I'm not sure. Digging into history, these tracepoints were added for LTTng.
So maybe LTTng's developer could answer this.

Added to the conversation: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Can you please give some insight into the above question?

Best regards,
Nam