DEN0154 states that writes to PMBPTR_EL1 or PMBSR_EL1 must be done while
the buffer is disabled (PMBLIMITR_EL1.E == 0). Re-arrange the interrupt
handler to always disable the buffer for non-spurious interrupts before
doing either.
Most of arm_spe_pmu_disable_and_drain_local() is now always done, so for
faults the only thing left to do is clear PMSCR_EL1.
Elaborate the comment in arm_spe_pmu_disable_and_drain_local() to
explain the ramifications of not doing it in the right order.
Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/perf/arm_spe_pmu.c | 33 +++++++++++++++++++++------------
1 file changed, 21 insertions(+), 12 deletions(-)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 6235ca7ecd48..5829947c8871 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -559,7 +559,12 @@ static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle)
static void arm_spe_pmu_disable_and_drain_local(void)
{
- /* Disable profiling at EL0 and EL1 */
+ /*
+ * To prevent the CONSTRAINED UNPREDICTABLE behavior of either writing
+ * to memory after the buffer is disabled, or SPE reporting an access
+ * not allowed event, we must disable sampling before draining the
+ * buffer.
+ */
write_sysreg_s(0, SYS_PMSCR_EL1);
isb();
@@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
*/
irq_work_run();
+ /*
+ * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1
+ * means that StatisticalProfilingEnabled() == false. So now we can
+ * safely disable the buffer.
+ */
+ write_sysreg_s(0, SYS_PMBLIMITR_EL1);
+ isb();
+
+ /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */
+ write_sysreg_s(0, SYS_PMBSR_EL1);
+
switch (act) {
case SPE_PMU_BUF_FAULT_ACT_FATAL:
/*
- * If a fatal exception occurred then leaving the profiling
- * buffer enabled is a recipe waiting to happen. Since
- * fatal faults don't always imply truncation, make sure
- * that the profiling buffer is disabled explicitly before
- * clearing the syndrome register.
+ * To complete the full disable sequence, also disable profiling
+ * at EL0 and EL1, we don't want to continue at all anymore.
*/
- arm_spe_pmu_disable_and_drain_local();
+ write_sysreg_s(0, SYS_PMSCR_EL1);
break;
case SPE_PMU_BUF_FAULT_ACT_OK:
/*
@@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev)
* PMBPTR might be misaligned, but we'll burn that bridge
* when we get to it.
*/
- if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) {
+ if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED))
arm_spe_perf_aux_output_begin(handle, event);
- isb();
- }
break;
case SPE_PMU_BUF_FAULT_ACT_SPURIOUS:
/* We've seen you before, but GCC has the memory of a sieve. */
break;
}
- /* The buffer pointers are now sane, so resume profiling. */
- write_sysreg_s(0, SYS_PMBSR_EL1);
return IRQ_HANDLED;
}
--
2.34.1
On Tue, Jul 01, 2025 at 04:31:58PM +0100, James Clark wrote: > DEN0154 states that writes to PMBPTR_EL1 or PMBSR_EL1 must be done while > the buffer is disabled (PMBLIMITR_EL1.E == 0). Re-arrange the interrupt > handler to always disable the buffer for non-spurious interrupts before > doing either. Why is DEN0154 not part of the Arm ARM? Trying to tighten the programmers model retrospectively in a separate spec is a lovely idea but I'm not hugely interested in it. The driver is written to the text in the Arm ARM and is going to stay that way. Any future changes in the architecture need to be backwards compatible. If we have to poke the thing differently in a VM, we might as well use hypercalls/PV. Will
Hi James, On Tue, Jul 01, 2025 at 04:31:58PM +0100, James Clark wrote: > DEN0154 states that writes to PMBPTR_EL1 or PMBSR_EL1 must be done while > the buffer is disabled (PMBLIMITR_EL1.E == 0). Re-arrange the interrupt > handler to always disable the buffer for non-spurious interrupts before > doing either. > > Most of arm_spe_pmu_disable_and_drain_local() is now always done, so for > faults the only thing left to do is clear PMSCR_EL1. > > Elaborate the comment in arm_spe_pmu_disable_and_drain_local() to > explain the ramifications of not doing it in the right order. > > Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") > Signed-off-by: James Clark <james.clark@linaro.org> > --- > drivers/perf/arm_spe_pmu.c | 33 +++++++++++++++++++++------------ > 1 file changed, 21 insertions(+), 12 deletions(-) > > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c > index 6235ca7ecd48..5829947c8871 100644 > --- a/drivers/perf/arm_spe_pmu.c > +++ b/drivers/perf/arm_spe_pmu.c > @@ -559,7 +559,12 @@ static void arm_spe_perf_aux_output_end(struct perf_output_handle *handle) > > static void arm_spe_pmu_disable_and_drain_local(void) > { > - /* Disable profiling at EL0 and EL1 */ > + /* > + * To prevent the CONSTRAINED UNPREDICTABLE behavior of either writing > + * to memory after the buffer is disabled, or SPE reporting an access > + * not allowed event, we must disable sampling before draining the > + * buffer. > + */ > write_sysreg_s(0, SYS_PMSCR_EL1); > isb(); > > @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > */ > irq_work_run(); > > + /* > + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 > + * means that StatisticalProfilingEnabled() == false. So now we can > + * safely disable the buffer. > + */ > + write_sysreg_s(0, SYS_PMBLIMITR_EL1); > + isb(); > + > + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > + write_sysreg_s(0, SYS_PMBSR_EL1); I've been trying to figure out if we need an ISB here to order clearing the service bit before the PMBLIMITR_EL1 write in arm_spe_perf_aux_output_begin(). arm_spe_perf_aux_output_begin() is called only when the buffer is full, and this rules out the event having the discard attribute (buffer management events are not generated in discard mode). If a new buffer cannot be allocated (perf_aux_output_begin() returns NULL), then PMBLIMITR_EL1.E remains 0, so no need to order the two writes. The only other path remaining in arm_spe_perf_aux_output_begin() is reprogramming the buffer, in which case there's an ISB before the write to PMBLIMITR_EL1. In conclusion, I think it's correct not to have an ISB here. > + > switch (act) { > case SPE_PMU_BUF_FAULT_ACT_FATAL: > /* > - * If a fatal exception occurred then leaving the profiling > - * buffer enabled is a recipe waiting to happen. Since > - * fatal faults don't always imply truncation, make sure > - * that the profiling buffer is disabled explicitly before > - * clearing the syndrome register. > + * To complete the full disable sequence, also disable profiling > + * at EL0 and EL1, we don't want to continue at all anymore. > */ > - arm_spe_pmu_disable_and_drain_local(); > + write_sysreg_s(0, SYS_PMSCR_EL1); Before: arm_spe_pmu_buf_get_fault_act: <drain buffer> ISB arm_spe_pmu_disable_and_drain_local: PMSCR_EL1 = 0 ISB # disables profiling <drain buffer> PMBLIMITR_EL1=0 PMBSR_EL1=0 ERET # synchronizes the two writes above Now: arm_spe_pmu_buf_get_fault_act: <drain buffer> ISB PMBLIMITR_EL1=0 ISB # disables profiling PMBSR_EL1=0 PMSCR_EL1=0 ERET # synchronizes the two writes above This looks correct to me. Thanks, Alex > break; > case SPE_PMU_BUF_FAULT_ACT_OK: > /* > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > * PMBPTR might be misaligned, but we'll burn that bridge > * when we get to it. > */ > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > arm_spe_perf_aux_output_begin(handle, event); > - isb(); > - } > break; > case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: > /* We've seen you before, but GCC has the memory of a sieve. */ > break; > } > > - /* The buffer pointers are now sane, so resume profiling. */ > - write_sysreg_s(0, SYS_PMBSR_EL1); > return IRQ_HANDLED; > } > > > -- > 2.34.1 >
On Tue, Jul 01, 2025 at 04:31:58PM +0100, James Clark wrote: [...] > @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > */ > irq_work_run(); > > + /* > + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 > + * means that StatisticalProfilingEnabled() == false. So now we can > + * safely disable the buffer. > + */ > + write_sysreg_s(0, SYS_PMBLIMITR_EL1); > + isb(); > + > + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > + write_sysreg_s(0, SYS_PMBSR_EL1); > + An important thing is about sequence: As described in arm_spe_pmu_disable_and_drain_local(), should we always clear ELs bits in PMSCR_EL1 before clear PMBLIMITR_EL1.E bit? As a reference, we could see TRBE always clear ELx bits before disable trace buffer. And a trivial flaw: If the TRUNCATED flag has been set, the irq_work_run() above runs the IRQ work to invoke the arm_spe_pmu_stop() to disable trace buffer, which clear SYS_PMBLIMITR_EL1.E bit. This is why the current code does not explictly clear SYS_PMBLIMITR_EL1.E bit. With this patch, the interrupt handler will clear SYS_PMBLIMITR_EL1.E bit twice for a trunacated case. > switch (act) { > case SPE_PMU_BUF_FAULT_ACT_FATAL: > /* > - * If a fatal exception occurred then leaving the profiling > - * buffer enabled is a recipe waiting to happen. Since > - * fatal faults don't always imply truncation, make sure > - * that the profiling buffer is disabled explicitly before > - * clearing the syndrome register. > + * To complete the full disable sequence, also disable profiling > + * at EL0 and EL1, we don't want to continue at all anymore. > */ > - arm_spe_pmu_disable_and_drain_local(); > + write_sysreg_s(0, SYS_PMSCR_EL1); > break; > case SPE_PMU_BUF_FAULT_ACT_OK: > /* > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > * PMBPTR might be misaligned, but we'll burn that bridge > * when we get to it. > */ > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > arm_spe_perf_aux_output_begin(handle, event); > - isb(); I am a bit suspecious we can remove this isb(). As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit for this? Thanks, Leo > - } > break; > case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: > /* We've seen you before, but GCC has the memory of a sieve. */ > break; > } > > - /* The buffer pointers are now sane, so resume profiling. */ > - write_sysreg_s(0, SYS_PMBSR_EL1); > return IRQ_HANDLED; > } > > > -- > 2.34.1 > >
On 04/07/2025 4:50 pm, Leo Yan wrote: > On Tue, Jul 01, 2025 at 04:31:58PM +0100, James Clark wrote: > > [...] > >> @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) >> */ >> irq_work_run(); >> >> + /* >> + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 >> + * means that StatisticalProfilingEnabled() == false. So now we can >> + * safely disable the buffer. >> + */ >> + write_sysreg_s(0, SYS_PMBLIMITR_EL1); >> + isb(); >> + >> + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ >> + write_sysreg_s(0, SYS_PMBSR_EL1); >> + > > An important thing is about sequence: > As described in arm_spe_pmu_disable_and_drain_local(), should we always > clear ELs bits in PMSCR_EL1 before clear PMBLIMITR_EL1.E bit? As a > reference, we could see TRBE always clear ELx bits before disable trace > buffer. > > And a trivial flaw: > > If the TRUNCATED flag has been set, the irq_work_run() above runs the > IRQ work to invoke the arm_spe_pmu_stop() to disable trace buffer, which > clear SYS_PMBLIMITR_EL1.E bit. This is why the current code does not > explictly clear SYS_PMBLIMITR_EL1.E bit. > > With this patch, the interrupt handler will clear SYS_PMBLIMITR_EL1.E > bit twice for a trunacated case. > > I suppose that's a rarer case that we don't necessarily have to optimize for. I don't think it will do any harm, but is it even possible to avoid? There are already some other duplications in the driver, for example in arm_spe_pmu_stop() we call arm_spe_pmu_disable_and_drain_local() which drains, and then arm_spe_pmu_buf_get_fault_act() which also drains again. >> switch (act) { >> case SPE_PMU_BUF_FAULT_ACT_FATAL: >> /* >> - * If a fatal exception occurred then leaving the profiling >> - * buffer enabled is a recipe waiting to happen. Since >> - * fatal faults don't always imply truncation, make sure >> - * that the profiling buffer is disabled explicitly before >> - * clearing the syndrome register. >> + * To complete the full disable sequence, also disable profiling >> + * at EL0 and EL1, we don't want to continue at all anymore. >> */ >> - arm_spe_pmu_disable_and_drain_local(); >> + write_sysreg_s(0, SYS_PMSCR_EL1); >> break; >> case SPE_PMU_BUF_FAULT_ACT_OK: >> /* >> @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) >> * PMBPTR might be misaligned, but we'll burn that bridge >> * when we get to it. >> */ >> - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { >> + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) >> arm_spe_perf_aux_output_begin(handle, event); >> - isb(); > > I am a bit suspecious we can remove this isb(). > > As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), > after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit > for this? > > Thanks, > Leo > Wasn't this isb() to separate the programming of the registers with the status register clear at the end of this function to enable profiling? But now we enable profiling with the write to PMBLIMITR_EL1 in arm_spe_perf_aux_output_begin() and the last thing here is the ERET. That's specifically mentioned as enough synchronization in PKLXF: In the common case, this is an ERET instruction that returns to a different Exception level where tracing is allowed. >> - } >> break; >> case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: >> /* We've seen you before, but GCC has the memory of a sieve. */ >> break; >> } >> >> - /* The buffer pointers are now sane, so resume profiling. */ >> - write_sysreg_s(0, SYS_PMBSR_EL1); >> return IRQ_HANDLED; >> } >> >> >> -- >> 2.34.1 >> >>
On Mon, Jul 07, 2025 at 12:39:57PM +0100, James Clark wrote: [...] > > > @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > */ > > > irq_work_run(); > > > + /* > > > + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 > > > + * means that StatisticalProfilingEnabled() == false. So now we can > > > + * safely disable the buffer. > > > + */ > > > + write_sysreg_s(0, SYS_PMBLIMITR_EL1); > > > + isb(); > > > + > > > + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > > > + write_sysreg_s(0, SYS_PMBSR_EL1); > > > + > > > > An important thing is about sequence: > > As described in arm_spe_pmu_disable_and_drain_local(), should we always > > clear ELs bits in PMSCR_EL1 before clear PMBLIMITR_EL1.E bit? As a > > reference, we could see TRBE always clear ELx bits before disable trace > > buffer. > > > > And a trivial flaw: > > > > If the TRUNCATED flag has been set, the irq_work_run() above runs the > > IRQ work to invoke the arm_spe_pmu_stop() to disable trace buffer, which > > clear SYS_PMBLIMITR_EL1.E bit. This is why the current code does not > > explictly clear SYS_PMBLIMITR_EL1.E bit. > > > > With this patch, the interrupt handler will clear SYS_PMBLIMITR_EL1.E > > bit twice for a trunacated case. > > I suppose that's a rarer case that we don't necessarily have to optimize > for. I don't think it will do any harm, but is it even possible to avoid? > > There are already some other duplications in the driver, for example in > arm_spe_pmu_stop() we call arm_spe_pmu_disable_and_drain_local() which > drains, and then arm_spe_pmu_buf_get_fault_act() which also drains again. If we don't need to worry about duplicated operations in the truncated case, then for easier maintenance and better readability, I'm wondering if we could simplify the interrupt handler as follows: arm_spe_pmu_irq_handler() { ... act = arm_spe_pmu_buf_get_fault_act(handle); if (act == SPE_PMU_BUF_FAULT_ACT_SPURIOUS) return IRQ_NONE; arm_spe_pmu_disable_and_drain_local(); /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ write_sysreg_s(0, SYS_PMBSR_EL1); isb(); switch (act) { ... } } This approach complies with DEN0154 - we must clear PMBLIMITR_EL1.E before writing to other SPE system registers (e.g., PMBSR). The reason for using arm_spe_pmu_disable_and_drain_local() is that we first need to disable profiling instructions by clearing PMSCR_EL1/EL2, and then is it safe to disable the profiling buffer. [...] > > > case SPE_PMU_BUF_FAULT_ACT_OK: > > > /* > > > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > * PMBPTR might be misaligned, but we'll burn that bridge > > > * when we get to it. > > > */ > > > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > > > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > > > arm_spe_perf_aux_output_begin(handle, event); > > > - isb(); > > > > I am a bit suspecious we can remove this isb(). > > > > As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), > > after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit > > for this? > > Wasn't this isb() to separate the programming of the registers with the > status register clear at the end of this function to enable profiling? Enabling profiling buffer followed an isb() is not only for separating other register programming. As described in section D17.9, Synchronization and Statistical Profiling in Arm ARM: "A Context Synchronization event guarantees that a direct write to a System register made by the PE in program order before the Context synchronization event are observable by indirect reads and indirect writes of the same System register made by a profiling operation relating to a sampled operation in program order after the Context synchronization event." My understanding is: after the ARM SPE profiling is enabled, the followed ISB is a Synchronization to make sure the system register values are observed by SPE. And we cannot rely on ERET, especially if we are tracing the kernel mode. Thanks, Leo > But now we enable profiling with the write to PMBLIMITR_EL1 in > arm_spe_perf_aux_output_begin() and the last thing here is the ERET. That's > specifically mentioned as enough synchronization in PKLXF: > > In the common case, this is an ERET instruction that returns to a > different Exception level where tracing is allowed. > > > > - } > > > break; > > > case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: > > > /* We've seen you before, but GCC has the memory of a sieve. */ > > > break; > > > } > > > - /* The buffer pointers are now sane, so resume profiling. */ > > > - write_sysreg_s(0, SYS_PMBSR_EL1); > > > return IRQ_HANDLED; > > > } > > > > > > -- > > > 2.34.1 > > > > > > >
Hi Leo, James, On Mon, Jul 07, 2025 at 04:37:10PM +0100, Leo Yan wrote: > On Mon, Jul 07, 2025 at 12:39:57PM +0100, James Clark wrote: > > [...] > > > > > @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > > */ > > > > irq_work_run(); > > > > + /* > > > > + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 > > > > + * means that StatisticalProfilingEnabled() == false. So now we can > > > > + * safely disable the buffer. > > > > + */ > > > > + write_sysreg_s(0, SYS_PMBLIMITR_EL1); > > > > + isb(); > > > > + > > > > + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > > > > + write_sysreg_s(0, SYS_PMBSR_EL1); > > > > + > > > > > > An important thing is about sequence: > > > As described in arm_spe_pmu_disable_and_drain_local(), should we always > > > clear ELs bits in PMSCR_EL1 before clear PMBLIMITR_EL1.E bit? As a > > > reference, we could see TRBE always clear ELx bits before disable trace > > > buffer. > > > > > > And a trivial flaw: > > > > > > If the TRUNCATED flag has been set, the irq_work_run() above runs the > > > IRQ work to invoke the arm_spe_pmu_stop() to disable trace buffer, which > > > clear SYS_PMBLIMITR_EL1.E bit. This is why the current code does not > > > explictly clear SYS_PMBLIMITR_EL1.E bit. > > > > > > With this patch, the interrupt handler will clear SYS_PMBLIMITR_EL1.E > > > bit twice for a trunacated case. > > > > I suppose that's a rarer case that we don't necessarily have to optimize > > for. I don't think it will do any harm, but is it even possible to avoid? > > > > There are already some other duplications in the driver, for example in > > arm_spe_pmu_stop() we call arm_spe_pmu_disable_and_drain_local() which > > drains, and then arm_spe_pmu_buf_get_fault_act() which also drains again. > > If we don't need to worry about duplicated operations in the truncated > case, then for easier maintenance and better readability, I'm wondering > if we could simplify the interrupt handler as follows: > > arm_spe_pmu_irq_handler() > { > ... > > act = arm_spe_pmu_buf_get_fault_act(handle); > if (act == SPE_PMU_BUF_FAULT_ACT_SPURIOUS) > return IRQ_NONE; > > arm_spe_pmu_disable_and_drain_local(); > > /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > write_sysreg_s(0, SYS_PMBSR_EL1); > isb(); > > switch (act) { > ... > } > } > > This approach complies with DEN0154 - we must clear PMBLIMITR_EL1.E > before writing to other SPE system registers (e.g., PMBSR). > > The reason for using arm_spe_pmu_disable_and_drain_local() is that we > first need to disable profiling instructions by clearing PMSCR_EL1/EL2, > and then is it safe to disable the profiling buffer. > > [...] > > > > > case SPE_PMU_BUF_FAULT_ACT_OK: > > > > /* > > > > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > > * PMBPTR might be misaligned, but we'll burn that bridge > > > > * when we get to it. > > > > */ > > > > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > > > > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > > > > arm_spe_perf_aux_output_begin(handle, event); > > > > - isb(); > > > > > > I am a bit suspecious we can remove this isb(). > > > > > > As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), > > > after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit > > > for this? > > > > Wasn't this isb() to separate the programming of the registers with the > > status register clear at the end of this function to enable profiling? > > Enabling profiling buffer followed an isb() is not only for separating > other register programming. > > As described in section D17.9, Synchronization and Statistical Profiling > in Arm ARM: > > "A Context Synchronization event guarantees that a direct write to a > System register made by the PE in program order before the Context > synchronization event are observable by indirect reads and indirect > writes of the same System register made by a profiling operation > relating to a sampled operation in program order after the Context > synchronization event." > > My understanding is: after the ARM SPE profiling is enabled, the > followed ISB is a Synchronization to make sure the system register > values are observed by SPE. And we cannot rely on ERET, especially if > we are tracing the kernel mode. Thought about this some more. Before: arm_spe_pmu_buf_get_fault_act: <drain buffer> ISB arm_spe_perf_aux_output_begin: PMBLIMITR_EL1.E = 1 ISB PMBSR_EL1.S = 0 ERET Now: PMBLIMITR_EL1 = 0 ISB PMBSR_EL1.S = 0 arm_spe_perf_aux_output_begin: ISB PMBLIMITR_EL1.E = 1 ERET I don't see much of a difference between the two sequences - the point after which we can be sure that profiling is enabled remains the ERET from the exception return. The only difference is that, before this change, the ERET synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the PMBLIMITR_EL1.E bit. Thoughts? Thanks, Alex
Hi Alexandru, On Wed, Jul 09, 2025 at 11:08:57AM +0100, Alexandru Elisei wrote: [...] > > > > > case SPE_PMU_BUF_FAULT_ACT_OK: > > > > > /* > > > > > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > > > * PMBPTR might be misaligned, but we'll burn that bridge > > > > > * when we get to it. > > > > > */ > > > > > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > > > > > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > > > > > arm_spe_perf_aux_output_begin(handle, event); > > > > > - isb(); > > > > > > > > I am a bit suspecious we can remove this isb(). > > > > > > > > As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), > > > > after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit > > > > for this? > > > > > > Wasn't this isb() to separate the programming of the registers with the > > > status register clear at the end of this function to enable profiling? > > > > Enabling profiling buffer followed an isb() is not only for separating > > other register programming. > > > > As described in section D17.9, Synchronization and Statistical Profiling > > in Arm ARM: > > > > "A Context Synchronization event guarantees that a direct write to a > > System register made by the PE in program order before the Context > > synchronization event are observable by indirect reads and indirect > > writes of the same System register made by a profiling operation > > relating to a sampled operation in program order after the Context > > synchronization event." > > > > My understanding is: after the ARM SPE profiling is enabled, the > > followed ISB is a Synchronization to make sure the system register > > values are observed by SPE. And we cannot rely on ERET, especially if > > we are tracing the kernel mode. > > Thought about this some more. > > Before: > > arm_spe_pmu_buf_get_fault_act: > <drain buffer> > ISB > arm_spe_perf_aux_output_begin: > PMBLIMITR_EL1.E = 1 > ISB > PMBSR_EL1.S = 0 > ERET > > Now: > > PMBLIMITR_EL1 = 0 > ISB > > PMBSR_EL1.S = 0 > arm_spe_perf_aux_output_begin: > ISB > PMBLIMITR_EL1.E = 1 > ERET > > I don't see much of a difference between the two sequences - the point after > which we can be sure that profiling is enabled remains the ERET from the > exception return. The only difference is that, before this change, the ERET > synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the > PMBLIMITR_EL1.E bit. > > Thoughts? To make the discussion easier, I'll focus on the trace enabling flow in this reply. My understanding of a sane flow would be: arm_spe_pmu_irq_handler() { arm_spe_perf_aux_output_begin() { SYS_PMBPTR_EL1 = base; ISB // Synchronization between SPE register setting and // enabling profiling buffer. PMBLIMITR_EL1.E = 1; } ISB // Context synchronization event to ensure visibility to SPE } ... start trace ... (Bottom half, e.g., softirq, etc) ERET In the current code, arm_spe_perf_aux_output_begin() is followed by an ISB, which serves as a context synchronization event to ensure visibility to the SPE. After that, it ensures that the trace unit will function correctly. I understand that the Software Usage PKLXF recommends using an ERET as the synchronization point. However, between enabling the profiling buffer and the ERET, the kernel might execute other operations (e.g., softirqs, tasklets, etc.). Therefore, it seems to me that using ERET as the synchronization point may be too late. This is why I think we should keep an ISB after arm_spe_perf_aux_output_begin(). Thanks, Leo
On 14/07/2025 9:58 am, Leo Yan wrote: > Hi Alexandru, > > On Wed, Jul 09, 2025 at 11:08:57AM +0100, Alexandru Elisei wrote: > > [...] > >>>>>> case SPE_PMU_BUF_FAULT_ACT_OK: >>>>>> /* >>>>>> @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) >>>>>> * PMBPTR might be misaligned, but we'll burn that bridge >>>>>> * when we get to it. >>>>>> */ >>>>>> - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { >>>>>> + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) >>>>>> arm_spe_perf_aux_output_begin(handle, event); >>>>>> - isb(); >>>>> >>>>> I am a bit suspecious we can remove this isb(). >>>>> >>>>> As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), >>>>> after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit >>>>> for this? >>>> >>>> Wasn't this isb() to separate the programming of the registers with the >>>> status register clear at the end of this function to enable profiling? >>> >>> Enabling profiling buffer followed an isb() is not only for separating >>> other register programming. >>> >>> As described in section D17.9, Synchronization and Statistical Profiling >>> in Arm ARM: >>> >>> "A Context Synchronization event guarantees that a direct write to a >>> System register made by the PE in program order before the Context >>> synchronization event are observable by indirect reads and indirect >>> writes of the same System register made by a profiling operation >>> relating to a sampled operation in program order after the Context >>> synchronization event." >>> >>> My understanding is: after the ARM SPE profiling is enabled, the >>> followed ISB is a Synchronization to make sure the system register >>> values are observed by SPE. And we cannot rely on ERET, especially if >>> we are tracing the kernel mode. >> >> Thought about this some more. >> >> Before: >> >> arm_spe_pmu_buf_get_fault_act: >> <drain buffer> >> ISB >> arm_spe_perf_aux_output_begin: >> PMBLIMITR_EL1.E = 1 >> ISB >> PMBSR_EL1.S = 0 >> ERET >> >> Now: >> >> PMBLIMITR_EL1 = 0 >> ISB >> >> PMBSR_EL1.S = 0 >> arm_spe_perf_aux_output_begin: >> ISB >> PMBLIMITR_EL1.E = 1 >> ERET >> >> I don't see much of a difference between the two sequences - the point after >> which we can be sure that profiling is enabled remains the ERET from the >> exception return. The only difference is that, before this change, the ERET >> synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the >> PMBLIMITR_EL1.E bit. >> >> Thoughts? > > To make the discussion easier, I'll focus on the trace enabling flow > in this reply. > > My understanding of a sane flow would be: > > arm_spe_pmu_irq_handler() { > arm_spe_perf_aux_output_begin() { > SYS_PMBPTR_EL1 = base; > > ISB // Synchronization between SPE register setting and > // enabling profiling buffer. > PMBLIMITR_EL1.E = 1; > } > ISB // Context synchronization event to ensure visibility to SPE > } > > ... start trace ... (Bottom half, e.g., softirq, etc) > > ERET > > In the current code, arm_spe_perf_aux_output_begin() is followed by an > ISB, which serves as a context synchronization event to ensure > visibility to the SPE. After that, it ensures that the trace unit will > function correctly. > But I think Alex's point is that in the existing code the thing that finally enables trace (PMBSR_EL1.S = 0) isn't followed by an isb(), only an ERET. So the new flow isn't any different in that regard. > I understand that the Software Usage PKLXF recommends using an ERET as > the synchronization point. However, between enabling the profiling > buffer and the ERET, the kernel might execute other operations (e.g., > softirqs, tasklets, etc.). Isn't preemption disabled? Surely that's not possible. Even if something did run it wouldn't be anything that touches the SPE registers, and we're sure there's an isb() between setting up the buffer and the final PMBLIMITR_EL1.E = 1 > > Therefore, it seems to me that using ERET as the synchronization point > may be too late. This is why I think we should keep an ISB after > arm_spe_perf_aux_output_begin(). > > Thanks, > Leo Wouldn't that make the ERET too late even in the current code then? But I think we're agreeing there's no issue there? James
On Mon, Jul 21, 2025 at 02:20:15PM +0100, James Clark wrote: [...] > > > Thought about this some more. > > > > > > Before: > > > > > > arm_spe_pmu_buf_get_fault_act: > > > <drain buffer> > > > ISB > > > arm_spe_perf_aux_output_begin: > > > PMBLIMITR_EL1.E = 1 > > > ISB > > > PMBSR_EL1.S = 0 > > > ERET > > > > > > Now: > > > > > > PMBLIMITR_EL1 = 0 > > > ISB > > > > > > PMBSR_EL1.S = 0 > > > arm_spe_perf_aux_output_begin: > > > ISB > > > PMBLIMITR_EL1.E = 1 > > > ERET > > > > > > I don't see much of a difference between the two sequences - the point after > > > which we can be sure that profiling is enabled remains the ERET from the > > > exception return. The only difference is that, before this change, the ERET > > > synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the > > > PMBLIMITR_EL1.E bit. > > > > > > Thoughts? > > > > To make the discussion easier, I'll focus on the trace enabling flow > > in this reply. > > > > My understanding of a sane flow would be: > > > > arm_spe_pmu_irq_handler() { > > arm_spe_perf_aux_output_begin() { > > SYS_PMBPTR_EL1 = base; > > > > ISB // Synchronization between SPE register setting and > > // enabling profiling buffer. > > PMBLIMITR_EL1.E = 1; > > } > > ISB // Context synchronization event to ensure visibility to SPE > > } > > > > ... start trace ... (Bottom half, e.g., softirq, etc) > > > > ERET > > > > In the current code, arm_spe_perf_aux_output_begin() is followed by an > > ISB, which serves as a context synchronization event to ensure > > visibility to the SPE. After that, it ensures that the trace unit will > > function correctly. > > > > But I think Alex's point is that in the existing code the thing that finally > enables trace (PMBSR_EL1.S = 0) isn't followed by an isb(), only an ERET. So > the new flow isn't any different in that regard. Thanks for explanation. > > I understand that the Software Usage PKLXF recommends using an ERET as > > the synchronization point. However, between enabling the profiling > > buffer and the ERET, the kernel might execute other operations (e.g., > > softirqs, tasklets, etc.). > > Isn't preemption disabled? Surely that's not possible. Even if something did > run it wouldn't be anything that touches the SPE registers, and we're sure > there's an isb() between setting up the buffer and the final PMBLIMITR_EL1.E > = 1 Yes, bottom half runs in preemtion disabled state. See: el1_interrupt() -> __el1_irq() -> irq_exit_rcu() -> invoke_softirq() In some cases, sotfirq and tasklet might even cause long latency (I believe it can be milliseconds or even longer), this is why ftrace "irqsoff" tracer is used to profile latency caused by irq off state. Bottom half does not modify SPE registsers, but it can be a long road between enabling SPE trace and ERET. > > Therefore, it seems to me that using ERET as the synchronization point > > may be too late. This is why I think we should keep an ISB after > > arm_spe_perf_aux_output_begin(). > > Wouldn't that make the ERET too late even in the current code then? But I > think we're agreeing there's no issue there? I would say ERET is too late for current code as well. Thanks, Leo
On 21/07/2025 4:21 pm, Leo Yan wrote: > On Mon, Jul 21, 2025 at 02:20:15PM +0100, James Clark wrote: > > [...] > >>>> Thought about this some more. >>>> >>>> Before: >>>> >>>> arm_spe_pmu_buf_get_fault_act: >>>> <drain buffer> >>>> ISB >>>> arm_spe_perf_aux_output_begin: >>>> PMBLIMITR_EL1.E = 1 >>>> ISB >>>> PMBSR_EL1.S = 0 >>>> ERET >>>> >>>> Now: >>>> >>>> PMBLIMITR_EL1 = 0 >>>> ISB >>>> >>>> PMBSR_EL1.S = 0 >>>> arm_spe_perf_aux_output_begin: >>>> ISB >>>> PMBLIMITR_EL1.E = 1 >>>> ERET >>>> >>>> I don't see much of a difference between the two sequences - the point after >>>> which we can be sure that profiling is enabled remains the ERET from the >>>> exception return. The only difference is that, before this change, the ERET >>>> synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the >>>> PMBLIMITR_EL1.E bit. >>>> >>>> Thoughts? >>> >>> To make the discussion easier, I'll focus on the trace enabling flow >>> in this reply. >>> >>> My understanding of a sane flow would be: >>> >>> arm_spe_pmu_irq_handler() { >>> arm_spe_perf_aux_output_begin() { >>> SYS_PMBPTR_EL1 = base; >>> >>> ISB // Synchronization between SPE register setting and >>> // enabling profiling buffer. >>> PMBLIMITR_EL1.E = 1; >>> } >>> ISB // Context synchronization event to ensure visibility to SPE >>> } >>> >>> ... start trace ... (Bottom half, e.g., softirq, etc) >>> >>> ERET >>> >>> In the current code, arm_spe_perf_aux_output_begin() is followed by an >>> ISB, which serves as a context synchronization event to ensure >>> visibility to the SPE. After that, it ensures that the trace unit will >>> function correctly. >>> >> >> But I think Alex's point is that in the existing code the thing that finally >> enables trace (PMBSR_EL1.S = 0) isn't followed by an isb(), only an ERET. So >> the new flow isn't any different in that regard. > > Thanks for explanation. > >>> I understand that the Software Usage PKLXF recommends using an ERET as >>> the synchronization point. However, between enabling the profiling >>> buffer and the ERET, the kernel might execute other operations (e.g., >>> softirqs, tasklets, etc.). >> >> Isn't preemption disabled? Surely that's not possible. Even if something did >> run it wouldn't be anything that touches the SPE registers, and we're sure >> there's an isb() between setting up the buffer and the final PMBLIMITR_EL1.E >> = 1 > > Yes, bottom half runs in preemtion disabled state. See: > > el1_interrupt() -> __el1_irq() -> irq_exit_rcu() -> invoke_softirq() > > In some cases, sotfirq and tasklet might even cause long latency (I > believe it can be milliseconds or even longer), this is why ftrace > "irqsoff" tracer is used to profile latency caused by irq off state. > > Bottom half does not modify SPE registsers, but it can be a long road > between enabling SPE trace and ERET. > >>> Therefore, it seems to me that using ERET as the synchronization point >>> may be too late. This is why I think we should keep an ISB after >>> arm_spe_perf_aux_output_begin(). >> >> Wouldn't that make the ERET too late even in the current code then? But I >> think we're agreeing there's no issue there? > > I would say ERET is too late for current code as well. > > Thanks, > Leo Ok I get it now. The point is that there is stuff in between the return in the SPE IRQ handler and the actual ERET. If someone is interested in sampling the kernel then they might miss sampling a small amount of that. It's not a correctness thing, just reducing potential gaps in the samples. I can add another commit to add it, but it doesn't look like it would be a fixes commit either, just an improvement. And the same issue applies to the existing code too. James
Hi Leo, James, On Tue, Jul 22, 2025 at 03:46:11PM +0100, James Clark wrote: > > > On 21/07/2025 4:21 pm, Leo Yan wrote: > > On Mon, Jul 21, 2025 at 02:20:15PM +0100, James Clark wrote: > > > > [...] > > > > > > > Thought about this some more. > > > > > > > > > > Before: > > > > > > > > > > arm_spe_pmu_buf_get_fault_act: > > > > > <drain buffer> > > > > > ISB > > > > > arm_spe_perf_aux_output_begin: > > > > > PMBLIMITR_EL1.E = 1 > > > > > ISB > > > > > PMBSR_EL1.S = 0 > > > > > ERET > > > > > > > > > > Now: > > > > > > > > > > PMBLIMITR_EL1 = 0 > > > > > ISB > > > > > > > > > > PMBSR_EL1.S = 0 > > > > > arm_spe_perf_aux_output_begin: > > > > > ISB > > > > > PMBLIMITR_EL1.E = 1 > > > > > ERET > > > > > > > > > > I don't see much of a difference between the two sequences - the point after > > > > > which we can be sure that profiling is enabled remains the ERET from the > > > > > exception return. The only difference is that, before this change, the ERET > > > > > synchronized clearing the PMBSR_EL1.S bit, now it synchronizes setting the > > > > > PMBLIMITR_EL1.E bit. > > > > > > > > > > Thoughts? > > > > > > > > To make the discussion easier, I'll focus on the trace enabling flow > > > > in this reply. > > > > > > > > My understanding of a sane flow would be: > > > > > > > > arm_spe_pmu_irq_handler() { > > > > arm_spe_perf_aux_output_begin() { > > > > SYS_PMBPTR_EL1 = base; > > > > > > > > ISB // Synchronization between SPE register setting and > > > > // enabling profiling buffer. > > > > PMBLIMITR_EL1.E = 1; > > > > } > > > > ISB // Context synchronization event to ensure visibility to SPE > > > > } > > > > > > > > ... start trace ... (Bottom half, e.g., softirq, etc) > > > > > > > > ERET > > > > > > > > In the current code, arm_spe_perf_aux_output_begin() is followed by an > > > > ISB, which serves as a context synchronization event to ensure > > > > visibility to the SPE. After that, it ensures that the trace unit will > > > > function correctly. > > > > > > > > > > But I think Alex's point is that in the existing code the thing that finally > > > enables trace (PMBSR_EL1.S = 0) isn't followed by an isb(), only an ERET. So > > > the new flow isn't any different in that regard. > > > > Thanks for explanation. > > > > > > I understand that the Software Usage PKLXF recommends using an ERET as > > > > the synchronization point. However, between enabling the profiling > > > > buffer and the ERET, the kernel might execute other operations (e.g., > > > > softirqs, tasklets, etc.). > > > > > > Isn't preemption disabled? Surely that's not possible. Even if something did > > > run it wouldn't be anything that touches the SPE registers, and we're sure > > > there's an isb() between setting up the buffer and the final PMBLIMITR_EL1.E > > > = 1 > > > > Yes, bottom half runs in preemtion disabled state. See: > > > > el1_interrupt() -> __el1_irq() -> irq_exit_rcu() -> invoke_softirq() > > > > In some cases, sotfirq and tasklet might even cause long latency (I > > believe it can be milliseconds or even longer), this is why ftrace > > "irqsoff" tracer is used to profile latency caused by irq off state. > > > > Bottom half does not modify SPE registsers, but it can be a long road > > between enabling SPE trace and ERET. > > > > > > Therefore, it seems to me that using ERET as the synchronization point > > > > may be too late. This is why I think we should keep an ISB after > > > > arm_spe_perf_aux_output_begin(). > > > > > > Wouldn't that make the ERET too late even in the current code then? But I > > > think we're agreeing there's no issue there? > > > > I would say ERET is too late for current code as well. > > > > Thanks, > > Leo Thanks for the explanation and the analysis. I think we were looking at the patch from different point of views: I was interested in not changing the current behaviour, you were saying that the existing behaviour can be improved upon. > Ok I get it now. The point is that there is stuff in between the return in > the SPE IRQ handler and the actual ERET. If someone is interested in > sampling the kernel then they might miss sampling a small amount of that. > > It's not a correctness thing, just reducing potential gaps in the samples. I > can add another commit to add it, but it doesn't look like it would be a > fixes commit either, just an improvement. And the same issue applies to the > existing code too. I agree here, this looks on an improvement on existing behaviour. I would keep it as a patch separate from this series, as it's not a fix, and it's not related to to the rules from DEN0154. Thanks, Alex
Hi Leo, On Mon, Jul 07, 2025 at 04:37:10PM +0100, Leo Yan wrote: > On Mon, Jul 07, 2025 at 12:39:57PM +0100, James Clark wrote: > > [...] > > > > > @@ -661,16 +666,24 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > > */ > > > > irq_work_run(); > > > > + /* > > > > + * arm_spe_pmu_buf_get_fault_act() already drained, and PMBSR_EL1.S == 1 > > > > + * means that StatisticalProfilingEnabled() == false. So now we can > > > > + * safely disable the buffer. > > > > + */ > > > > + write_sysreg_s(0, SYS_PMBLIMITR_EL1); > > > > + isb(); > > > > + > > > > + /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > > > > + write_sysreg_s(0, SYS_PMBSR_EL1); > > > > + > > > > > > An important thing is about sequence: > > > As described in arm_spe_pmu_disable_and_drain_local(), should we always > > > clear ELs bits in PMSCR_EL1 before clear PMBLIMITR_EL1.E bit? As a > > > reference, we could see TRBE always clear ELx bits before disable trace > > > buffer. > > > > > > And a trivial flaw: > > > > > > If the TRUNCATED flag has been set, the irq_work_run() above runs the > > > IRQ work to invoke the arm_spe_pmu_stop() to disable trace buffer, which > > > clear SYS_PMBLIMITR_EL1.E bit. This is why the current code does not > > > explictly clear SYS_PMBLIMITR_EL1.E bit. > > > > > > With this patch, the interrupt handler will clear SYS_PMBLIMITR_EL1.E > > > bit twice for a trunacated case. > > > > I suppose that's a rarer case that we don't necessarily have to optimize > > for. I don't think it will do any harm, but is it even possible to avoid? > > > > There are already some other duplications in the driver, for example in > > arm_spe_pmu_stop() we call arm_spe_pmu_disable_and_drain_local() which > > drains, and then arm_spe_pmu_buf_get_fault_act() which also drains again. > > If we don't need to worry about duplicated operations in the truncated > case, then for easier maintenance and better readability, I'm wondering > if we could simplify the interrupt handler as follows: > > arm_spe_pmu_irq_handler() > { > ... > > act = arm_spe_pmu_buf_get_fault_act(handle); > if (act == SPE_PMU_BUF_FAULT_ACT_SPURIOUS) > return IRQ_NONE; > > arm_spe_pmu_disable_and_drain_local(); > > /* Status can be cleared now that PMBLIMITR_EL1.E == 0 */ > write_sysreg_s(0, SYS_PMBSR_EL1); > isb(); > > switch (act) { > ... > } > } > > This approach complies with DEN0154 - we must clear PMBLIMITR_EL1.E > before writing to other SPE system registers (e.g., PMBSR). > > The reason for using arm_spe_pmu_disable_and_drain_local() is that we > first need to disable profiling instructions by clearing PMSCR_EL1/EL2, > and then is it safe to disable the profiling buffer. Sampling is disabled when ProfilingBufferEnabled() = false (look at the pseudocode for SPEPreExecution() in the Arm ARM). ProfilingBufferEnabled() = false if PMBLIMITR_EL1.E = 0 *or* PMBSR_EL1.S = 1. One way to think about it is: if PMBLIMITR_EL1.E = 0, then there's no point in sampling instructions and creating records because the profiling unit cannot write them to memory; and if PMBSR_EL1.S = 1, something went wrong with the buffer, or the buffer is full, and similarly there's no point in sampling instructions and creating records because, again, the profiling unit cannot write them to memory. > > [...] > > > > > case SPE_PMU_BUF_FAULT_ACT_OK: > > > > /* > > > > @@ -679,18 +692,14 @@ static irqreturn_t arm_spe_pmu_irq_handler(int irq, void *dev) > > > > * PMBPTR might be misaligned, but we'll burn that bridge > > > > * when we get to it. > > > > */ > > > > - if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) { > > > > + if (!(handle->aux_flags & PERF_AUX_FLAG_TRUNCATED)) > > > > arm_spe_perf_aux_output_begin(handle, event); > > > > - isb(); > > > > > > I am a bit suspecious we can remove this isb(). > > > > > > As a reference to the software usage PKLXF in Arm ARM (DDI 0487 L.a), > > > after enable TRBE trace unit, an ISB is mandatory. Maybe check a bit > > > for this? > > > > Wasn't this isb() to separate the programming of the registers with the > > status register clear at the end of this function to enable profiling? > > Enabling profiling buffer followed an isb() is not only for separating > other register programming. > > As described in section D17.9, Synchronization and Statistical Profiling > in Arm ARM: > > "A Context Synchronization event guarantees that a direct write to a > System register made by the PE in program order before the Context > synchronization event are observable by indirect reads and indirect > writes of the same System register made by a profiling operation > relating to a sampled operation in program order after the Context > synchronization event." > > My understanding is: after the ARM SPE profiling is enabled, the > followed ISB is a Synchronization to make sure the system register > values are observed by SPE. And we cannot rely on ERET, especially if > we are tracing the kernel mode. DDI 0487L.a, rule R_BWCFK: If the Effective value of SCTLR_ELx.EOS is 1, the exception return is a Context synchronization event. Linux always sets EOS. Thanks, Alex > > Thanks, > Leo > > > But now we enable profiling with the write to PMBLIMITR_EL1 in > > arm_spe_perf_aux_output_begin() and the last thing here is the ERET. That's > > specifically mentioned as enough synchronization in PKLXF: > > > > In the common case, this is an ERET instruction that returns to a > > different Exception level where tracing is allowed. > > > > > > - } > > > > break; > > > > case SPE_PMU_BUF_FAULT_ACT_SPURIOUS: > > > > /* We've seen you before, but GCC has the memory of a sieve. */ > > > > break; > > > > } > > > > - /* The buffer pointers are now sane, so resume profiling. */ > > > > - write_sysreg_s(0, SYS_PMBSR_EL1); > > > > return IRQ_HANDLED; > > > > } > > > > > > > > -- > > > > 2.34.1 > > > > > > > > > >
© 2016 - 2025 Red Hat, Inc.