DEN0154 states that PMBPTR_EL1 must not be modified while the profiling
buffer is enabled. Ensure that enabling the buffer comes after setting
PMBPTR_EL1 by inserting an isb().
This only applies to guests for now, but in future versions of the
architecture the PE will be allowed to behave in the same way.
Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension")
Signed-off-by: James Clark <james.clark@linaro.org>
---
drivers/perf/arm_spe_pmu.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c
index 3efed8839a4e..6235ca7ecd48 100644
--- a/drivers/perf/arm_spe_pmu.c
+++ b/drivers/perf/arm_spe_pmu.c
@@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle,
limit += (u64)buf->base;
base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf);
write_sysreg_s(base, SYS_PMBPTR_EL1);
+ isb();
out_write_limit:
write_sysreg_s(limit, SYS_PMBLIMITR_EL1);
--
2.34.1
On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: > DEN0154 states that PMBPTR_EL1 must not be modified while the profiling > buffer is enabled. Ensure that enabling the buffer comes after setting > PMBPTR_EL1 by inserting an isb(). > > This only applies to guests for now, but in future versions of the > architecture the PE will be allowed to behave in the same way. > > Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") > Signed-off-by: James Clark <james.clark@linaro.org> > --- > drivers/perf/arm_spe_pmu.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c > index 3efed8839a4e..6235ca7ecd48 100644 > --- a/drivers/perf/arm_spe_pmu.c > +++ b/drivers/perf/arm_spe_pmu.c > @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, > limit += (u64)buf->base; > base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); > write_sysreg_s(base, SYS_PMBPTR_EL1); > + isb(); Hmm. arm_spe_perf_aux_output_begin() is only called in two places: 1. From arm_spe_pmu_start() 2. From arm_spe_pmu_irq_handler() For (1), we know that profiling is disabled by PMSCR_EL1.ExSPE. For (2), we know that profiling is disabled by PMBSR_EL1.S. In both cases, we already have an isb() before enabling profiling again so I don't understand what this additional isb() is achieving. Will
On 08/09/2025 2:41 pm, Will Deacon wrote: > On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: >> DEN0154 states that PMBPTR_EL1 must not be modified while the profiling >> buffer is enabled. Ensure that enabling the buffer comes after setting >> PMBPTR_EL1 by inserting an isb(). >> >> This only applies to guests for now, but in future versions of the >> architecture the PE will be allowed to behave in the same way. >> >> Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") >> Signed-off-by: James Clark <james.clark@linaro.org> >> --- >> drivers/perf/arm_spe_pmu.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c >> index 3efed8839a4e..6235ca7ecd48 100644 >> --- a/drivers/perf/arm_spe_pmu.c >> +++ b/drivers/perf/arm_spe_pmu.c >> @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, >> limit += (u64)buf->base; >> base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); >> write_sysreg_s(base, SYS_PMBPTR_EL1); >> + isb(); > > > Hmm. > > arm_spe_perf_aux_output_begin() is only called in two places: > > 1. From arm_spe_pmu_start() > 2. From arm_spe_pmu_irq_handler() > > For (1), we know that profiling is disabled by PMSCR_EL1.ExSPE. > For (2), we know that profiling is disabled by PMBSR_EL1.S. > > In both cases, we already have an isb() before enabling profiling again > so I don't understand what this additional isb() is achieving. > > Will It's to prevent PMBPTR_EL1 from being written to after the PMBLIMITR_EL1 write than enables the buffer again. So you're right it's already disabled up to this point, which is why we didn't need to add another isb(). This change is only for the re-enabling bit. If the instructions were reordered you could get this ordering at the end of arm_spe_perf_aux_output_begin(): write_sysreg_s(limit, SYS_PMBLIMITR_EL1); // Enables buffer write_sysreg_s(base, SYS_PMBPTR_EL1); // Invalid write to PMBPTR Instead of the new version with the barrier where PMBPTR must come before: write_sysreg_s(base, SYS_PMBPTR_EL1); isb() write_sysreg_s(limit, SYS_PMBLIMITR_EL1); Thanks James
On Mon, Sep 08, 2025 at 02:54:32PM +0100, James Clark wrote: > > > On 08/09/2025 2:41 pm, Will Deacon wrote: > > On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: > > > DEN0154 states that PMBPTR_EL1 must not be modified while the profiling > > > buffer is enabled. Ensure that enabling the buffer comes after setting > > > PMBPTR_EL1 by inserting an isb(). > > > > > > This only applies to guests for now, but in future versions of the > > > architecture the PE will be allowed to behave in the same way. > > > > > > Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") > > > Signed-off-by: James Clark <james.clark@linaro.org> > > > --- > > > drivers/perf/arm_spe_pmu.c | 1 + > > > 1 file changed, 1 insertion(+) > > > > > > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c > > > index 3efed8839a4e..6235ca7ecd48 100644 > > > --- a/drivers/perf/arm_spe_pmu.c > > > +++ b/drivers/perf/arm_spe_pmu.c > > > @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, > > > limit += (u64)buf->base; > > > base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); > > > write_sysreg_s(base, SYS_PMBPTR_EL1); > > > + isb(); > > > > > > Hmm. > > > > arm_spe_perf_aux_output_begin() is only called in two places: > > > > 1. From arm_spe_pmu_start() > > 2. From arm_spe_pmu_irq_handler() > > > > For (1), we know that profiling is disabled by PMSCR_EL1.ExSPE. > > For (2), we know that profiling is disabled by PMBSR_EL1.S. > > > > In both cases, we already have an isb() before enabling profiling again > > so I don't understand what this additional isb() is achieving. > > > > It's to prevent PMBPTR_EL1 from being written to after the PMBLIMITR_EL1 > write than enables the buffer again. So you're right it's already disabled > up to this point, which is why we didn't need to add another isb(). This > change is only for the re-enabling bit. > > If the instructions were reordered you could get this ordering at the end of > arm_spe_perf_aux_output_begin(): > > write_sysreg_s(limit, SYS_PMBLIMITR_EL1); // Enables buffer > > write_sysreg_s(base, SYS_PMBPTR_EL1); // Invalid write to PMBPTR > > Instead of the new version with the barrier where PMBPTR must come before: > > write_sysreg_s(base, SYS_PMBPTR_EL1); > isb() > write_sysreg_s(limit, SYS_PMBLIMITR_EL1); ... but my point is that profiling is still disabled after writing to PMBLIMITR_EL1. Will
On 08/09/2025 2:57 pm, Will Deacon wrote: > On Mon, Sep 08, 2025 at 02:54:32PM +0100, James Clark wrote: >> >> >> On 08/09/2025 2:41 pm, Will Deacon wrote: >>> On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: >>>> DEN0154 states that PMBPTR_EL1 must not be modified while the profiling >>>> buffer is enabled. Ensure that enabling the buffer comes after setting >>>> PMBPTR_EL1 by inserting an isb(). >>>> >>>> This only applies to guests for now, but in future versions of the >>>> architecture the PE will be allowed to behave in the same way. >>>> >>>> Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") >>>> Signed-off-by: James Clark <james.clark@linaro.org> >>>> --- >>>> drivers/perf/arm_spe_pmu.c | 1 + >>>> 1 file changed, 1 insertion(+) >>>> >>>> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c >>>> index 3efed8839a4e..6235ca7ecd48 100644 >>>> --- a/drivers/perf/arm_spe_pmu.c >>>> +++ b/drivers/perf/arm_spe_pmu.c >>>> @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, >>>> limit += (u64)buf->base; >>>> base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); >>>> write_sysreg_s(base, SYS_PMBPTR_EL1); >>>> + isb(); >>> >>> >>> Hmm. >>> >>> arm_spe_perf_aux_output_begin() is only called in two places: >>> >>> 1. From arm_spe_pmu_start() >>> 2. From arm_spe_pmu_irq_handler() >>> >>> For (1), we know that profiling is disabled by PMSCR_EL1.ExSPE. >>> For (2), we know that profiling is disabled by PMBSR_EL1.S. >>> >>> In both cases, we already have an isb() before enabling profiling again >>> so I don't understand what this additional isb() is achieving. >>> >> >> It's to prevent PMBPTR_EL1 from being written to after the PMBLIMITR_EL1 >> write than enables the buffer again. So you're right it's already disabled >> up to this point, which is why we didn't need to add another isb(). This >> change is only for the re-enabling bit. >> >> If the instructions were reordered you could get this ordering at the end of >> arm_spe_perf_aux_output_begin(): >> >> write_sysreg_s(limit, SYS_PMBLIMITR_EL1); // Enables buffer >> >> write_sysreg_s(base, SYS_PMBPTR_EL1); // Invalid write to PMBPTR >> >> Instead of the new version with the barrier where PMBPTR must come before: >> >> write_sysreg_s(base, SYS_PMBPTR_EL1); >> isb() >> write_sysreg_s(limit, SYS_PMBLIMITR_EL1); > > ... but my point is that profiling is still disabled after writing to > PMBLIMITR_EL1. > > Will Oh I see what you mean, I misunderstood that. You might be right, but I'm looking at statement SFDXJJ and it only says "...PMBLIMITR_EL1.E is 0b1, meaning the Profiling Buffer is enabled...", so it's just the buffer rather than "profiling is enabled" which would require both bits PMBLIMITR_EL1.E = 1 and PMBSR_EL1.S = 0: SFDXJJ When PMBLIMITR_EL1.E is 0b1, meaning the Profiling Buffer is enabled, software must behave as if the PE can do all of the following: * Ignore writes to the Profiling Buffer controls, other than a write to PMBLIMITR_EL1.E that disables the Profiling Buffer. The Statistical Profiling Unit registers affected are: - PMBPTR_EL1. - PMBLIMITR_EL1. - PMBSR_EL1. - If FEAT_SPE_nVM is implemented, PMBMAR_EL1. I'm trying to read Alex's other reply to this patch with it in mind that profiling is still disabled, and it feels like your same point might apply. Even if it's incorrectly programmed according to the existing Arm ARM it doesn't matter if it's disabled. We did discuss internally about the difference between just the buffer being enabled or profiling altogether being enabled. Looks like I need to check again. James
Hi James, On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: > DEN0154 states that PMBPTR_EL1 must not be modified while the profiling > buffer is enabled. Ensure that enabling the buffer comes after setting > PMBPTR_EL1 by inserting an isb(). > > This only applies to guests for now, but in future versions of the > architecture the PE will be allowed to behave in the same way. > > Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") A note on why I think this is a fix. The write to PMBLIMITR_EL1 serves two purposes: to enable the buffer and to set the limit for the new buffer. The statistical profiling unit (I'm calling it SPU) performs indirect reads of the registers. Without the ISB between the buffer pointer write and the write that enables + sets limit for the buffer, I think it's possible for the SPU to observe the PMBLIMITR_EL1 write before the PMBPTR_EL1 write. During this time, from the point of view of the SPU, the buffer is incorrectly programmed, and potentially the old PMBPTR_EL1.PTR > new PMBLIMITR_EL1.Limit. arm_spe_perf_aux_output_begin() can be called after sampling has been enabled (PMSCR_EL1.E1SPE = 1). Putting it all together, this means that we have all the conditions to break the restrictions on the current write pointer and the behaviour is UNPREDICTABLE, as per section D17.7.1, ARM DDI 0487L. I think it's worth pointing out that the SPU can do any of the folling in this situation: writing to any writable virtual address in the current owning translation regime (!), generate a profiling management event, discard all data or don't enable profiling. Thanks, Alex > Signed-off-by: James Clark <james.clark@linaro.org> > --- > drivers/perf/arm_spe_pmu.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c > index 3efed8839a4e..6235ca7ecd48 100644 > --- a/drivers/perf/arm_spe_pmu.c > +++ b/drivers/perf/arm_spe_pmu.c > @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, > limit += (u64)buf->base; > base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); > write_sysreg_s(base, SYS_PMBPTR_EL1); > + isb(); > > out_write_limit: > write_sysreg_s(limit, SYS_PMBLIMITR_EL1); > > -- > 2.34.1 >
On 08/07/2025 3:40 pm, Alexandru Elisei wrote: > Hi James, > > On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: >> DEN0154 states that PMBPTR_EL1 must not be modified while the profiling >> buffer is enabled. Ensure that enabling the buffer comes after setting >> PMBPTR_EL1 by inserting an isb(). >> >> This only applies to guests for now, but in future versions of the >> architecture the PE will be allowed to behave in the same way. >> >> Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") > > A note on why I think this is a fix. > > The write to PMBLIMITR_EL1 serves two purposes: to enable the buffer and to set > the limit for the new buffer. The statistical profiling unit (I'm calling it > SPU) performs indirect reads of the registers. Without the ISB between the > buffer pointer write and the write that enables + sets limit for the buffer, I > think it's possible for the SPU to observe the PMBLIMITR_EL1 write before the > PMBPTR_EL1 write. During this time, from the point of view of the SPU, the > buffer is incorrectly programmed, and potentially the old PMBPTR_EL1.PTR > new > PMBLIMITR_EL1.Limit. > > arm_spe_perf_aux_output_begin() can be called after sampling has been enabled > (PMSCR_EL1.E1SPE = 1). > > Putting it all together, this means that we have all the conditions to break the > restrictions on the current write pointer and the behaviour is UNPREDICTABLE, as > per section D17.7.1, ARM DDI 0487L. I think it's worth pointing out that the SPU > can do any of the folling in this situation: writing to any writable virtual > address in the current owning translation regime (!), generate a profiling > management event, discard all data or don't enable profiling. D17.7.1, ARM DDI 0487L only states this is an issue at the point of enabling, not writing: "When profiling becomes enabled, all the following must be true..." I think Will has a point that it's not enabled at this point and it only gets enabled after some existing isb()s. This is only an issue if we're following DEN0154 which seems to be more strict. > > Thanks, > Alex > >> Signed-off-by: James Clark <james.clark@linaro.org> >> --- >> drivers/perf/arm_spe_pmu.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c >> index 3efed8839a4e..6235ca7ecd48 100644 >> --- a/drivers/perf/arm_spe_pmu.c >> +++ b/drivers/perf/arm_spe_pmu.c >> @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, >> limit += (u64)buf->base; >> base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); >> write_sysreg_s(base, SYS_PMBPTR_EL1); >> + isb(); >> >> out_write_limit: >> write_sysreg_s(limit, SYS_PMBLIMITR_EL1); >> >> -- >> 2.34.1 >>
On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: > DEN0154 states that PMBPTR_EL1 must not be modified while the profiling > buffer is enabled. Ensure that enabling the buffer comes after setting > PMBPTR_EL1 by inserting an isb(). > > This only applies to guests for now, but in future versions of the > architecture the PE will be allowed to behave in the same way. > > Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") > Signed-off-by: James Clark <james.clark@linaro.org> > --- > drivers/perf/arm_spe_pmu.c | 1 + > 1 file changed, 1 insertion(+) > > diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c > index 3efed8839a4e..6235ca7ecd48 100644 > --- a/drivers/perf/arm_spe_pmu.c > +++ b/drivers/perf/arm_spe_pmu.c > @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, > limit += (u64)buf->base; > base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); > write_sysreg_s(base, SYS_PMBPTR_EL1); > + isb(); I know that you and Alexandru have discussed whether the isb() should be placed here or after the out_write_limit label. I should have engaged in the discussion earlier. Sorry for raising the question now. My understanding is that isb() is not only for synchronizing the write to PMBPTR_EL1. It also serves as a context synchronization event between any other SPE register writes and the write to SYS_PMBLIMITR_EL1. Let me give an example (perhaps a rare one): if we use perf snapshot mode or the AUX pause/resume mode, it's possible that the flow does not trigger an interrupt via overflow. Instead, the sequence might look like this: arm_spe_pmu_stop() `> arm_spe_pmu_start() `> arm_spe_perf_aux_output_begin() In this case, to ensure that all SPE system registers are properly written to the hardware, the safest approach is to always execute isb() just before writing to SYS_PMBLIMITR_EL1. (In other words, after the label out_write_limit). Thanks, Leo > out_write_limit: > write_sysreg_s(limit, SYS_PMBLIMITR_EL1); > > -- > 2.34.1 > >
On 04/07/2025 3:04 pm, Leo Yan wrote: > On Tue, Jul 01, 2025 at 04:31:57PM +0100, James Clark wrote: >> DEN0154 states that PMBPTR_EL1 must not be modified while the profiling >> buffer is enabled. Ensure that enabling the buffer comes after setting >> PMBPTR_EL1 by inserting an isb(). >> >> This only applies to guests for now, but in future versions of the >> architecture the PE will be allowed to behave in the same way. >> >> Fixes: d5d9696b0380 ("drivers/perf: Add support for ARMv8.2 Statistical Profiling Extension") >> Signed-off-by: James Clark <james.clark@linaro.org> >> --- >> drivers/perf/arm_spe_pmu.c | 1 + >> 1 file changed, 1 insertion(+) >> >> diff --git a/drivers/perf/arm_spe_pmu.c b/drivers/perf/arm_spe_pmu.c >> index 3efed8839a4e..6235ca7ecd48 100644 >> --- a/drivers/perf/arm_spe_pmu.c >> +++ b/drivers/perf/arm_spe_pmu.c >> @@ -537,6 +537,7 @@ static void arm_spe_perf_aux_output_begin(struct perf_output_handle *handle, >> limit += (u64)buf->base; >> base = (u64)buf->base + PERF_IDX2OFF(handle->head, buf); >> write_sysreg_s(base, SYS_PMBPTR_EL1); >> + isb(); > > I know that you and Alexandru have discussed whether the isb() should > be placed here or after the out_write_limit label. I should have engaged > in the discussion earlier. Sorry for raising the question now. > > My understanding is that isb() is not only for synchronizing the write > to PMBPTR_EL1. It also serves as a context synchronization event > between any other SPE register writes and the write to > SYS_PMBLIMITR_EL1. > > Let me give an example (perhaps a rare one): if we use perf snapshot > mode or the AUX pause/resume mode, it's possible that the flow does > not trigger an interrupt via overflow. Instead, the sequence might > look like this: > > arm_spe_pmu_stop() > `> arm_spe_pmu_start() > `> arm_spe_perf_aux_output_begin() > > In this case, to ensure that all SPE system registers are properly > written to the hardware, the safest approach is to always execute isb() > just before writing to SYS_PMBLIMITR_EL1. (In other words, after the > label out_write_limit). > > Thanks, > Leo > I think the point is that any write that enables the buffer must come last, but not necessarily all writes. And not all paths in arm_spe_perf_aux_output_begin() enable it so the isb() was only added on the path that does. I couldn't see an issue with your example, are you saying arm_spe_pmu_stop() could call arm_spe_pmu_start()? It doesn't call it directly. Or do you mean the aux pause/resume thing can cause a arm_spe_pmu_start() from any point in time? If that was true then it doesn't matter where the isb() is because you can never be sure it will be before the write. James >> out_write_limit: >> write_sysreg_s(limit, SYS_PMBLIMITR_EL1); >> >> -- >> 2.34.1 >> >> >
© 2016 - 2025 Red Hat, Inc.