This series adds support for the CPU PMU in the older Apple A7-A11, T2
SoCs. These PMUs may have a different event layout, fewer counters, or
deliver their interrupts via IRQ instead of a FIQ. Since some of those
older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
be enabled by the driver where applicable.

Patch 1 adds the DT bindings.
Patches 2-7 prepare the driver to allow adding support for those
older SoCs.
Patches 8-12 add support for the older SoCs.
Patches 13-21 are the DT changes.

Signed-off-by: Nick Chan <towinchenmi@gmail.com>
---
Changes in v7:
- Fix a W=1 compile warning in apple_pmu_get_event_idx(), as apparently using GENMASK()
  in a function prototype causes a warning in GCC.
- Link to v6: https://lore.kernel.org/r/20250407-apple-cpmu-v6-0-ae8c2f225c1f@gmail.com

Changes in v6:
- Rebased on top of v6.15-rc1 (Conflict with FEAT_PMUv3 support for KVM on Apple Hardware)
- Add patch to skip initialization of PMUv3 remap in EL1 even though not strictly needed
- Include DT patches
- Link to v5: https://lore.kernel.org/r/20250228-apple-cpmu-v5-0-9e124cd28ed4@gmail.com

Changes in v5:
- Slightly change "drivers/perf: apple_m1: Add Apple A11 Support", to keep things in
  chronological order.
- Link to v4: https://lore.kernel.org/r/20250214-apple-cpmu-v4-0-ffca0e45147e@gmail.com

Changes in v4:
- Support per-implementation event attr group
- Fix Apple A7 event attr groups
- Link to v3: https://lore.kernel.org/r/20250213-apple-cpmu-v3-0-be7f8aded81f@gmail.com

Changes in v3:
- Configure PMC8 and PMC9 for 32-bit EL0
- Remove redundant _common suffix from shared functions
- Link to v2: https://lore.kernel.org/r/20250213-apple-cpmu-v2-0-87b361932e88@gmail.com

Changes in v2:
- Remove unused flags parameter from apple_pmu_init_common()
- Link to v1: https://lore.kernel.org/r/20250212-apple-cpmu-v1-0-f8c7f2ac1743@gmail.com

---
Nick Chan (21):
      dt-bindings: arm: pmu: Add Apple A7-A11 SoC CPU PMU compatibles
      drivers/perf: apple_m1: Only init PMUv3 remap when EL2 is available
      drivers/perf: apple_m1: Support per-implementation event tables
      drivers/perf: apple_m1: Support a per-implementation number of counters
      drivers/perf: apple_m1: Support configuring counters for 32-bit EL0
      drivers/perf: apple_m1: Support per-implementation PMU startup
      drivers/perf: apple_m1: Support per-implementation event attr group
      drivers/perf: apple_m1: Add Apple A7 support
      drivers/perf: apple_m1: Add Apple A8/A8X support
      drivers/perf: apple_m1: Add A9/A9X support
      drivers/perf: apple_m1: Add Apple A10/A10X/T2 Support
      drivers/perf: apple_m1: Add Apple A11 Support
      arm64: dts: apple: s5l8960x: Add CPU PMU nodes
      arm64: dts: apple: t7000: Add CPU PMU nodes
      arm64: dts: apple: t7001: Add CPU PMU nodes
      arm64: dts: apple: s800-0-3: Add CPU PMU nodes
      arm64: dts: apple: s8001: Add CPU PMU nodes
      arm64: dts: apple: t8010: Add CPU PMU nodes
      arm64: dts: apple: t8011: Add CPU PMU nodes
      arm64: dts: apple: t8012: Add CPU PMU nodes
      arm64: dts: apple: t8015: Add CPU PMU nodes

 Documentation/devicetree/bindings/arm/pmu.yaml |   6 +
 arch/arm64/boot/dts/apple/s5l8960x.dtsi        |   8 +
 arch/arm64/boot/dts/apple/s800-0-3.dtsi        |   8 +
 arch/arm64/boot/dts/apple/s8001.dtsi           |   8 +
 arch/arm64/boot/dts/apple/t7000.dtsi           |   8 +
 arch/arm64/boot/dts/apple/t7001.dtsi           |   9 +
 arch/arm64/boot/dts/apple/t8010.dtsi           |   8 +
 arch/arm64/boot/dts/apple/t8011.dtsi           |   9 +
 arch/arm64/boot/dts/apple/t8012.dtsi           |   8 +
 arch/arm64/boot/dts/apple/t8015.dtsi           |  24 +
 arch/arm64/include/asm/apple_m1_pmu.h          |   3 +
 drivers/perf/apple_m1_cpu_pmu.c                | 807 +++++++++++++++++++++++--
 12 files changed, 871 insertions(+), 35 deletions(-)
---
base-commit: 0af2f6be1b4281385b618cb86ad946eded089ac8
change-id: 20250211-apple-cpmu-5a5a3da39483

Best regards,
--
Nick Chan <towinchenmi@gmail.com>

On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
> This series adds support for the CPU PMU in the older Apple A7-A11, T2
> SoCs. These PMUs may have a different event layout, fewer counters, or
> deliver their interrupts via IRQ instead of a FIQ. Since some of those
> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
> be enabled by the driver where applicable.
>
> Patch 1 adds the DT bindings.
> Patches 2-7 prepare the driver to allow adding support for those
> older SoCs.

Modulo my nits, the patches look alright to this point...

> Patches 8-12 add support for the older SoCs.

... but I'm not sure if anybody actually cares about these older SoCs
and, even if they do, what the state of the rest of Linux is on those
parts. I recall horror stories about the OS being quietly migrated
between CPUs with incompatible features, at which point I think we have
to question whether we actually care about supporting this hardware.

On the other hand, if it all works swimmingly and it's just the PMU
driver that needs updating, then I could get on board with it.

Will

On 2025/7/14 at 11:12 PM, Will Deacon wrote:
> On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
>> This series adds support for the CPU PMU in the older Apple A7-A11, T2
>> SoCs. These PMUs may have a different event layout, fewer counters, or
>> deliver their interrupts via IRQ instead of a FIQ. Since some of those
>> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
>> be enabled by the driver where applicable.
>>
>> Patch 1 adds the DT bindings.
>> Patches 2-7 prepare the driver to allow adding support for those
>> older SoCs.
> Modulo my nits, the patches look alright to this point...
>
>> Patches 8-12 add support for the older SoCs.
> ... but I'm not sure if anybody actually cares about these older SoCs
> and, even if they do, what the state of the rest of Linux is on those
> parts. I recall horror stories about the OS being quietly migrated
> between CPUs with incompatible features, at which point I think we have
> to question whether we actually care about supporting this hardware.

The "horror" story you mentioned is about Apple A10/A10X/T2, which
has a big.LITTLE switcher integrated into the cpufreq block, so when the
cpufreq driver switches between states in the same way as on other
SoCs, on these SoCs that would silently cause a CPU migration. There
is only one incompatible feature that I am aware of, which is 32-bit EL0
support. However, since the CPUs in these SoCs do not support
4K pages anyway, in practice this is not an issue as long as
CONFIG_EXPERT is disabled.

>
> On the other hand, if it all works swimmingly and it's just the PMU
> driver that needs updating, then I could get on board with it.

As mentioned above, it does all work fine when CONFIG_EXPERT is not
enabled, and if it is enabled, then a 32-bit process may crash with an
illegal instruction but everything else will still work fine.

>
> Will
>

Nick Chan

On Mon, Jul 14, 2025 at 11:59:36PM +0800, Nick Chan wrote:
>
> On 2025/7/14 at 11:12 PM, Will Deacon wrote:
> > On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
> >> This series adds support for the CPU PMU in the older Apple A7-A11, T2
> >> SoCs. These PMUs may have a different event layout, fewer counters, or
> >> deliver their interrupts via IRQ instead of a FIQ. Since some of those
> >> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
> >> be enabled by the driver where applicable.
> >>
> >> Patch 1 adds the DT bindings.
> >> Patches 2-7 prepare the driver to allow adding support for those
> >> older SoCs.
> > Modulo my nits, the patches look alright to this point...
> >
> >> Patches 8-12 add support for the older SoCs.
> > ... but I'm not sure if anybody actually cares about these older SoCs
> > and, even if they do, what the state of the rest of Linux is on those
> > parts. I recall horror stories about the OS being quietly migrated
> > between CPUs with incompatible features, at which point I think we have
> > to question whether we actually care about supporting this hardware.
> The "horror" story you mentioned is about Apple A10/A10X/T2, which
> has a big.LITTLE switcher integrated into the cpufreq block, so when the
> cpufreq driver switches between states in the same way as on other
> SoCs, on these SoCs that would silently cause a CPU migration. There
> is only one incompatible feature that I am aware of, which is 32-bit EL0
> support.

Surely the MIDR/REVIDR/AIDR also change?

In general, silent migration isn't acceptable for the kernel, even if
you largely happen to get away with that today. It is not acceptable for
architectural feature support to change dynamically.

> However, since the CPUs in these SoCs do not support
> 4K pages anyway, in practice this is not an issue as long as
> CONFIG_EXPERT is disabled.

Do these parts have EL2?

> > On the other hand, if it all works swimmingly and it's just the PMU
> > driver that needs updating, then I could get on board with it.
>
> As mentioned above, it does all work fine when CONFIG_EXPERT is not
> enabled, and if it is enabled, then a 32-bit process may crash with an
> illegal instruction but everything else will still work fine.

I don't think that's quite true, unless these parts are also violating
the architecture.

If the CPU doesn't implement AArch32, then an ERET to AArch32 is
illegal. The way illegal exception returns are handled means that this
will result in a (fatal) illegal execution state exception being taken
from the exception return code in the kernel, not an UNDEF being taken
from userspace that would result in a SIGILL.

I do not think that we should pretend to support hardware with silent
microarchitectural migration. So at the very least, we do not care about
A10/A10X/T2.

Mark.

On 17/7/2025 23:05, Mark Rutland wrote:
> On Mon, Jul 14, 2025 at 11:59:36PM +0800, Nick Chan wrote:
>>
>> On 2025/7/14 at 11:12 PM, Will Deacon wrote:
>>> On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
>>>> This series adds support for the CPU PMU in the older Apple A7-A11, T2
>>>> SoCs. These PMUs may have a different event layout, fewer counters, or
>>>> deliver their interrupts via IRQ instead of a FIQ. Since some of those
>>>> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
>>>> be enabled by the driver where applicable.
>>>>
>>>> Patch 1 adds the DT bindings.
>>>> Patches 2-7 prepare the driver to allow adding support for those
>>>> older SoCs.
>>> Modulo my nits, the patches look alright to this point...
>>>
>>>> Patches 8-12 add support for the older SoCs.
>>> ... but I'm not sure if anybody actually cares about these older SoCs
>>> and, even if they do, what the state of the rest of Linux is on those
>>> parts. I recall horror stories about the OS being quietly migrated
>>> between CPUs with incompatible features, at which point I think we have
>>> to question whether we actually care about supporting this hardware.
>> The "horror" story you mentioned is about Apple A10/A10X/T2, which
>> has a big.LITTLE switcher integrated into the cpufreq block, so when the
>> cpufreq driver switches between states in the same way as on other
>> SoCs, on these SoCs that would silently cause a CPU migration. There
>> is only one incompatible feature that I am aware of, which is 32-bit EL0
>> support.
>
> Surely the MIDR/REVIDR/AIDR also change?

They do not change. ID_AA64PFR0_EL1 also does not change (fixed 0x12).
What *does* change, however, is MPIDR. (P-cores have bit 16 set while
E-cores do not.)

>
> In general, silent migration isn't acceptable for the kernel, even if
> you largely happen to get away with that today. It is not acceptable for
> architectural feature support to change dynamically.
>
>> However, since the CPUs in these SoCs do not support
>> 4K pages anyway, in practice this is not an issue as long as
>> CONFIG_EXPERT is disabled.
>
> Do these parts have EL2?

No.

>
>>> On the other hand, if it all works swimmingly and it's just the PMU
>>> driver that needs updating, then I could get on board with it.
>>
>> As mentioned above, it does all work fine when CONFIG_EXPERT is not
>> enabled, and if it is enabled, then a 32-bit process may crash with an
>> illegal instruction but everything else will still work fine.
>
> I don't think that's quite true, unless these parts are also violating
> the architecture.
>
> If the CPU doesn't implement AArch32, then an ERET to AArch32 is
> illegal. The way illegal exception returns are handled means that this
> will result in a (fatal) illegal execution state exception being taken
> from the exception return code in the kernel, not an UNDEF being taken
> from userspace that would result in a SIGILL.

Speaking from experience, when testing with the userspace cpufreq governor,
trying to run AArch32 code on the E-cores really does result in an illegal
instruction for that process while everything else remains fine.

Referencing ID_AA64PFR0_EL1, the E-cores do claim to support
AArch32 EL0, even though they cannot execute it for real.

>
> I do not think that we should pretend to support hardware with silent
> microarchitectural migration. So at the very least, we do not care about
> A10/A10X/T2.

As explained above, what actually happens on the hardware is different
from what you believed, so please do reconsider.

>
> Mark.

Nick Chan

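For readers following the register values in the message above: the low 16 bits of ID_AA64PFR0_EL1 hold one 4-bit field per exception level (EL0 in bits [3:0] up to EL3 in bits [15:12]). The sketch below decodes such a value, assuming that the "0x12" quoted in the thread refers to those low bits. The helper and enum names are made up for illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Architectural encodings of the ELx fields in ID_AA64PFR0_EL1. */
enum el_support {
	EL_NOT_IMPLEMENTED = 0x0,
	EL_AARCH64_ONLY    = 0x1,
	EL_AARCH64_AARCH32 = 0x2,
};

/* Extract the 4-bit field for exception level `el` (0..3). Hypothetical helper. */
static inline unsigned int pfr0_el_field(uint64_t pfr0, unsigned int el)
{
	assert(el <= 3);
	return (unsigned int)((pfr0 >> (4 * el)) & 0xf);
}

/* Map a field value to a human-readable description. */
static inline const char *el_support_name(unsigned int field)
{
	switch (field) {
	case EL_NOT_IMPLEMENTED: return "not implemented";
	case EL_AARCH64_ONLY:    return "AArch64 only";
	case EL_AARCH64_AARCH32: return "AArch64 and AArch32";
	default:                 return "reserved";
	}
}

/* Print what each EL field of the given register value advertises. */
static inline void print_pfr0_els(uint64_t pfr0)
{
	for (unsigned int el = 0; el <= 3; el++)
		printf("EL%u: %s\n", el, el_support_name(pfr0_el_field(pfr0, el)));
}
```

Under that assumption, calling `print_pfr0_els(0x12)` reports EL0 as AArch64 and AArch32, EL1 as AArch64 only, and EL2/EL3 as not implemented, which is consistent with the "no EL2" answer and with the E-cores advertising AArch32 at EL0.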
On Fri, Jul 18, 2025 at 01:00:45AM +0800, Nick Chan wrote:
> On 17/7/2025 23:05, Mark Rutland wrote:
> > On Mon, Jul 14, 2025 at 11:59:36PM +0800, Nick Chan wrote:
> >> On 2025/7/14 at 11:12 PM, Will Deacon wrote:
> >>> On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
> >>>> Patches 8-12 add support for the older SoCs.
> >>> ... but I'm not sure if anybody actually cares about these older SoCs
> >>> and, even if they do, what the state of the rest of Linux is on those
> >>> parts. I recall horror stories about the OS being quietly migrated
> >>> between CPUs with incompatible features, at which point I think we have
> >>> to question whether we actually care about supporting this hardware.
> >> The "horror" story you mentioned is about Apple A10/A10X/T2, which
> >> has a big.LITTLE switcher integrated into the cpufreq block, so when the
> >> cpufreq driver switches between states in the same way as on other
> >> SoCs, on these SoCs that would silently cause a CPU migration. There
> >> is only one incompatible feature that I am aware of, which is 32-bit EL0
> >> support.
> >
> > Surely the MIDR/REVIDR/AIDR also change?
> They do not change. ID_AA64PFR0_EL1 also does not change (fixed 0x12).
> What *does* change, however, is MPIDR. (P-cores have bit 16 set while
> E-cores do not.)

The MPIDR changing isn't ok either. You might get away with that today,
but that's not supposed to change behind the back of the kernel.

Is there anything else that can change, or are we absolutely certain
that *only* MPIDR changes?

> >> As mentioned above, it does all work fine when CONFIG_EXPERT is not
> >> enabled, and if it is enabled, then a 32-bit process may crash with an
> >> illegal instruction but everything else will still work fine.
> >
> > I don't think that's quite true, unless these parts are also violating
> > the architecture.
> >
> > If the CPU doesn't implement AArch32, then an ERET to AArch32 is
> > illegal. The way illegal exception returns are handled means that this
> > will result in a (fatal) illegal execution state exception being taken
> > from the exception return code in the kernel, not an UNDEF being taken
> > from userspace that would result in a SIGILL.
> Speaking from experience, when testing with the userspace cpufreq governor,
> trying to run AArch32 code on the E-cores really does result in an illegal
> instruction for that process while everything else remains fine.
>
> Referencing ID_AA64PFR0_EL1, the E-cores do claim to support
> AArch32 EL0, even though they cannot execute it for real.

Ok, so that's a clear violation of the architecture, and doesn't fill me
with confidence about anything else.

> > I do not think that we should pretend to support hardware with silent
> > microarchitectural migration. So at the very least, we do not care about
> > A10/A10X/T2.
> As explained above, what actually happens on the hardware is different
> from what you believed, so please do reconsider.

Different certainly, but still problematic.

I maintain that we should not pretend to support this hardware.

Mark.

On 2025/7/18 at 11:01 PM, Mark Rutland wrote:
> On Fri, Jul 18, 2025 at 01:00:45AM +0800, Nick Chan wrote:
>> On 17/7/2025 23:05, Mark Rutland wrote:
>>> On Mon, Jul 14, 2025 at 11:59:36PM +0800, Nick Chan wrote:
>>>> On 2025/7/14 at 11:12 PM, Will Deacon wrote:
>>>>> On Mon, Jun 16, 2025 at 09:31:49AM +0800, Nick Chan wrote:
>>>>>> Patches 8-12 add support for the older SoCs.
>>>>> ... but I'm not sure if anybody actually cares about these older SoCs
>>>>> and, even if they do, what the state of the rest of Linux is on those
>>>>> parts. I recall horror stories about the OS being quietly migrated
>>>>> between CPUs with incompatible features, at which point I think we have
>>>>> to question whether we actually care about supporting this hardware.
>>>> The "horror" story you mentioned is about Apple A10/A10X/T2, which
>>>> has a big.LITTLE switcher integrated into the cpufreq block, so when the
>>>> cpufreq driver switches between states in the same way as on other
>>>> SoCs, on these SoCs that would silently cause a CPU migration. There
>>>> is only one incompatible feature that I am aware of, which is 32-bit EL0
>>>> support.
>>> Surely the MIDR/REVIDR/AIDR also change?
>> They do not change. ID_AA64PFR0_EL1 also does not change (fixed 0x12).
>> What *does* change, however, is MPIDR. (P-cores have bit 16 set while
>> E-cores do not.)
> The MPIDR changing isn't ok either. You might get away with that today,
> but that's not supposed to change behind the back of the kernel.
>
> Is there anything else that can change, or are we absolutely certain
> that *only* MPIDR changes?

Only MPIDR changes, and the state of bit 16 in MPIDR is consistent across
all PEs. (At any given moment, either all PEs are backed by efficiency
cores, or all are backed by performance cores.)

>
>>>> As mentioned above, it does all work fine when CONFIG_EXPERT is not
>>>> enabled, and if it is enabled, then a 32-bit process may crash with an
>>>> illegal instruction but everything else will still work fine.
>>> I don't think that's quite true, unless these parts are also violating
>>> the architecture.
>>>
>>> If the CPU doesn't implement AArch32, then an ERET to AArch32 is
>>> illegal. The way illegal exception returns are handled means that this
>>> will result in a (fatal) illegal execution state exception being taken
>>> from the exception return code in the kernel, not an UNDEF being taken
>>> from userspace that would result in a SIGILL.
>> Speaking from experience, when testing with the userspace cpufreq governor,
>> trying to run AArch32 code on the E-cores really does result in an illegal
>> instruction for that process while everything else remains fine.
>>
>> Referencing ID_AA64PFR0_EL1, the E-cores do claim to support
>> AArch32 EL0, even though they cannot execute it for real.
> Ok, so that's a clear violation of the architecture, and doesn't fill me
> with confidence about anything else.

Regarding this, the hardware also needs to handle the case where the PE is
already in AArch32 EL0 and a migration to the E-cores is attempted. In this
case there is no exception return happening, so the behavior of the hardware
is not as bad as it sounds.

>
>>> I do not think that we should pretend to support hardware with silent
>>> microarchitectural migration. So at the very least, we do not care about
>>> A10/A10X/T2.
>> As explained above, what actually happens on the hardware is different
>> from what you believed, so please do reconsider.
> Different certainly, but still problematic.
>
> I maintain that we should not pretend to support this hardware.
>
> Mark.
>

Nick Chan

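As an illustration of the MPIDR behaviour described in the message above, the following sketch tests bit 16 of a sampled MPIDR_EL1 value (the lowest bit of the Aff2 field, bits [23:16]) and checks whether a set of samples agree with each other. It is only a userspace model of the description in this thread: the function names are hypothetical, and the way the samples would be obtained is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Per the thread: set on P-cores, clear on E-cores (A10/A10X/T2). */
#define MPIDR_PCORE_BIT (UINT64_C(1) << 16)

/* True if the sampled MPIDR value has the P-core bit set. */
static inline bool mpidr_is_pcore(uint64_t mpidr)
{
	return (mpidr & MPIDR_PCORE_BIT) != 0;
}

/*
 * True if two samples taken on the same logical CPU disagree on the
 * P-core bit, i.e. the backing core type changed between the samples.
 */
static inline bool cluster_switched(uint64_t before, uint64_t after)
{
	return mpidr_is_pcore(before) != mpidr_is_pcore(after);
}

/* True if every sample in the array reports the same core type. */
static inline bool all_pes_same_cluster(const uint64_t *mpidr, size_t n)
{
	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (mpidr_is_pcore(mpidr[i]) != mpidr_is_pcore(mpidr[0]))
			return false;
	return true;
}
```

If the hardware behaves as described, `all_pes_same_cluster()` would hold for samples taken at the same moment, while `cluster_switched()` could become true for one logical CPU across a cpufreq state change.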
On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
>
> This series adds support for the CPU PMU in the older Apple A7-A11, T2
> SoCs. These PMUs may have a different event layout, fewer counters, or
> deliver their interrupts via IRQ instead of a FIQ. Since some of those
> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
> be enabled by the driver where applicable.
>
> Patch 1 adds the DT bindings.
> Patches 2-7 prepare the driver to allow adding support for those
> older SoCs.
> Patches 8-12 add support for the older SoCs.
> Patches 13-21 are the DT changes.
>
> Signed-off-by: Nick Chan <towinchenmi@gmail.com>

Hi Nick,

This is substantial work and it looks good to me. Do you know why
there's been little progress on landing these patches? Buggy Apple ARM
PMU support in the kernel has led to reworking the perf tool. It seems
best that we can have the best drivers possible.

Thanks,
Ian

On 2025/6/16 at 5:36 PM, Ian Rogers wrote:
> On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
>> This series adds support for the CPU PMU in the older Apple A7-A11, T2
>> SoCs. These PMUs may have a different event layout, fewer counters, or
>> deliver their interrupts via IRQ instead of a FIQ. Since some of those
>> older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
>> be enabled by the driver where applicable.
>>
>> Patch 1 adds the DT bindings.
>> Patches 2-7 prepare the driver to allow adding support for those
>> older SoCs.
>> Patches 8-12 add support for the older SoCs.
>> Patches 13-21 are the DT changes.
>>
>> Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> Hi Nick,
>
> This is substantial work and it looks good to me. Do you know why
> there's been little progress on landing these patches? Buggy Apple ARM
> PMU support in the kernel has led to reworking the perf tool. It seems
> best that we can have the best drivers possible.

I have no idea why the patches are taking so long. As for the buggy part,
I think the only notable bug has been the M2's performance counter length
increase from 48 to 64 bits (which for Linux's purposes is from 47 to 63)
being overlooked[1], and I don't think there have been regressions.

It is not so much bugs, but rather a lack of features. For the longest time
we knew almost nothing about the PMU events, and it was not until someone
managed to extract the event names from macOS and Apple's Apple Silicon CPU
Optimization Guide[2] that we came to know quite a bit more about the PMU.
The event names were then added once it was determined that this is okay
from a copyright perspective[3] (it's the same as being allowed to use
register names from the proprietary ARM ARM). As for the descriptions, the
guide does have them, but descriptions are more doubtful than names from a
copyright perspective, so I do not know if they could ever be added to the
userspace perf tool.

[1]: https://lore.kernel.org/all/20230528080205.288446-1-maz@kernel.org/
[2]: https://github.com/cyyself/m1-pmu-gen
[3]: https://lore.kernel.org/all/tencent_C5DA658E64B8D13125210C8D707CD8823F08@qq.com/

Best regards,
Nick Chan

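The "48 to 64, which for Linux's purposes is 47 to 63" remark in the message above can be made concrete with a little arithmetic. The sketch below assumes, as that remark suggests, that the top bit of each hardware counter is set aside to signal overflow, so one bit fewer is available for counting. The function name is hypothetical and the code is not taken from the driver.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Mask of the bits usable for counting when the top bit of a
 * `width`-bit counter is reserved as an overflow indicator
 * (an assumption made for this illustration).
 */
static inline uint64_t usable_counter_mask(unsigned int width)
{
	assert(width >= 2 && width <= 64);
	return (UINT64_C(1) << (width - 1)) - 1;
}
```

With this model a 48-bit counter leaves 47 usable bits (mask 0x7fffffffffff) and a 64-bit counter leaves 63 (mask 0x7fffffffffffffff). Treating a 64-bit counter as if it were 48 bits wide would apply the narrower mask to a wider counter, which is the kind of mismatch the overlooked change refers to.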
On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> >
> > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > SoCs. These PMUs may have a different event layout, fewer counters, or
> > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
> > be enabled by the driver where applicable.
> >
> > Patch 1 adds the DT bindings.
> > Patches 2-7 prepare the driver to allow adding support for those
> > older SoCs.
> > Patches 8-12 add support for the older SoCs.
> > Patches 13-21 are the DT changes.
> >
> > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
>
> Hi Nick,
>
> This is substantial work and it looks good to me. Do you know why
> there's been little progress on landing these patches? Buggy Apple ARM
> PMU support in the kernel has led to reworking the perf tool. It seems
> best that we can have the best drivers possible.

You reworked the perf tool to support these things? Why? These changes
are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
I think that (a) most people don't particularly care about them and (b)
they're not fully supported _anyway_ because of crazy stuff like [1].

Will

[1] https://lore.kernel.org/r/20240909091425.16258-1-towinchenmi@gmail.com

On Mon, Jun 16, 2025 at 3:29 AM Will Deacon <will@kernel.org> wrote:
>
> On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> > On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> > >
> > > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > > SoCs. These PMUs may have a different event layout, fewer counters, or
> > > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > > older SoCs support 32-bit EL0, counting for 32-bit EL0 also needs to
> > > be enabled by the driver where applicable.
> > >
> > > Patch 1 adds the DT bindings.
> > > Patches 2-7 prepare the driver to allow adding support for those
> > > older SoCs.
> > > Patches 8-12 add support for the older SoCs.
> > > Patches 13-21 are the DT changes.
> > >
> > > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> >
> > Hi Nick,
> >
> > This is substantial work and it looks good to me. Do you know why
> > there's been little progress on landing these patches? Buggy Apple ARM
> > PMU support in the kernel has led to reworking the perf tool. It seems
> > best that we can have the best drivers possible.
>
> You reworked the perf tool to support these things? Why? These changes
> are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
> I think that (a) most people don't particularly care about them and (b)
> they're not fully supported _anyway_ because of crazy stuff like [1].

I was meaning that we reworked the perf tool to work around the Apple
ARM PMU driver expecting to work as if it were an uncore rather than a
core PMU driver. More context here:
"[REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5"
https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
But in general it would be nice if Apple ARM PMU support were well loved. I
think we went 2 or 3 minor releases with the perf tool not working,
threats of substantial reverts to avoid the PMU driver bug being
exposed, etc. As for which Apple ARM devices should have perf support,
it seems the more the merrier.

Thanks,
Ian

> Will
>
> [1] https://lore.kernel.org/r/20240909091425.16258-1-towinchenmi@gmail.com

On Mon, Jun 16, 2025 at 03:44:49AM -0700, Ian Rogers wrote:
> On Mon, Jun 16, 2025 at 3:29 AM Will Deacon <will@kernel.org> wrote:
> >
> > On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> > > On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> > > >
> > > > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > > > SoCs. These PMUs may have a different event layout, less counters, or
> > > > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > > > older SoCs support 32-bit EL0, counting for 32-bit EL0 also need to
> > > > be enabled by the driver where applicable.
> > > >
> > > > Patch 1 adds the DT bindings.
> > > > Patch 2-7 prepares the driver to allow adding support for those
> > > > older SoCs.
> > > > Patch 8-12 adds support for the older SoCs.
> > > > Patch 13-21 are the DT changes.
> > > >
> > > > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> > >
> > > Hi Nick,
> > >
> > > This is substantial work and it looks good to me. Do you know why
> > > there's been little progress on landing these patches? Buggy Apple ARM
> > > PMU support in the kernel has led to reworking the perf tool. It seems
> > > best that we can have the best drivers possible.
> >
> > You reworked the perf tool to support these things? Why? These changes
> > are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
> > I think that (a) most people don't particularly care about them and (b)
> > they're not fully supported _anyway_ because of crazy stuff like [1].
>
> I was meaning that we reworked the perf tool to work around the Apple
> ARM PMU driver expecting to work as if it were an uncore rather than a
> core PMU driver. More context here:
> "[REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5"
> https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
> But in general it would be nice if Apple ARM PMU support were well loved.
> I think we went 2 or 3 minor releases with the perf tool not working,
> threats of substantial reverts to avoid the PMU driver bug being
> exposed, etc.

It's unfortunate that you've had a torrid time with the Apple PMU driver,
but I think it's important to realise that it's both unmaintained (it
ends up with me via the catch-all for drivers/perf/) and was written
based off whatever reverse-engineering people could be bothered to do in
their spare time. It's frankly remarkable that it works as well as it
does.

Despite all of that, I still don't think that your concerns apply to the
patches in _this_ series, which is about adding support for older Apple
chips.

> As for which Apple ARM devices should have perf support, it seems the
> more the merrier.

Easy to say when you don't have to maintain the driver!

Will
On Tue, 17 Jun 2025 15:16:50 +0100,
Will Deacon <will@kernel.org> wrote:
>
> On Mon, Jun 16, 2025 at 03:44:49AM -0700, Ian Rogers wrote:
> > On Mon, Jun 16, 2025 at 3:29 AM Will Deacon <will@kernel.org> wrote:
> > >
> > > On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> > > > On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> > > > >
> > > > > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > > > > SoCs. These PMUs may have a different event layout, less counters, or
> > > > > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > > > > older SoCs support 32-bit EL0, counting for 32-bit EL0 also need to
> > > > > be enabled by the driver where applicable.
> > > > >
> > > > > Patch 1 adds the DT bindings.
> > > > > Patch 2-7 prepares the driver to allow adding support for those
> > > > > older SoCs.
> > > > > Patch 8-12 adds support for the older SoCs.
> > > > > Patch 13-21 are the DT changes.
> > > > >
> > > > > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> > > >
> > > > Hi Nick,
> > > >
> > > > This is substantial work and it looks good to me. Do you know why
> > > > there's been little progress on landing these patches? Buggy Apple ARM
> > > > PMU support in the kernel has led to reworking the perf tool. It seems
> > > > best that we can have the best drivers possible.
> > >
> > > You reworked the perf tool to support these things? Why? These changes
> > > are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
> > > I think that (a) most people don't particularly care about them and (b)
> > > they're not fully supported _anyway_ because of crazy stuff like [1].
> >
> > I was meaning that we reworked the perf tool to work around the Apple
> > ARM PMU driver expecting to work as if it were an uncore rather than a
> > core PMU driver. More context here:
> > "[REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5"
> > https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
> > But in general it would be nice if Apple ARM PMU support were well loved.
> > I think we went 2 or 3 minor releases with the perf tool not working,
> > threats of substantial reverts to avoid the PMU driver bug being
> > exposed, etc.
>
> It's unfortunate that you've had a torrid time with the Apple PMU driver,
> but I think it's important to realise that it's both unmaintained (it
> ends up with me via the catch-all for drivers/perf/) and was written
> based off whatever reverse-engineering people could be bothered to do in
> their spare time. It's frankly remarkable that it works as well as it
> does.

Also, the "broken" driver actually works as expected. Ian blames the
userspace breakage on that driver, but that's only because the way we
deal with PMUs on ARM doesn't match his mental model. Oh well.

	M.

-- 
Without deviation from the norm, progress is not possible.
On Tue, Jun 17, 2025 at 9:47 AM Marc Zyngier <maz@kernel.org> wrote:
>
> On Tue, 17 Jun 2025 15:16:50 +0100,
> Will Deacon <will@kernel.org> wrote:
> >
> > On Mon, Jun 16, 2025 at 03:44:49AM -0700, Ian Rogers wrote:
> > > On Mon, Jun 16, 2025 at 3:29 AM Will Deacon <will@kernel.org> wrote:
> > > >
> > > > On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> > > > > On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> > > > > >
> > > > > > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > > > > > SoCs. These PMUs may have a different event layout, less counters, or
> > > > > > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > > > > > older SoCs support 32-bit EL0, counting for 32-bit EL0 also need to
> > > > > > be enabled by the driver where applicable.
> > > > > >
> > > > > > Patch 1 adds the DT bindings.
> > > > > > Patch 2-7 prepares the driver to allow adding support for those
> > > > > > older SoCs.
> > > > > > Patch 8-12 adds support for the older SoCs.
> > > > > > Patch 13-21 are the DT changes.
> > > > > >
> > > > > > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> > > > >
> > > > > Hi Nick,
> > > > >
> > > > > This is substantial work and it looks good to me. Do you know why
> > > > > there's been little progress on landing these patches? Buggy Apple ARM
> > > > > PMU support in the kernel has led to reworking the perf tool. It seems
> > > > > best that we can have the best drivers possible.
> > > >
> > > > You reworked the perf tool to support these things? Why? These changes
> > > > are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
> > > > I think that (a) most people don't particularly care about them and (b)
> > > > they're not fully supported _anyway_ because of crazy stuff like [1].
> > >
> > > I was meaning that we reworked the perf tool to work around the Apple
> > > ARM PMU driver expecting to work as if it were an uncore rather than a
> > > core PMU driver. More context here:
> > > "[REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5"
> > > https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
> > > But in general it would be nice if Apple ARM PMU support were well loved.
> > > I think we went 2 or 3 minor releases with the perf tool not working,
> > > threats of substantial reverts to avoid the PMU driver bug being
> > > exposed, etc.
> >
> > It's unfortunate that you've had a torrid time with the Apple PMU driver,
> > but I think it's important to realise that it's both unmaintained (it
> > ends up with me via the catch-all for drivers/perf/) and was written
> > based off whatever reverse-engineering people could be bothered to do in
> > their spare time. It's frankly remarkable that it works as well as it
> > does.
>
> Also, the "broken" driver actually works as expected. Ian blames the
> userspace breakage on that driver, but that's only because the way we
> deal with PMUs on ARM doesn't match his mental model. Oh well.

I'm not sure what this is in reference to or what you mean by my
mental model. The linked patch was that legacy events didn't support
the extended type bits added by Intel for hybrid. Prior to this,
legacy events were broken on ARM big.LITTLE PMUs and would select an
arbitrary PMU - not good by anybody's mental model. I'm happy to chat
about whatever issues you think I'm creating.

I think you are making reference to situations where I've cleaned up a
mess with Intel hybrid and then cleaned up a mess on ARM. I continue
to try to clean up a mess for RISC-V. Sorry this makes you think I'm a
bad guy.

Ian

>         M.
>
> --
> Without deviation from the norm, progress is not possible.
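For readers following the thread, here is a minimal sketch of the
"extended type bits" idea mentioned above. It is not the perf tool's
actual code. The assumption is that for legacy PERF_TYPE_HARDWARE
events the low 32 bits of the event config carry the generic event ID
and the high 32 bits can carry the type of the core PMU to count on,
so that on a heterogeneous (big.LITTLE) system the event is tied to a
named PMU rather than an arbitrary one. The macro and function names
are hypothetical; the shift and mask values are assumed to mirror
PERF_PMU_TYPE_SHIFT and PERF_HW_EVENT_MASK in the perf UAPI header.

```c
#include <stdint.h>

/* Local copies of the layout; assumed to match the perf UAPI header. */
#define SKETCH_PMU_TYPE_SHIFT 32
#define SKETCH_HW_EVENT_MASK  0xffffffffULL

/*
 * Build a config value for a legacy hardware event aimed at one core PMU.
 * pmu_type:    the PMU's dynamic type ID (0 means "no specific PMU").
 * hw_event_id: generic event ID, e.g. 0 for cycles, 1 for instructions.
 */
static uint64_t sketch_hw_config(uint32_t pmu_type, uint32_t hw_event_id)
{
	return ((uint64_t)pmu_type << SKETCH_PMU_TYPE_SHIFT) |
	       ((uint64_t)hw_event_id & SKETCH_HW_EVENT_MASK);
}

/* Recover the PMU type from the upper 32 bits of a config value. */
static uint32_t sketch_config_pmu_type(uint64_t config)
{
	return (uint32_t)(config >> SKETCH_PMU_TYPE_SHIFT);
}

/* Recover the generic event ID from the lower 32 bits. */
static uint32_t sketch_config_event(uint64_t config)
{
	return (uint32_t)(config & SKETCH_HW_EVENT_MASK);
}
```

The PMU type would typically be read from
/sys/bus/event_source/devices/<pmu>/type. With type 9 and event 1
(instructions), sketch_hw_config() yields 0x900000001; whether a given
PMU driver honours the upper bits depends on the kernel version and
driver.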
On Tue, Jun 17, 2025 at 7:16 AM Will Deacon <will@kernel.org> wrote:
>
> On Mon, Jun 16, 2025 at 03:44:49AM -0700, Ian Rogers wrote:
> > On Mon, Jun 16, 2025 at 3:29 AM Will Deacon <will@kernel.org> wrote:
> > >
> > > On Mon, Jun 16, 2025 at 02:36:18AM -0700, Ian Rogers wrote:
> > > > On Sun, Jun 15, 2025 at 6:32 PM Nick Chan <towinchenmi@gmail.com> wrote:
> > > > >
> > > > > This series adds support for the CPU PMU in the older Apple A7-A11, T2
> > > > > SoCs. These PMUs may have a different event layout, less counters, or
> > > > > deliver their interrupts via IRQ instead of a FIQ. Since some of those
> > > > > older SoCs support 32-bit EL0, counting for 32-bit EL0 also need to
> > > > > be enabled by the driver where applicable.
> > > > >
> > > > > Patch 1 adds the DT bindings.
> > > > > Patch 2-7 prepares the driver to allow adding support for those
> > > > > older SoCs.
> > > > > Patch 8-12 adds support for the older SoCs.
> > > > > Patch 13-21 are the DT changes.
> > > > >
> > > > > Signed-off-by: Nick Chan <towinchenmi@gmail.com>
> > > >
> > > > Hi Nick,
> > > >
> > > > This is substantial work and it looks good to me. Do you know why
> > > > there's been little progress on landing these patches? Buggy Apple ARM
> > > > PMU support in the kernel has led to reworking the perf tool. It seems
> > > > best that we can have the best drivers possible.
> > >
> > > You reworked the perf tool to support these things? Why? These changes
> > > are targeting chips in old iPhones afaict (as opposed to "Apple Silicon").
> > > I think that (a) most people don't particularly care about them and (b)
> > > they're not fully supported _anyway_ because of crazy stuff like [1].
> >
> > I was meaning that we reworked the perf tool to work around the Apple
> > ARM PMU driver expecting to work as if it were an uncore rather than a
> > core PMU driver. More context here:
> > "[REGRESSION] Perf (userspace) broken on big.LITTLE systems since v6.5"
> > https://lore.kernel.org/lkml/08f1f185-e259-4014-9ca4-6411d5c1bc65@marcan.st/
> > But in general it would be nice if Apple ARM PMU support were well loved.
> > I think we went 2 or 3 minor releases with the perf tool not working,
> > threats of substantial reverts to avoid the PMU driver bug being
> > exposed, etc.
>
> It's unfortunate that you've had a torrid time with the Apple PMU driver,
> but I think it's important to realise that it's both unmaintained (it
> ends up with me via the catch-all for drivers/perf/) and was written
> based off whatever reverse-engineering people could be bothered to do in
> their spare time. It's frankly remarkable that it works as well as it
> does.
>
> Despite all of that, I still don't think that your concerns apply to the
> patches in _this_ series, which is about adding support for older Apple
> chips.
>
> > As for which Apple ARM devices should have perf support, it seems the
> > more the merrier.
>
> Easy to say when you don't have to maintain the driver!

Well I do send patches ([1] is based on a patch I sent and James
reworked), but yeah. It is a bit strange in this case that we have
something that is unmaintained and yet is not taking a patch series due
to the cost of maintenance :-) Hopefully it can land.

Thanks,
Ian

[1] https://lore.kernel.org/lkml/20230710122138.1450930-2-james.clark@arm.com/

>
> Will