[PATCH v2 01/11] arm64: dts: amlogic: Add cache information to the Amlogic GXBB and GXL SoC

Anand Moon posted 11 patches 1 month, 1 week ago
Posted by Anand Moon 1 month, 1 week ago
As per the S905 and S905X datasheets, add the missing cache information
to the Amlogic GXBB and GXL SoCs.

- Each Cortex-A53 core has 32KB of L1 instruction cache and
  32KB of L1 data cache.
- The cores share a 512KB unified L2 cache.

Cache memory significantly reduces the time it takes for the CPU
to access data and instructions, leading to faster program execution
and overall system responsiveness.

Signed-off-by: Anand Moon <linux.amoon@gmail.com>
---
 arch/arm64/boot/dts/amlogic/meson-gx.dtsi | 27 +++++++++++++++++++++++
 1 file changed, 27 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
index 7d99ca44e660..c1d8e81d95cb 100644
--- a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
@@ -95,6 +95,12 @@ cpu0: cpu@0 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x0>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -105,6 +111,12 @@ cpu1: cpu@1 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x1>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -115,6 +127,12 @@ cpu2: cpu@2 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x2>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -125,6 +143,12 @@ cpu3: cpu@3 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x3>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -134,6 +158,9 @@ l2: l2-cache0 {
 			compatible = "cache";
 			cache-level = <2>;
 			cache-unified;
+			cache-size = <0x80000>; /* L2. 512 KB */
+			cache-line-size = <64>;
+			cache-sets = <512>;
 		};
 	};
 
-- 
2.50.1
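
A quick consistency check for the cache geometry in the hunks above, using the usual relation cache size = line size x number of sets x ways. This is only a sketch: the helper name implied_ways is hypothetical, and the numbers are copied from the diff. The L2 values work out to 16 ways and the L1 values to 32 ways. As far as I recall, the Cortex-A53 TRM describes 64-byte lines with a 2-way L1 instruction cache and a 4-way L1 data cache, so the L1 line-size and sets values may be worth double-checking against it.

```python
def implied_ways(size_bytes: int, line_size: int, sets: int) -> int:
    """Return the associativity implied by size = line_size * sets * ways."""
    ways, remainder = divmod(size_bytes, line_size * sets)
    if remainder:
        raise ValueError("size is not a multiple of line_size * sets")
    return ways


# Values copied from the patch (not from the TRM).
l1_ways = implied_ways(0x8000, 32, 32)     # d-cache-* / i-cache-* properties
l2_ways = implied_ways(0x80000, 64, 512)   # cache-* properties on the l2 node

print(f"L1 implied ways: {l1_ways}")
print(f"L2 implied ways: {l2_ways}")
```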
Re: [PATCH v2 01/11] arm64: dts: amlogic: Add cache information to the Amlogic GXBB and GXL SoC
Posted by Christian Hewitt 1 month, 1 week ago
> On 25 Aug 2025, at 10:51 am, Anand Moon <linux.amoon@gmail.com> wrote:
> 
> As per the S905 and S905X datasheets, add the missing cache information
> to the Amlogic GXBB and GXL SoCs.
> 
> - Each Cortex-A53 core has 32KB of L1 instruction cache and
>   32KB of L1 data cache.
> - The cores share a 512KB unified L2 cache.
> 
> Cache memory significantly reduces the time it takes for the CPU
> to access data and instructions, leading to faster program execution
> and overall system responsiveness.

Hello Anand,

I’m wondering if we are “enabling caching” in these patches (could be
a significant gain, as per text) or we are “optimising caching” meaning
the kernel currently assumes generic/safe defaults so having accurate
descriptions in dt allows better efficiency (marginal gain)?

Stats are also subject to the workload used, but do you have any
kind of before/after benchmarks? (for any of the SoCs in the patchset)

Christian
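
One way to tell the two cases in the question above apart is to compare what the kernel reports about the caches before and after applying the patch. A minimal sketch, assuming a Linux system that exposes the standard cacheinfo attributes under /sys/devices/system/cpu/cpuN/cache/indexM/; the function name read_cacheinfo and the FIELDS tuple are illustrative, not part of the patch.

```python
from pathlib import Path

# Attribute files the kernel may expose for each cache index.
FIELDS = ("level", "type", "size", "coherency_line_size",
          "number_of_sets", "ways_of_associativity")


def read_cacheinfo(cpu: int = 0, root: str = "/sys/devices/system/cpu") -> list[dict]:
    """Collect the cacheinfo attributes reported for one CPU."""
    caches = []
    cache_dir = Path(root) / f"cpu{cpu}" / "cache"
    for index in sorted(cache_dir.glob("index[0-9]*")):
        entry = {"index": index.name}
        for field in FIELDS:
            # Attributes the kernel could not determine may be absent or empty.
            try:
                entry[field] = (index / field).read_text().strip() or None
            except OSError:
                entry[field] = None
        caches.append(entry)
    return caches


if __name__ == "__main__":
    for cache in read_cacheinfo():
        print(cache)
```

Running it on the same board with and without the patch and diffing the two outputs shows which attributes the new properties actually change.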

Re: [PATCH v2 01/11] arm64: dts: amlogic: Add cache information to the Amlogic GXBB and GXL SoC
Posted by Anand Moon 1 month, 1 week ago
Hi Christian,

On Mon, 25 Aug 2025 at 13:29, Christian Hewitt
<christian@hewittfamily.org.uk> wrote:
>
> > On 25 Aug 2025, at 10:51 am, Anand Moon <linux.amoon@gmail.com> wrote:
> >
> > As per the S905 and S905X datasheets, add the missing cache information
> > to the Amlogic GXBB and GXL SoCs.
> >
> > - Each Cortex-A53 core has 32KB of L1 instruction cache and
> >   32KB of L1 data cache.
> > - The cores share a 512KB unified L2 cache.
> >
> > Cache memory significantly reduces the time it takes for the CPU
> > to access data and instructions, leading to faster program execution
> > and overall system responsiveness.
>
> Hello Anand,
>
> I’m wondering if we are “enabling caching” in these patches (could be
> a significant gain, as per text) or we are “optimising caching” meaning
> the kernel currently assumes generic/safe defaults so having accurate
> descriptions in dt allows better efficiency (marginal gain)?
>
> Stats are also subject to the workload used, but do you have any
> kind of before/after benchmarks? (for any of the SoCs in the patchset)
>

Caching is a fundamental feature of Arm64 CPUs: it keeps recently used
instructions and data from cacheable memory close to the core.
Enabling it can significantly improve overall system performance.

The L2 cache size is configurable, as per the Arm TRM documents:
Arm Cortex-A53: configurable L2 cache size of 128KB, 256KB, 512KB,
1MB, or 2MB.
Arm Cortex-A55: configurable L2 cache size of 64KB, 128KB, or 256KB.
Arm Cortex-A73: configurable L2 cache size of 256KB, 512KB, 1MB,
2MB, 4MB, or 8MB.

Here's an article that provides detailed insights into the cache feature.
[0] http://jake.dothome.co.kr/cache4/

I tested with a small benchmark that computes a factorial.

Before:
alarm@archl-librecm:~$ sudo perf stat -e cache-references,cache-misses ./test
Simulated Cache Miss Time (avg): 589 ns
Factorial(10) = 3628800

 Performance counter stats for './test':

           3017286      cache-references
             45414      cache-misses                     #    1.51% of all cache refs

       0.054512394 seconds time elapsed

       0.004209000 seconds user
       0.041866000 seconds sys

After:
 # sudo perf stat -e cache-references,cache-misses ./test
Simulated Cache Miss Time (avg): 426 ns
Factorial(10) = 3628800

 Performance counter stats for './test':

           2814633      cache-references
             27054      cache-misses                     #    0.96% of all cache refs

       0.041041585 seconds time elapsed

       0.007976000 seconds user
       0.032009000 seconds sys

> Christian

Thanks
-Anand
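
For reference, the percentages perf prints above can be reproduced from the raw counters, and the two runs compared directly. This is a small sketch using only the numbers in the output above; the variable and function names are illustrative. Each configuration was measured once, on a run of roughly 50 ms, so the difference may include run-to-run noise.

```python
def miss_rate(misses: int, references: int) -> float:
    """Cache misses as a percentage of cache references."""
    return 100.0 * misses / references


# Counters copied from the two perf stat runs above.
before = {"refs": 3017286, "misses": 45414, "elapsed": 0.054512394}
after = {"refs": 2814633, "misses": 27054, "elapsed": 0.041041585}

before_rate = miss_rate(before["misses"], before["refs"])
after_rate = miss_rate(after["misses"], after["refs"])

# Relative change in wall-clock time; negative means the second run was faster.
elapsed_change = 100.0 * (after["elapsed"] - before["elapsed"]) / before["elapsed"]

print(f"before: {before_rate:.2f}% of cache refs missed")
print(f"after:  {after_rate:.2f}% of cache refs missed")
print(f"elapsed change: {elapsed_change:.1f}%")
```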